Survey Bandits with Regret Guarantees
By Sanath Kumar Krishnamurthy, Susan Athey
February 23, 2020 · Working Paper No. 3902

We consider a variant of the contextual bandit problem. In standard contextual bandits, when a user arrives we get the user’s complete feature vector and then assign a treatment (arm) to that user. In a number of applications (like health care), collecting features from users can be costly. To address this issue, we propose algorithms that avoid needless feature collection while maintaining strong regret guarantees.
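To make the standard protocol concrete, here is a minimal sketch of the interaction loop the abstract describes: a user arrives, their feature (context) is observed, an arm is assigned, and a reward is received. This is an illustrative epsilon-greedy simulation with a hypothetical two-feature, two-arm environment, not the paper's algorithm; the environment and all parameter choices are assumptions for illustration.

```python
import random

random.seed(0)

N_ARMS = 2

def true_reward(context, arm):
    # Hypothetical environment (assumption): the best arm matches the feature.
    return 1.0 if arm == context else 0.0

# Per-(arm, feature) running reward statistics.
counts = [[1e-6, 1e-6] for _ in range(N_ARMS)]
sums = [[0.0, 0.0] for _ in range(N_ARMS)]

def choose_arm(feature, epsilon=0.1):
    # Explore with probability epsilon; otherwise pick the arm with the
    # highest estimated mean reward given the observed feature.
    if random.random() < epsilon:
        return random.randrange(N_ARMS)
    estimates = [sums[a][feature] / counts[a][feature] for a in range(N_ARMS)]
    return max(range(N_ARMS), key=lambda a: estimates[a])

total_reward = 0.0
rounds = 5000
for t in range(rounds):
    context = random.randrange(2)   # user's feature (costly to collect in practice)
    arm = choose_arm(context)       # assign a treatment (arm)
    r = true_reward(context, arm)   # observe reward for the chosen arm only
    counts[arm][context] += 1
    sums[arm][context] += r
    total_reward += r

avg_reward = total_reward / rounds
print(avg_reward)
```

In this standard setting the feature is always collected before choosing an arm; the paper's variant asks when that collection step can be skipped without sacrificing regret guarantees.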