Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation problems along the path of learning. We study a consideration for the exploration vs. exploitation framework that does not arise in multi-armed bandits but is crucial in contextual bandits; the way exploration and exploitation is conducted in the present affects the bias and variance in the potential outcome model estimation in subsequent stages of learning. We develop parametric and non-parametric contextual bandits that integrate balancing methods from the causal inference literature in their estimation to make it less prone to problems of estimation bias. We provide the first regret bound analyses for contextual bandits with balancing in the domain of linear contextual bandits that match the state of the art regret bounds. We demonstrate the strong practical advantage of balanced contextual bandits on a large number of supervised learning datasets and on a synthetic example that simulates model misspecification and prejudice in the initial training data. Additionally, we develop contextual bandits with simpler assignment policies by leveraging sparse model estimation methods from the econometrics literature and demonstrate empirically that in the early stages they can improve the rate of learning and decrease regret.
-
Faculty
- Academic Areas
- Awards & Honors
- Seminars
-
Conferences
- Accounting Summer Camp
- California Econometrics Conference
- California Quantitative Marketing PhD Conference
- California School Conference
- China India Insights Conference
- Homo economicus, Evolving
-
Initiative on Business and Environmental Sustainability
- Political Economics (2023–24)
- Scaling Geologic Storage of CO2 (2023–24)
- A Resilient Pacific: Building Connections, Envisioning Solutions
- Adaptation and Innovation
- Changing Climate
- Civil Society
- Climate Impact Summit
- Climate Science
- Corporate Carbon Disclosures
- Earth’s Seafloor
- Environmental Justice
- Finance
- Marketing
- Operations and Information Technology
- Organizations
- Sustainability Reporting and Control
- Taking the Pulse of the Planet
- Urban Infrastructure
- Watershed Restoration
- Junior Faculty Workshop on Financial Regulation and Banking
- Ken Singleton Celebration
- Marketing Camp
- Quantitative Marketing PhD Alumni Conference
- Rising Scholars Conference
- Theory and Inference in Accounting Research
- Voices
- Publications
- Books
- Working Papers
- Case Studies
- Postdoctoral Scholars
-
Research Labs & Initiatives
- Cities, Housing & Society Lab
- Corporate Governance Research Initiative
- Corporations and Society Initiative
- Golub Capital Social Impact Lab
- Initiative for Financial Decision-Making
- Policy and Innovation Initiative
- Rapid Decarbonization Initiative
- Stanford Latino Entrepreneurship Initiative
- Value Chain Innovation Initiative
- Venture Capital Initiative
- Behavioral Lab
- Data, Analytics & Research Computing