This paper studies the stochastic linear bandit problem, where a decision-maker chooses actions from possibly time-dependent sets of vectors in ℝ^d and receives noisy rewards. The objective is to minimize regret, the difference between the cumulative expected reward of the decision-maker and that of an oracle with access to the expected reward of each action, over a sequence of T decisions. Linear Thompson Sampling (LinTS) is a popular Bayesian heuristic, supported by theoretical analysis showing that its Bayesian regret is bounded by Õ(d√T), matching minimax lower bounds. However, previous studies establish a frequentist regret bound for LinTS of only Õ(d√(dT)), which requires posterior variance inflation and is worse by a factor of √d than the bounds for the best optimism-based algorithms. We prove that this inflation is fundamental and that the frequentist bound of Õ(d√(dT)) is the best possible, by demonstrating a randomization bias phenomenon in LinTS that can cause linear regret without inflation. We propose a data-driven version of LinTS that adjusts posterior inflation using observed data and can achieve minimax optimal frequentist regret under additional conditions. Our analysis provides new insights into LinTS and settles an open problem in the field.
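To make the role of posterior inflation concrete, the following is a minimal sketch of LinTS with a fixed inflation factor, not the paper's implementation and not its data-driven variant. It assumes a ridge-regression (Gaussian) posterior with mean θ̂ = V⁻¹b and samples from a Gaussian with covariance `inflation**2 * noise_sd**2 * V⁻¹`. The names `lin_ts`, `inflation`, `noise_sd`, and `lam`, along with the helper functions, are hypothetical and chosen only for illustration.

```python
import math
import random


def cholesky(V):
    """Lower-triangular L with L L^T = V, for symmetric positive-definite V."""
    d = len(V)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = V[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_lower(L, y):
    """Solve L x = y by forward substitution."""
    d = len(y)
    x = [0.0] * d
    for i in range(d):
        x[i] = (y[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i]
    return x


def solve_upper_from_lower(L, y):
    """Solve L^T x = y by back substitution."""
    d = len(y)
    x = [0.0] * d
    for i in reversed(range(d)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, d))) / L[i][i]
    return x


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def lin_ts(action_sets, theta_star, inflation=1.0, noise_sd=0.1, lam=1.0, seed=0):
    """Run LinTS on a simulated problem and return the cumulative regret.

    action_sets: one list of d-dimensional action vectors per round.
    theta_star:  true parameter, used only to simulate rewards and regret.
    inflation:   multiplier on the posterior standard deviation (assumption:
                 a fixed constant, unlike the data-driven rule in the paper).
    """
    rng = random.Random(seed)
    d = len(theta_star)
    V = [[lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    regret = 0.0
    for actions in action_sets:
        L = cholesky(V)
        # Ridge estimate: theta_hat = V^{-1} b.
        theta_hat = solve_upper_from_lower(L, solve_lower(L, b))
        # L^{-T} z has covariance V^{-1} when z is standard normal.
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        direction = solve_upper_from_lower(L, z)
        theta_tilde = [m + inflation * noise_sd * g
                       for m, g in zip(theta_hat, direction)]
        # Act greedily with respect to the sampled parameter.
        chosen = max(actions, key=lambda a: dot(a, theta_tilde))
        reward = dot(chosen, theta_star) + rng.gauss(0.0, noise_sd)
        for i in range(d):
            b[i] += reward * chosen[i]
            for j in range(d):
                V[i][j] += chosen[i] * chosen[j]
        best = max(dot(a, theta_star) for a in actions)
        regret += best - dot(chosen, theta_star)
    return regret
```

Calling `lin_ts([[[1.0, 0.0], [0.0, 1.0]]] * 200, [1.0, 0.0], inflation=1.0)` runs 200 rounds without inflation on a fixed two-action set. Setting `inflation` above 1 widens the sampling distribution, which is the kind of adjustment the frequentist analysis relies on.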