We use the lens of weak signal asymptotics to study a class of sequentially randomized experiments, including those that arise in solving multi-armed bandit problems. In an experiment with n time steps, we let the mean reward gaps between actions scale to the order 1/√n to preserve the difficulty of the learning task as n grows. In this regime, we show that the sample paths of a class of sequentially randomized experiments (adapted to this scaling regime and with arm selection probabilities that vary continuously with state) converge weakly to a diffusion limit, given as the solution to a stochastic differential equation. The diffusion limit enables us to derive refined, instance-specific characterizations of the stochastic dynamics and to obtain several insights on the regret and belief evolution of a number of sequential experiments, including Thompson sampling (but not upper-confidence-bound methods, which do not satisfy our continuity assumption). We show that all sequential experiments whose randomization probabilities have a Lipschitz-continuous dependence on the observed data suffer from suboptimal regret performance when the reward gaps are relatively large. Conversely, we find that a version of Thompson sampling with an asymptotically uninformative prior variance achieves near-optimal instance-specific regret scaling, even when the reward gaps are large, but these good regret properties come at the cost of highly unstable posterior beliefs.
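To make the scaling regime concrete, the following is a minimal simulation sketch rather than the authors' implementation. It runs two-armed Gaussian Thompson sampling with a mean reward gap of delta/√n and reports cumulative regret divided by √n. The function name `thompson_two_arm`, the unit noise variance, the choice of a N(0, prior_var/n) prior on each arm mean, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def thompson_two_arm(n, delta, prior_var=1.0, seed=0):
    """Two-armed Gaussian Thompson sampling over n steps (illustrative sketch).

    Assumptions (not from the source): arm 0 has mean 0, arm 1 has mean
    delta / sqrt(n), rewards have unit noise variance, and each arm mean
    has an independent N(0, prior_var / n) prior.
    Returns (scaled_regret, pulls), where scaled_regret is the cumulative
    expected regret divided by sqrt(n).
    """
    rng = random.Random(seed)
    means = [0.0, delta / math.sqrt(n)]
    best = max(means)
    pulls = [0, 0]      # number of times each arm was selected
    sums = [0.0, 0.0]   # running sum of observed rewards per arm
    regret = 0.0
    for _ in range(n):
        draws = []
        for k in range(2):
            # Conjugate Gaussian update: precision = prior precision + count.
            precision = n / prior_var + pulls[k]
            post_mean = sums[k] / precision
            post_sd = math.sqrt(1.0 / precision)
            draws.append(rng.gauss(post_mean, post_sd))
        # Play the arm whose posterior draw is largest.
        arm = 0 if draws[0] >= draws[1] else 1
        reward = rng.gauss(means[arm], 1.0)
        pulls[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret / math.sqrt(n), pulls


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        scaled, pulls = thompson_two_arm(n, delta=2.0)
        print(n, round(scaled, 3), pulls)
```

Because the gap shrinks like 1/√n, the scaled regret returned here always lies between 0 and delta, so runs with different n can be compared on a common scale. Increasing `prior_var` gives a rough way to explore the less informative priors mentioned above, though a single simulated path says nothing conclusive about the limiting behavior.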