Management Science. April 2018, Vol. 64, Issue 4, Pages 1727-1746
Motivated by the proliferation of online platforms that collect and disseminate consumers’ experiences with alternative substitutable products/services, we investigate the problem of optimal information provision when the goal is to maximize aggregate consumer surplus. We develop a decentralized multiarmed bandit framework where a forward-looking principal (the platform designer) commits up front to a policy that dynamically discloses information regarding the history of outcomes to a series of short-lived rational agents (the consumers). We demonstrate that consumer surplus is nonmonotone in the accuracy of the designer’s information-provision policy. Because consumers are constantly in “exploitation” mode, policies that disclose accurate information on past outcomes suffer from inadequate “exploration.” We illustrate how the designer can (partially) alleviate this inefficiency by employing a policy that strategically obfuscates the information in the platform’s possession; interestingly, such a policy is beneficial despite the fact that consumers are aware of both the designer’s objective and the precise way by which information is being disclosed to them. More generally, we show that the optimal information-provision policy can be obtained as the solution of a large-scale linear program. Noting that such a solution is typically intractable, we use our structural findings to design an intuitive heuristic that underscores the value of information obfuscation in decentralized learning. We further highlight that obfuscation remains beneficial even if the designer can directly incentivize consumers to explore through monetary payments.
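
The under-exploration effect described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's model or its optimal policy: it assumes two products, a "safe" one with a known payoff and a "risky" one with a Bernoulli payoff and a Beta prior, and short-lived consumers who see the full history and pick the product with the higher posterior mean. The function name `myopic_full_disclosure`, its parameters, and the numbers used are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def myopic_full_disclosure(risky_outcomes, safe_payoff, n_agents, prior=(1, 1)):
    """Simulate myopic agents under full disclosure of past outcomes.

    Each agent sees every past outcome and picks the arm with the higher
    posterior mean (ties go to the risky arm). This is an illustrative
    assumption, not the model from the paper.

    risky_outcomes: iterable of 0/1 results the risky arm would produce,
        consumed one per pull of the risky arm.
    safe_payoff: known payoff of the safe arm.
    prior: (successes, failures) pseudo-counts of a Beta prior.
    Returns a list of choices, "risky" or "safe", one per agent.
    """
    successes, failures = prior
    outcomes = iter(risky_outcomes)
    choices = []
    for _ in range(n_agents):
        # Posterior mean of a Beta(successes, failures) belief.
        posterior_mean = Fraction(successes, successes + failures)
        if posterior_mean >= safe_payoff:
            result = next(outcomes)
            successes += result
            failures += 1 - result
            choices.append("risky")
        else:
            # No new information arrives, so the belief never changes
            # and every later agent also picks the safe arm.
            choices.append("safe")
    return choices


# One early failure is enough to stop all further exploration,
# even though every later pull would have succeeded.
unlucky = myopic_full_disclosure([0] + [1] * 9, Fraction(9, 20), n_agents=10)
print(unlucky.count("risky"))  # 1
```

In this sketch a single early failure pushes the posterior mean from 1/2 to 1/3, below the safe payoff of 9/20, and no later consumer ever samples the risky product again. That lock-in is the kind of inefficiency that accurate disclosure creates and that the abstract says strategic obfuscation can partially alleviate.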