Two-sided marketplace platforms often run experiments (or A/B tests) to measure the effect of an intervention before launching it platform-wide. A typical approach is to randomize users into a treatment group, which receives the intervention, and a control group, which does not. The platform then compares performance in the two groups to estimate the effect the intervention would have if it were launched to everyone. We focus on two common experiment types, where the platform randomizes users either on the supply side or on the demand side. For these experiments, it is known that the resulting estimates of the treatment effect are typically biased: individuals in the market compete with each other, which creates interference between the treatment and control groups. Here, we observe that economic interactions (competition among demand and supply) lead to a statistical phenomenon (biased estimates).
We develop a simple, tractable market model to study bias and variance in these experiments with interference. We focus on two choices available to the platform: (1) Which side of the platform should it randomize on (supply or demand)? (2) What proportion of individuals should be allocated to treatment? We find that both choices affect the bias and variance of the resulting estimators, but in different ways. The bias-optimal choice of experiment type depends on the relative amounts of supply and demand in the market, and we discuss how a platform can use market data to select the experiment type. Importantly, we find that in many circumstances choosing the bias-optimal experiment type has little effect on variance, and in some cases it coincides with the variance-optimal type. On the other hand, we find that the choice of treatment proportion can induce a bias-variance tradeoff, where the bias-minimizing proportion increases variance. We discuss how a platform can navigate this tradeoff and best choose the proportion, using a combination of modeling and contextual knowledge about the market, the risk of the intervention, and reasonable effect sizes of the intervention.
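To make the interference mechanism concrete, the following is a minimal sketch of a demand-side experiment in a toy deterministic market. It is an illustration only and is not the market model developed in this paper: the proportional allocation rule, the function names, and the parameters (`n` customers, `supply` units, application rates `p_ctrl` and `p_treat`, treatment proportion `a`) are all assumptions introduced here.

```python
def bookings(supply, apps_treat, apps_ctrl):
    """Split limited supply across applications in proportion to demand.

    Toy assumption: if applications exceed supply, each application is
    filled with the same probability, regardless of group.
    """
    total = apps_treat + apps_ctrl
    if total == 0:
        return 0.0, 0.0
    filled = min(supply, total)
    return filled * apps_treat / total, filled * apps_ctrl / total


def global_treatment_effect(n, supply, p_ctrl, p_treat):
    """Bookings if everyone is treated minus bookings if no one is."""
    all_treat, _ = bookings(supply, n * p_treat, 0.0)
    _, all_ctrl = bookings(supply, 0.0, n * p_ctrl)
    return all_treat - all_ctrl


def demand_side_estimate(n, supply, p_ctrl, p_treat, a):
    """Naive scaled difference between groups when a fraction `a` of
    customers is treated and both groups compete for the same supply."""
    if not 0.0 < a < 1.0:
        raise ValueError("treatment proportion must be strictly between 0 and 1")
    b_treat, b_ctrl = bookings(supply, a * n * p_treat, (1 - a) * n * p_ctrl)
    return b_treat / a - b_ctrl / (1 - a)


if __name__ == "__main__":
    n, p_ctrl, p_treat, a = 1000, 0.2, 0.3, 0.5
    for supply in (1000, 100):  # abundant supply, then scarce supply
        gte = global_treatment_effect(n, supply, p_ctrl, p_treat)
        est = demand_side_estimate(n, supply, p_ctrl, p_treat, a)
        print(f"supply={supply}: true effect={gte:.1f}, estimate={est:.1f}")
```

In this toy setting the naive estimate matches the true effect when supply is abundant. When supply is scarce, the true effect is zero because every unit is booked either way, yet the estimate remains positive because treated customers win bookings at the expense of control customers.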