Bandit Experiment Application

The aim of this app is to assist researchers who are planning adaptive experiments.

Adaptive experiments vary the proportion of observations assigned to each treatment arm over the course of the experiment. Assigning more participants to better-performing treatments makes the experiment more efficient: the researcher learns which treatment is most effective faster and with fewer resources. Ideally, the experiment ends by identifying a higher-value arm. Adaptive experiments can also be used as pilots to narrow down the number of candidate treatments before deciding which treatment arms to include in a larger non-adaptive experiment.


1. General Configuration

Select the experiment length and the number of simulations:

  • Experiment length: how long each experiment runs
  • Number of simulations: how many times the experiment is run

2. Arm Configuration

Select the number of treatment arms and hypothesized outcomes for each arm:

  • Average success rate (between 0 and 1): the hypothesized mean of the arm's binary outcome variable
  • There must be at least two arms
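
As a rough illustration of how hypothesized success rates turn into simulated binary outcomes, here is a minimal sketch of a non-adaptive baseline. It is not the app's code: the function names are hypothetical, and it assumes that experiment length means the number of observations and that the arm chosen at the end is the one with the highest observed success rate.

```python
import random

def simulate_uniform_experiment(success_rates, experiment_length, seed=None):
    """Run one non-adaptive experiment with binary (Bernoulli) outcomes.

    Assumption: experiment_length is the number of observations.
    Returns per-arm assignment counts and success counts.
    """
    if len(success_rates) < 2:
        raise ValueError("at least two arms are required")
    if any(not 0.0 <= p <= 1.0 for p in success_rates):
        raise ValueError("success rates must lie between 0 and 1")
    rng = random.Random(seed)
    n_arms = len(success_rates)
    assignments = [0] * n_arms
    successes = [0] * n_arms
    for _ in range(experiment_length):
        arm = rng.randrange(n_arms)  # uniform assignment across arms
        outcome = 1 if rng.random() < success_rates[arm] else 0
        assignments[arm] += 1
        successes[arm] += outcome
    return assignments, successes

def share_best_arm_selected(success_rates, experiment_length, n_simulations, seed=0):
    """Share of simulations in which the truly best arm has the highest observed rate."""
    n_arms = len(success_rates)
    best = max(range(n_arms), key=lambda k: success_rates[k])
    hits = 0
    for s in range(n_simulations):
        assignments, successes = simulate_uniform_experiment(
            success_rates, experiment_length, seed + s
        )
        # Arms that were never assigned get an observed rate of 0.
        means = [successes[k] / assignments[k] if assignments[k] else 0.0
                 for k in range(n_arms)]
        chosen = max(range(n_arms), key=lambda k: means[k])
        hits += chosen == best
    return hits / n_simulations
```

Calling share_best_arm_selected([0.3, 0.5, 0.6], 300, 1000) would give a baseline against which an adaptive algorithm can be compared under the same hypothesized rates.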

3. Algorithm Configuration

  • Collection: algorithm that determines how treatment assignment probabilities change over time
    • If the main objective is to maximize outcomes during the experiment, use a UCB or Thompson Sampling algorithm.
    • If the main objective is to find a good treatment at the end of the experiment, use an Exploration Sampling or Epsilon-Greedy algorithm.
  • Decision: selection rule applied at the end of the experiment to choose a treatment arm
  • Floor decay: the exponent ‘alpha’ in the lower bound on each arm's assignment probability, (1/K)*t^(-alpha), where K is the number of arms and t is the time index
    • A higher value makes the floor decay faster and allows more aggressive adaptivity.
  • Unif. fraction: the fraction of observations assigned non-adaptively (uniformly across arms)
  • # Batches: the number of times the assignment probabilities are updated over the course of the experiment
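
To make the interaction between these settings concrete, the sketch below combines the floor (1/K)*t^(-alpha) with a batched Thompson Sampling rule. It is an illustration under stated assumptions, not the app's implementation: all function names are hypothetical; t is taken to be the index of the first observation in a batch; the floor is enforced by mixing with the uniform distribution; the uniform fraction is treated as an initial phase; and the final decision picks the arm with the highest posterior mean under Beta(1, 1) priors.

```python
import random

def assignment_floor(n_arms, t, alpha):
    """Lower bound (1/K) * t^(-alpha) on each arm's assignment probability."""
    return (1.0 / n_arms) * t ** (-alpha)

def apply_floor(probs, floor):
    """Mix with the uniform distribution so every probability is >= floor.

    Assumption: this is one simple way to enforce the floor. The result
    sums to 1 whenever the input does and floor <= 1/K.
    """
    n_arms = len(probs)
    weight = max(0.0, 1.0 - n_arms * floor)
    return [floor + weight * p for p in probs]

def thompson_probabilities(successes, failures, rng, n_draws=2000):
    """Monte Carlo estimate of P(arm is best) under Beta(1+s, 1+f) posteriors."""
    n_arms = len(successes)
    wins = [0] * n_arms
    for _ in range(n_draws):
        draws = [rng.betavariate(1 + successes[k], 1 + failures[k])
                 for k in range(n_arms)]
        wins[max(range(n_arms), key=lambda k: draws[k])] += 1
    return [w / n_draws for w in wins]

def run_batched_thompson(success_rates, experiment_length, n_batches,
                         alpha, unif_fraction=0.0, seed=0):
    """Simulate one adaptive experiment; returns (assignment counts, chosen arm)."""
    rng = random.Random(seed)
    n_arms = len(success_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    uniform = [1.0 / n_arms] * n_arms
    n_uniform = int(unif_fraction * experiment_length)  # assumed initial phase
    # First observation (1-based) of each of the roughly equal batches.
    batch_starts = {1 + (b * experiment_length) // n_batches
                    for b in range(n_batches)}
    adaptive_probs = uniform
    for t in range(1, experiment_length + 1):
        if t in batch_starts:
            # Probabilities change only at batch boundaries.
            raw = thompson_probabilities(successes, failures, rng)
            adaptive_probs = apply_floor(raw, assignment_floor(n_arms, t, alpha))
        probs = uniform if t <= n_uniform else adaptive_probs
        arm = rng.choices(range(n_arms), weights=probs)[0]
        if rng.random() < success_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    assignments = [successes[k] + failures[k] for k in range(n_arms)]
    # Decision rule (assumption): highest posterior mean.
    chosen = max(range(n_arms),
                 key=lambda k: (1 + successes[k]) / (2 + assignments[k]))
    return assignments, chosen
```

With alpha = 0 the floor stays at 1/K and assignment remains uniform throughout; larger alpha lets the floor shrink faster, so later batches can concentrate on the arms that look best. At t = 1 the floor equals 1/K for any alpha, so the first batch is always uniform in this sketch.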