A/B testing has been around for a century. And it’s become so common that many of us don’t even notice it. Whenever you pick up your phone, you’re likely becoming part of an A/B test, as sites and apps try to figure out what will get you to click, scroll, or download.
However, today’s complex online platforms have revealed the limitations of A/B testing.
In “two-sided” marketplaces for ridesharing and home rentals, running an experiment on one group of buyers also changes what’s available to all other buyers, which can ripple through the entire market. This cascade of interference means the experimental group can no longer be isolated for study. So researchers are tweaking and expanding the boundaries of A/B testing to try to keep up with platforms that mediate interactions between multiple users with different goals.
Stanford Graduate School of Business professors have been at the forefront of upgrading A/B testing for the digital age. Guido Imbens and Gabriel Weintraub independently yet simultaneously experimented with similar ways to run interference-free tests on two-sided platforms. Mohsen Bayati has worked with “multi-armed bandits” — dynamic experiments that change as it becomes clear which option works best.
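To make the bandit idea concrete, here is a minimal Python sketch of one common textbook strategy, epsilon-greedy. It is an illustration only, not the method used in the research mentioned above. The function name `epsilon_greedy`, the click rates, and the number of rounds are all made up for the example.

```python
import random


def epsilon_greedy(true_rates, rounds, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy bandit over options with hidden success rates.

    true_rates: hypothetical success probability of each option (unknown to
    the experimenter in real life; used here only to simulate responses).
    Returns (pulls, wins): how often each option was shown and how often it
    succeeded.
    """
    rng = random.Random(seed)
    pulls = [0] * len(true_rates)
    wins = [0] * len(true_rates)
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: occasionally show a random option.
            arm = rng.randrange(len(true_rates))
        else:
            # Exploit: show the option with the best observed success rate.
            estimates = [w / p if p else 0.0 for w, p in zip(wins, pulls)]
            arm = estimates.index(max(estimates))
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
    return pulls, wins


# Two hypothetical options: one with a 10% success rate, one with 30%.
pulls, wins = epsilon_greedy([0.10, 0.30], rounds=5000)
print("times each option was shown:", pulls)
```

Unlike a fixed 50/50 split, the experiment shifts most of its traffic toward the better option as evidence accumulates, while the small `epsilon` share of random choices keeps checking the alternative.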
This is a moment of rapid innovation for a model that permeates our lives. This short explainer video breaks it down.
Full Transcript
Note: Transcripts are generated by machine and lightly edited by humans. They may contain errors.
While we scroll and click, designers, engineers and marketers are running experiments to figure out what will grab our attention. They do this with A/B testing.
Let’s say a shoe company wants to know which email subject line will generate more sales, one with a sneaker emoji or one without. They’ll divide the recipients in two, sending group A the emoji while group B gets the basic version. Then they’ll compare the two groups’ responses to determine which message gets more clicks and sells more kicks. For decades, A/B testing has been the standard for evaluating everything from fertilizers to pharmaceuticals, but its simple elegance can’t keep up with increasingly complex online platforms.
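The email comparison described above can be sketched in a few lines of Python. This is a minimal illustration that assumes a standard two-proportion z-test is used to compare click rates; the video does not say which statistical test a company would use, and the function name `compare_groups` and all counts below are hypothetical.

```python
import math


def compare_groups(clicks_a, n_a, clicks_b, n_b):
    """Compare click rates of groups A and B with a two-proportion z-test.

    Returns (rate_a, rate_b, z, p_value), where p_value is two-sided.
    """
    rate_a = clicks_a / n_a
    rate_b = clicks_b / n_b
    # Pooled rate under the assumption that both subject lines perform equally.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (rate_a - rate_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts: 1,000 recipients per group.
# Group A (emoji) gets 120 clicks; group B (plain) gets 100.
rate_a, rate_b, z, p_value = compare_groups(120, 1000, 100, 1000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p_value:.3f}")
```

A small p-value would suggest the difference between the two subject lines is unlikely to be chance alone. The test assumes each recipient responds independently of the others, which is exactly the assumption that breaks down on two-sided platforms.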
A vacation rental app, for example, has two kinds of users: property owners and travelers. Running an experiment on one group will likely change how the other group behaves, and that’s a big no-no in A/B testing. So researchers are upgrading A/B testing with new kinds of experiments that have the potential to improve people’s lives, and not just online.