In the early months of 2020, as COVID-19 shot across the U.S. and a stunned populace struggled to navigate the many unknowns, Kuang Xu recognized that the moment called for his particular expertise. As a researcher focused on stochastic modeling — mathematical descriptions of systems steeped in uncertainty — Xu isn’t intimidated by big unknowns.
“In what urgent situation do you call a professor of operations, information, and technology with expertise in stochastic modeling?” says Xu, an associate professor at Stanford Graduate School of Business. “When you have to make a lot of decisions in a highly uncertain environment — and you don’t have a lot of data to work with.”
In other words, you call Xu in a situation precisely like the first days of the pandemic, when even the most basic understandings of the virus and its transmission were frighteningly provisional.
Xu reached out to longtime collaborator Carri Chan, a professor at Columbia Business School, who was observing operations within the New York-Presbyterian Hospital system. Chan had been listening in on the hospital leadership’s “state of the system” call every morning, and in March 2020 she witnessed the chaos erupting within New York City’s hospitals. Citywide, more than 3,000 patients had been hospitalized with the virus and nearly 300 had died.
Teaming up with Chan, Xu, who was born in Suzhou, China, combed through Chinese research papers, scanning for insights emerging from the country where the coronavirus had first been identified. Xu and Chan learned, for example, that the majority of COVID transmission was happening within households and that a centralized quarantine policy in Wuhan had been found to reduce infection rates by 75%.
Imagining how such a policy might be adapted to the U.S. led Xu and Chan to consider the possible public health benefits of “quarantine hotels,” where symptomatic or recovering COVID patients could voluntarily isolate to avoid infecting others. Their mathematical modeling, summarized in a Business Insider op-ed they published in April 2020 with Jing Dong, an assistant professor at Columbia Business School, suggested that if even 5% to 10% of those infected with the virus could quarantine in this way, infection numbers would fall significantly.
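To see why a modest quarantine rate could matter, consider a minimal sketch in Python. It is not the model Xu, Chan, and Dong built. It is a textbook SIR (susceptible, infectious, recovered) simulation in which a hypothetical share of infectious people, quarantine_fraction, isolates and stops transmitting. The function name and every parameter value (beta, gamma, population size) are illustrative assumptions.

```python
def simulate_epidemic(quarantine_fraction, days=300, population=1_000_000,
                      initial_infected=100, beta=0.3, gamma=0.1):
    """Discrete-time SIR model; returns (peak_infected, total_ever_infected).

    quarantine_fraction: assumed share of infectious people who isolate
    fully and therefore stop transmitting. All values are illustrative.
    """
    s = float(population - initial_infected)  # susceptible
    i = float(initial_infected)               # currently infectious
    peak = i
    for _ in range(days):
        # Only infectious people who are not quarantined can transmit.
        transmitting = i * (1.0 - quarantine_fraction)
        new_infections = min(beta * s * transmitting / population, s)
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        peak = max(peak, i)
    return peak, population - s


for q in (0.0, 0.05, 0.10):
    peak, total = simulate_epidemic(q)
    print(f"quarantine {q:.0%}: peak {peak:,.0f}, total infected {total:,.0f}")
```

In this toy setup, raising the quarantined share lowers both the peak number of simultaneous infections and the total number of people ever infected. The size of the drop depends entirely on the assumed parameters.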
Xu, Chan, and Dong presented their findings to New York-Presbyterian Hospital’s leadership, and, two weeks later, New York City announced its COVID-19 Hotel Room Isolation Program — a heartening development, though the researchers may never know whether their work had any direct bearing on it.
Chan believes that Xu’s eloquence in communicating about technical subjects was especially useful during this time. “Kuang is able to distill his work down to really clean, core, crisp insights,” she says. “I think for the work that we did with COVID, that was really important.”
Xu believes the greatest challenge in his line of research is striking the right balance between “intellectual depth, conceptual clarity, and practical relevance.” The accessibility of his work is paramount to him, in large part because he believes we’re entering an increasingly information-centric age, and the big questions at the heart of his research are about how to make the best use of the information available to us.
Thorny Theoretical Questions
When he was in high school in China, Xu enjoyed physics and, he now acknowledges with a laugh, struggled with math. He fretted about how he’d perform on the annual gaokao, China’s highly competitive college entrance exam. His parents suggested that he apply to universities in the U.S., Canada, and other countries outside China, and offered their savings to help fund his education. Xu enrolled at the University of Illinois at Urbana-Champaign.
When it came to deciding on a major, Xu’s highest priority was practicality — he felt it was important to choose something that would translate into a decent job. He ruled out physics as too impractical and settled on what seemed like the next-best option: electrical engineering.
Xu was drawn to more theoretical subdisciplines, including stochastics. “I became more idealistic,” he remembers. “I had to turn down what seemed at the time like very lucrative offers from proprietary trading firms in Chicago and New York. But I really liked research, and the allure of digging deeper just seemed so magical.”
Now, as a researcher whose focus is on the thorny theoretical questions that ensnared him years ago, Xu continues to dig deeper. His choice of information as a central research subject springs from his assessment of how its role has evolved over the decades, especially with the rise of the internet, and how it’s likely to shape the future.
In Xu’s usage, information is not quite synonymous with data. Information, he explains, “is the data that matters.” It must be actionable, affecting someone’s decision-making. Among the big questions he explores: What is the value of partial information? How can experiments relying on incomplete information be improved? The expanding role of information stands to dramatically change how healthcare, insurance, manufacturing, and many other sectors operate — and also has urgent implications for the policies regulating them.
Throughout most of economic history, Xu explains, information has been one of many resources used in the creation of goods and services. In a 20th-century manufacturing system, for example, producers needed basic information about the demand for their products, but arguably more important were resources like oil, paint, and labor.
It wasn’t until the invention of the computer, followed by early versions of the internet, that information began to emerge as an increasingly important part of economic production. “Towards the late 20th century and early 21st century, information started to play a much more important role,” Xu says. “For instance, you could adapt your supply chain much more rapidly depending on the products being asked for.”
The trend toward information as not just an economic input but as the centrally important input will only continue, Xu says. He believes it will produce an economic landscape where “what you know is more important than what you do.”
He offers examples of what this could look like and how it’s already starting to take shape. Personalized medicine will be based on a complex process of synthesizing sensitive information about an individual’s genome, and quickly manufacturing necessary medications. Highly personalized insurance — which is already available in its early stages — could take the place of more standardized financial products.
“What if, in the future, production activities are more commoditized, but the information they’re based on is precious?” asks Xu. “That’s the high-level question I have.”
The Information Lifecycle
If information is increasingly coveted, Xu argues, then it deserves to be studied more closely. For him, this has meant thinking in terms of an informational life span with three distinct stages: generation, utilization, and protection. He designs his research projects with this timeline in mind, and each of his studies touches on one or more of these stages.
Xu’s recent projects — and the ones he says he’s most excited about — zero in on the generation stage, seeking better answers to how companies and researchers can determine what information is most valuable to them. He is working with collaborators to design better reinforcement learning algorithms, which stand to vastly improve, for example, the kinds of recommendation systems that power sites like Netflix and Spotify. Such algorithms could also be imperative to the future of personalized medicine — balancing the long-term need to learn which drugs work in individual patients against the short-term need to be both safe and functional.
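The tension between learning and performing can be illustrated with a standard textbook method, Thompson sampling on a two-option problem. This is a sketch under stated assumptions, not one of the algorithms Xu and his collaborators are designing. The function name thompson_sampling and the success rates of 30% and 70% are hypothetical.

```python
import random


def thompson_sampling(success_probs, rounds=2000, seed=0):
    """Run Beta-Bernoulli Thompson sampling; return pulls per option."""
    rng = random.Random(seed)
    n = len(success_probs)
    successes = [0] * n
    failures = [0] * n
    pulls = [0] * n
    for _ in range(rounds):
        # Draw a plausible success rate for each option from its posterior.
        samples = [rng.betavariate(successes[k] + 1, failures[k] + 1)
                   for k in range(n)]
        # Act on the draw: choose the option that currently looks best.
        choice = max(range(n), key=lambda k: samples[k])
        pulls[choice] += 1
        # Observe a simulated outcome and update that option's record.
        if rng.random() < success_probs[choice]:
            successes[choice] += 1
        else:
            failures[choice] += 1
    return pulls


pulls = thompson_sampling([0.3, 0.7])
print(pulls)
```

Early on, the algorithm tries both options because it knows little about either. As evidence accumulates, it shifts most of its choices to the option that works better, so the cost of learning shrinks over time.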
Another major research focus for Xu is his use of stochastic modeling to improve causal inference in complex, dynamic environments. Causal inference undergirds a common type of experimentation among e-commerce companies that are, say, using A/B testing to determine which promotional emails spur more people to click through and make a purchase. But, as Xu explains, causal inference techniques can run into problems when applying a treatment to one person affects the outcome of another. This can create difficulties for ride-share platforms, for example, that want to better understand the impact of giving drivers targeted payment boosts when demand is spiking. Do the boosts lead to a revenue increase for the platform, or just to higher costs? In these problems, the dynamics are often too complex to be tackled using traditional approaches, and stochastic modeling offers a rich set of tools for capturing the effect of these interactions, ultimately leading to more efficient experiments and algorithms. Xu has worked with companies like Uber and Shipt to help them untangle these problems by designing better, more efficient experiments with an eye toward improving the accuracy of their results and reducing their cost.
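A toy version of the ride-share example shows how interference can mislead a standard A/B test. The sketch below rests on deliberately simple assumptions that do not come from Xu's work: demand is a fixed pool of rides, and a boosted driver works harder and so captures a larger share of that pool. The function rides_per_driver and the parameter boost_weight are hypothetical.

```python
def rides_per_driver(boosted, total_rides=1000.0, boost_weight=1.5):
    """Split a fixed pool of rides across drivers in proportion to weight.

    boosted: list of booleans, True if that driver receives the pay boost.
    A boosted driver is assumed to get weight boost_weight instead of 1.0.
    Demand (total_rides) is assumed not to change with the boost.
    """
    weights = [boost_weight if b else 1.0 for b in boosted]
    total_weight = sum(weights)
    return [total_rides * w / total_weight for w in weights]


def mean(values):
    return sum(values) / len(values)


n_drivers = 100
# A/B test: boost the first half of the drivers, keep the rest as controls.
assignment = [k < n_drivers // 2 for k in range(n_drivers)]
rides = rides_per_driver(assignment)
treated = mean([r for r, b in zip(rides, assignment) if b])
control = mean([r for r, b in zip(rides, assignment) if not b])
naive_effect = treated - control

# Ground truth: compare boosting every driver with boosting no driver.
all_boosted = mean(rides_per_driver([True] * n_drivers))
none_boosted = mean(rides_per_driver([False] * n_drivers))
true_effect = all_boosted - none_boosted

print(f"naive A/B estimate: {naive_effect:+.2f} rides per driver")
print(f"true global effect: {true_effect:+.2f} rides per driver")
```

Under these assumptions the naive comparison credits the boost with four extra rides per driver, yet rolling the boost out to everyone would add no rides at all. Boosted drivers simply took rides from the control group, so the platform would pay more without earning more.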
Xu has also spent significant time on big questions surrounding the final stage of the informational life cycle: protection. This strain of his research goes beyond the pervasive conversations about the extent of data collection, and dives into subtler questions about what Xu calls “action-driven privacy”: Can a person’s actions on the internet betray information about their motivations in ways that lead to privacy breaches?
One of his studies looked at genetic data protection and another dug into the type of sequential process a person might follow when considering an online purchase. In these kinds of situations, Xu explains, “a firm or individual can act as a potential eavesdropper and actually monitor actions to infer underlying motivations or secret information.” For example, a prospective shoe buyer might click on dozens of different options on a retail platform. The site can use machine learning to predict the shopper’s final purchase based on their first few clicks, “and this gives the platform tremendous power to do promotion or even to hike up prices.”
Xu and his collaborators sought to determine how much time it would cost such a shopper to disguise their true intention by throwing fake clicks into the informational trail they left as they shopped. He argues that the answer to this question has moral and policy implications. After all, if it’s easy for a consumer to cover their online tracks, then it’s easier for companies to abdicate their responsibility to build in privacy protections. However, if escaping this type of predictive or discriminatory pricing is difficult and time-consuming, arguments for laws restricting it have more weight.
“How do we settle this debate? Simple. Look at it mathematically: What is possible to do?” Xu says. In the end, he found that the online shopper would have to replicate their activity five times to throw an eavesdropper off their trail. “Do you think it’s reasonable for a consumer to replicate their effort five times, just so that you can’t apply discriminatory pricing? Of course not. It’s just impossible.”
Xu sees great beauty and promise in the potential of mathematical models to cut through what had seemed to be blinding uncertainty and point toward sound decisions. Ultimately, Xu believes that as a researcher in the field of stochastic modeling, he has developed a glass-half-full view of uncertainty, while most of the world takes a glass-half-empty perspective. Rather than see big, uncertain questions as paralyzing, he says, it’s possible for us to relate to confusion as partial certainty. “Don’t focus on what you don’t know,” he says. “Focus on the fact that you know a little.”
Photos by Elena Zhukova