Browse or search publications from faculty affiliated with the lab.
Association of α1-Blocker Receipt With 30-Day Mortality and Risk of Intensive Care Unit Admission Among Adults Hospitalized With Influenza or Pneumonia in Denmark
Alpha 1–adrenergic receptor blocking agents (α1-blockers) have been reported to have protective benefits against hyperinflammation and cytokine storm syndrome, conditions that are associated with mortality in patients with coronavirus disease…
Policy Learning with Observational Data
In many areas, practitioners seek to use observational data to learn a treatment assignment policy that satisfies application-specific constraints, such as budget, fairness, simplicity, or other functional form constraints. For example,…
Local Linear Forests
Random forests are a powerful method for non-parametric regression, but are limited in their ability to fit smooth signals. Taking the perspective of random forests as an adaptive kernel method, we pair the forest kernel with a local linear…
Generic Drug Repurposing for Public Health and National Security: COVID-19 and Beyond
The novel disease caused by the SARS-CoV-2 virus (COVID-19) has been a shock to both our health and wealth, with more than 276,000 dead in the U.S. and economic disruption that some have estimated as high as more than $16 trillion. These…
A How-To Guide for Conducting Retrospective Analyses: Example COVID-19 Study
In the urgent setting of the COVID-19 pandemic, treatment hypotheses abound, each of which requires careful evaluation. A randomized controlled trial generally provides the strongest possible evaluation of a treatment, but the efficiency and…
Alpha-1 Adrenergic Receptor Antagonists for Preventing Acute Respiratory Distress Syndrome and Death from Cytokine Storm Syndrome
In severe viral pneumonia, including Coronavirus disease 2019 (COVID-19), the viral replication phase is often followed by hyperinflammation (‘cytokine storm syndrome’), which can lead to acute respiratory distress syndrome, multi-organ failure,…
Combining Experimental and Observational Data to Estimate Treatment Effects on Long Term Outcomes
There has been an increase in interest in experimental evaluations to estimate causal effects, partly because their internal validity tends to be high. At the same time, as part of the big data revolution, large, detailed, and representative,…
policytree: Policy Learning via Doubly Robust Empirical Welfare Maximization over Trees
The problem of learning treatment assignment policies from randomized or observational data arises in many fields. For example, in personalized medicine, we seek to map patient observables (like age, gender, heart pressure, etc.) to a treatment…
The Allocation of Decision Authority to Human and Artificial Intelligence
The allocation of decision authority by a principal to either a human agent or an artificial intelligence is examined. The principal trades off an AI’s more aligned choice with the need to motivate the human agent to expend effort in learning…
Stable Prediction with Model Misspecification and Agnostic Distribution Shift
For many machine learning algorithms, two main assumptions are required to guarantee performance. One is that the test data are drawn from the same distribution as the training data, and the other is that the model is correctly specified. In real…
SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements
We develop SHOPPER, a sequential probabilistic model of shopping data. SHOPPER uses interpretable components to model the forces that drive how a customer chooses products; in particular, we designed SHOPPER to capture how items interact with…
Sampling-based vs. Design-based Uncertainty in Regression Analysis
Consider a researcher estimating the parameters of a regression function based on data for all 50 states in the United States or on data for all visits to a website. What is the interpretation of the estimated parameters and the standard errors?…
Economists (and Economics) in Tech Companies
As technology platforms have created new markets and new ways of acquiring information, economists have come to play an increasingly central role in tech companies, tackling problems such as platform design, strategy, pricing, and policy. Over the…
Sufficient Representations for Categorical Variables
Many learning algorithms require categorical data to be transformed into real vectors before it can be used as input. Often, categorical variables are encoded as one-hot (or dummy) vectors. However, this mode of representation can be…
Balanced Linear Contextual Bandits
Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation…
Synthetic Difference in Differences
We present a new perspective on the Synthetic Control (SC) method as a weighted least squares regression estimator with time fixed effects and unit weights. This perspective suggests a generalization with two way (both unit and time) fixed…
Generalized Random Forests
We propose generalized random forests, a method for nonparametric statistical estimation based on random forests (Breiman [Mach. Learn. 45 (2001) 5–32]) that can be used to fit any quantity of interest identified as…
Estimation Considerations in Contextual Bandits
Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult…
Offline Multi-Action Policy Learning: Generalization and Optimization
In many settings, a decision-maker wishes to learn a rule, or policy, that maps from observable characteristics of an individual to an action. Examples include selecting offers, prices, advertisements, or emails to send to consumers, as well as…
Estimating Heterogeneous Consumer Preferences for Restaurants and Travel Time Using Mobile Location Data
We estimate a model of consumer choices over restaurants using data from several thousand anonymous mobile phone users. Restaurants have latent characteristics (whose distribution may depend on restaurant observables) that affect consumers’ mean…
Approximate Residual Balancing: Debiased Inference of Average Treatment Effects in High Dimensions
There are many settings where researchers are interested in estimating average treatment effects and are willing to rely on the unconfoundedness assumption, which requires that the treatment assignment be as good as random conditional on…
Stable Predictions across Unknown Environments
In many important machine learning applications, the training distribution used to learn a probabilistic classifier differs from the testing distribution on which the classifier will be used to make predictions. Traditional methods correct the…
Exact P-values for Network Interference
We study the calculation of exact p-values for a large class of non-sharp null hypotheses about treatment effects in a setting with data from experiments involving members of a single connected network. The class includes null hypotheses that…
Sampling-Based vs. Design-Based Uncertainty in Regression Analysis
Previously titled: Finite Population Causal Standard Errors
Consider a researcher estimating the parameters of a regression function based on data for all 50 states in the United States or on data for all visits to a website. What…