Does Regression Produce Representative Estimates of Causal Effects?

Published on Jan 1, 2016 in American Journal of Political Science
DOI: 10.1111/ajps.12185
Peter M. Aronow (Yale University), Cyrus Samii (NYU: New York University)
Abstract
With an unrepresentative sample, the estimate of a causal effect may fail to characterize how effects operate in the population of interest. What is less well understood is that conventional estimation practices for observational studies may produce the same problem even with a representative sample. Causal effects estimated via multiple regression differentially weight each unit's contribution. The “effective sample” that regression uses to generate the estimate may bear little resemblance to the population of interest, and the results may be nonrepresentative in a manner similar to what quasi-experimental methods or experiments with convenience samples produce. There is no general external validity basis for preferring multiple regression on representative samples over quasi-experimental or experimental methods. We show how to estimate the “multiple regression weights” that allow one to study the effective sample. We discuss alternative approaches that, under certain conditions, recover representative average causal effects. The requisite conditions cannot always be met.
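The idea of an "effective sample" can be illustrated with a small sketch. It assumes the multiple regression weights take the form of squared residuals from a regression of the treatment on the covariates; the abstract does not state the formula, so that form, the function name `regression_weights`, and the toy data below are all assumptions made for illustration, not the authors' code.

```python
import numpy as np

def regression_weights(d, X):
    """Squared residuals from an OLS regression of treatment d on covariates X.

    Assumed form of the "multiple regression weights" (not given in the abstract).
    """
    d = np.asarray(d, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(d), -1)
    Z = np.column_stack([np.ones(len(d)), X])      # add an intercept
    coef, *_ = np.linalg.lstsq(Z, d, rcond=None)   # regress treatment on covariates
    return (d - Z @ coef) ** 2

# Hypothetical toy data: two equally sized groups (g = 0 and g = 1).
# Treatment is common in group 0 (50 of 100) and rare in group 1 (5 of 100).
g = np.repeat([0.0, 1.0], 100)
d = np.concatenate([np.repeat([1.0, 0.0], [50, 50]),
                    np.repeat([1.0, 0.0], [5, 95])])
tau = np.where(g == 1, 3.0, 1.0)   # unit-level effects differ by group
y = tau * d + g                    # noiseless outcome, for clarity

w = regression_weights(d, g)
nominal_share_g1 = g.mean()                    # share of group 1 in the sample
effective_share_g1 = np.average(g, weights=w)  # share of group 1 in the effective sample

# Coefficient on d from regressing y on d and g, next to the weighted mean of tau.
Z = np.column_stack([np.ones_like(d), d, g])
ols_coef = np.linalg.lstsq(Z, y, rcond=None)[0][1]
weighted_effect = np.average(tau, weights=w)

print(nominal_share_g1, effective_share_g1)
print(ols_coef, weighted_effect)
```

In this toy setup group 1 is half of the nominal sample but only about 16% of the effective sample, because treatment barely varies within it. The regression coefficient matches the weighted average of the unit-level effects rather than their simple average of 2, which is the sense in which a representative sample can still yield a nonrepresentative estimate.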