| 1 | +# Paper Review: Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects |
| 2 | + |
| 3 | +**Authors:** Alberto Abadie |
| 4 | +**Citation:** Abadie, A. (2021). "Using Synthetic Controls: Feasibility, Data Requirements, and Methodological Aspects." *Journal of Economic Literature*, 59(2), 391–425. |
| 5 | +**PDF reviewed:** https://doi.org/10.1257/jel.20191450 (published JEL version) |
| 6 | +**Review date:** 2026-05-29 |
| 7 | + |
| 8 | +> Scope note: this is a **practical-guide / review article**. It recaps the synthetic-control estimator (attributed to Abadie & Gardeazabal 2003 and ADH 2010/2015) and contributes a synthesis on **feasibility, data requirements, contextual requirements, and inference**, plus a survey of extensions. Where it surveys other methods (Chernozhukov-Wüthrich-Zhu conformal inference; Arkhangelsky et al. synthetic DiD; Abadie-L'Hour / Ben-Michael et al. penalized & bias-corrected SC; Doudchenko-Imbens; Athey et al. matrix completion), those are **citations** — captured here only as Abadie frames them. The dedicated CWZ 2021 review is authoritative for conformal inference; the others are out of scope for this initiative. Nothing here is sourced from outside this paper. |
| 9 | + |
| 10 | +--- |
| 11 | + |
| 12 | +## Methodology Registry Entry |
| 13 | + |
| 14 | +*Formatted to match docs/methodology/REGISTRY.md. This is the richest source for the `## SyntheticControl` **assumption / warning** and **edge-case** sections.* |
| 15 | + |
| 16 | +## SyntheticControl |
| 17 | + |
| 18 | +**Primary source (this document):** Abadie, A. (2021). "Using Synthetic Controls…" *JEL*, 59(2), 391–425. https://doi.org/10.1257/jel.20191450 |
| 19 | + |
| 20 | +**Key implementation requirements:** |
| 21 | + |
| 22 | +*Notation (Section 3.1):* |
| 23 | +- `J+1` units, `j=1` treated, donors `j=2,…,J+1`; `T` periods, first `T0` pre-intervention. `Y_jt` observed; `Ŷ^N_jt` synthetic prediction of the untreated potential outcome. `X_1` `(k×1)` treated-unit predictors (may include pre-period outcomes); `X_0` `(k×J)` donor predictors. `Z_j` observed covariates; `μ_j` unobserved factor loadings. |
| 24 | + |
| 25 | +*Target and estimator (Equations 1–3, 7–8):* |
| 26 | + |
| 27 | + (1) τ_{1t} = Y^I_{1t} − Y^N_{1t} (t > T0) |
| 28 | + (2) Ŷ^N_{1t} = Σ_{j=2}^{J+1} w_j · Y_jt |
| 29 | + (3)/(8) τ̂_{1t} = Y_{1t} − Σ_{j=2}^{J+1} w_j*·Y_jt |
| 30 | + |
| 31 | + (7) W* = argmin_W ( Σ_{h=1}^{k} v_h·(X_{h1} − Σ_{j} w_j·X_{hj})^2 )^{1/2} |
| 32 | + s.t. w_j ≥ 0, Σ w_j = 1 ("constrained quadratic optimization") |
| 33 | + |
| 34 | +Footnote 8: assumptions are on `Y^N` only; since `Y_{1t}=Y^I_{1t}` is observed for `t>T0`, **no assumptions on the process generating `Y^I` are needed**. Equation (1) lets the effect vary freely over time. Special cases: equal weights `w_j=1/J` (4), population weights (5), single nearest neighbor `w_m=1` (6). |
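Equations (2) and (3)/(8) reduce to a weighted average and a subtraction once the weights are fixed. The sketch below is illustrative only: the function name `synthetic_gap` and the toy numbers are assumptions, not part of the paper or of any existing package. It uses the equal-weights special case (4) so that no optimization is needed.

```python
def synthetic_gap(y_treated, y_donors, weights):
    """Return (synthetic path, gap path) for one treated unit.

    y_treated: outcomes Y_1t, one per period.
    y_donors:  donor series, y_donors[j][t] = Y_jt.
    weights:   donor weights w_j (nonnegative, summing to one).
    """
    if len(y_donors) != len(weights):
        raise ValueError("one weight per donor is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie on the simplex")
    # Equation (2): weighted average of donor outcomes in every period.
    synthetic = [
        sum(w * series[t] for w, series in zip(weights, y_donors))
        for t in range(len(y_treated))
    ]
    # Equation (3)/(8): observed outcome minus the synthetic prediction.
    gap = [y - s for y, s in zip(y_treated, synthetic)]
    return synthetic, gap


# Toy data (hypothetical): 3 pre-periods, 1 post-period, J = 2 donors.
y1 = [10.0, 11.0, 12.0, 15.0]
donors = [[9.0, 10.0, 11.0, 12.0], [11.0, 12.0, 13.0, 14.0]]
synthetic, gap = synthetic_gap(y1, donors, [0.5, 0.5])  # equal weights, eq. (4)
```

Only the post-period entries of `gap` estimate `τ_{1t}`; the pre-period entries measure fit.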
| 35 | + |
| 36 | +*The justifying model and the identifying condition (Section 3.3):* |
| 37 | + |
| 38 | + (10) Y^N_{jt} = δ_t + θ_t·Z_j + λ_t·μ_j + ε_jt (linear factor / interactive-FE model) |
| 39 | + |
| 40 | +- **Generalizes DiD/TWFE:** restricting `λ_t = λ` (time-invariant) recovers parallel trends; the factor model relaxes this by letting the common factors `λ_t`, and hence the effect of the loadings `μ_j`, vary over time (Bai 2009 cited). |
| 41 | +- **Identifying condition:** if `X_1 = X_0 W*` (the synthetic control reproduces the treated unit's predictors **including pre-period outcomes**), then under (10) the bias of `τ̂_{1t}` is controlled by the ratio of the transitory-shock scale to `T0`. `μ_1` is unobserved and cannot be matched directly; a good pre-period-outcome match approximates it **only when the transitory-shock scale is small or `T0` is large**. A small `T0` with enough shock variation can produce a spurious pre-period match → **overfitting / bias**. |
| 42 | +- **Bias bound (cited to ADH 2010):** bias is bounded by a function **inversely proportional to `T0`**, *provided the pre-period fit is good*. "**A large `T0` cannot drive down the bias if the fit is bad.**" The bound **increases with `J`** (donor-pool size) and with the **number of unobserved factors** (components of `μ_j`). |
| 43 | + |
| 44 | +*Feasibility / convex hull (Sections 3.3, 5):* |
| 45 | +- In practice `X_1 = X_0 W*` is replaced by `X_1 ≈ X_0 W*`; **there are no ex-ante guarantees** on the size of `X_1 − X_0 W*`. When it is large, ADH 2010 recommend **against** using synthetic controls (potential for substantial bias). |
| 46 | +- The treated unit's predictor point `(X_{11},…,X_{k1})` must fall **close to the convex hull** of the donors' points. If the treated unit is **"extreme"** in some predictor (or in pre-period outcomes), no weighted average reproduces it → "the conventional synthetic control estimator should not be used in that case." |
| 47 | +- The simplex constraint **prevents extrapolation** but **not interpolation bias**: averaging away large discrepancies between dissimilar donors biases the estimate → **restrict the donor pool to similar units**. |
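A minimal sketch of the inner weight solve from Equation (7) and of the predictor gap `X_1 − X_0 W*` that the feasibility discussion above turns on. The paper does not specify a solver, so projected gradient descent is an assumption made here for self-containment; the names `project_to_simplex`, `fit_weights`, and `predictor_gap` and the toy donor points are likewise hypothetical.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def fit_weights(X1, X0, v=None, n_iter=5000):
    """Minimize sum_h v_h (X1_h - (X0 w)_h)^2 over the simplex."""
    k, J = X0.shape
    v = np.ones(k) if v is None else np.asarray(v, dtype=float)
    A = X0 * np.sqrt(v)[:, None]          # rows rescaled by sqrt(v_h)
    b = X1 * np.sqrt(v)
    # Step size 1/L, where L = 2 * sigma_max(A)^2 bounds the gradient's slope.
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2 + 1e-12)
    w = np.full(J, 1.0 / J)               # start from equal weights
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ w - b)
        w = project_to_simplex(w - step * grad)
    return w


def predictor_gap(X1, X0, w, v=None):
    """V-weighted norm of X1 - X0 w, the quantity with no ex-ante guarantee."""
    v = np.ones(len(X1)) if v is None else np.asarray(v, dtype=float)
    return float(np.sqrt(np.sum(v * (X1 - X0 @ w) ** 2)))


# Toy donors (hypothetical): k = 2 predictors, J = 3 donors at (0,0), (1,0), (0,1).
X0 = np.array([[0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
X1_inside = np.array([0.2, 0.3])    # inside the donors' convex hull
X1_extreme = np.array([2.0, 2.0])   # "extreme" treated unit, outside the hull

w_inside = fit_weights(X1_inside, X0)
w_extreme = fit_weights(X1_extreme, X0)
gap_inside = predictor_gap(X1_inside, X0, w_inside)
gap_extreme = predictor_gap(X1_extreme, X0, w_extreme)
```

For the point inside the hull the gap is numerically zero. For the extreme point the weights land on the nearest hull point and the gap stays large, which is the situation where the paper advises against the conventional estimator. How large a gap should trigger a warning is not given in the paper and would have to be chosen at implementation.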
| 48 | + |
| 49 | +*`V` (predictor-importance) selection (Section 3.2; this paper formalizes the options):* |
| 50 | +- **(a) Inverse-variance:** set `v_h = 1/Var(X_{h·})` (rescales each predictor row to unit variance). |
| 51 | +- **(b) Nested MSPE minimization (AG 2003 / ADH 2010):** choose `V` so `W(V)` minimizes pre-period outcome MSPE `Σ_{t∈𝒯0} (Y_{1t} − Σ_j w_j(V)·Y_jt)²` over a set `𝒯0 ⊆ {1,…,T0}`. |
| 52 | +- **(c) Out-of-sample cross-validation (ADH 2015), formalized 4-step (Equation 9):** split pre-period into training `1..t0` and validation `t0+1..T0` (concretely `t0 = T0/2`); compute `W̃(V)` on training data; pick `V*` minimizing validation MSPE (9); recompute `W* = W(V*)` using the validation-window predictors. |
| 53 | +- **Footnote 7 (non-uniqueness):** CV weights need not be unique; can add a ridge-type penalty `γ·Σ_h v_h²` (`γ>0`) favoring dense weights. Demonstrate robustness to the `V` choice (Klößner et al. 2018 cited). |
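The two mechanical pieces of the options above, the inverse-variance choice (a) and the train/validation split in (c), can be sketched as follows. The helper names are hypothetical. Two details are assumptions not settled by the text summarized here: the variance is taken across all units as a population variance, and an odd `T0` is split with floor division.

```python
from statistics import pvariance


def inverse_variance_v(X):
    """X[h] is predictor row h across units; returns v_h = 1 / Var(X_h)."""
    v = []
    for row in X:
        var = pvariance(row)  # population variance (assumption)
        if var == 0:
            raise ValueError("constant predictor has no variance to rescale")
        v.append(1.0 / var)
    return v


def cv_split(T0):
    """Training periods 1..t0 and validation periods t0+1..T0, with t0 = T0 // 2."""
    t0 = T0 // 2  # floor for odd T0 (assumption)
    training = list(range(1, t0 + 1))
    validation = list(range(t0 + 1, T0 + 1))
    return training, validation


# Toy predictors (hypothetical): two rows on very different scales.
X = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
v = inverse_variance_v(X)
training, validation = cv_split(10)
```

The second row is ten times the scale of the first and receives one hundredth of the weight, so both rows contribute on a unit-variance footing.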
| 54 | + |
| 55 | +*Predictor / variable selection (Section 3.4):* |
| 56 | +- Predictors typically combine **pre-period outcomes** (crucial for matching `μ_j`; arise organically under a VAR DGP) **and** other covariates `Z_j`. Covariates omitted from `Z_j` are "mechanically absorbed into `μ_j`," increasing the bias bound — so **include real covariates**, don't rely on lagged outcomes alone. |
| 57 | +- Flexibility: need not use every pre-period outcome; a **summary** (e.g., a pre-period mean) can suffice when outcomes co-move, and **increases weight sparsity** (number of nonzero `w_j` is controlled by the number of predictors). |
| 58 | +- **Post-intervention outcomes are NOT used** to compute weights → weights are a **design-phase** object (safeguard against specification search / p-hacking; can be pre-registered). |
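As an illustration of the predictor-construction points above, the sketch below stacks one pre-period outcome summary (the mean) on top of the covariates. `build_predictors` and the data layout are hypothetical; the only property taken from the paper is that post-intervention outcomes never enter.

```python
def build_predictors(Y, Z, T0):
    """Return predictor rows: one pre-period outcome mean, then each covariate.

    Y[j] is the outcome series of unit j; Z[j] is its covariate list.
    Column j of the result is the predictor vector of unit j.
    """
    # Only the first T0 periods are read, so the weights stay a design-phase object.
    pre_mean_row = [sum(series[:T0]) / T0 for series in Y]
    covariate_rows = [list(row) for row in zip(*Z)]
    return [pre_mean_row] + covariate_rows


# Toy data (hypothetical): 2 units, T0 = 2 pre-periods, 1 post-period, 1 covariate.
Y = [[1.0, 3.0, 9.0], [2.0, 4.0, 9.0]]
Z = [[5.0], [7.0]]
X = build_predictors(Y, Z, T0=2)
```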
| 59 | + |
| 60 | +*Standard errors / inference (Sections 3.5, 8):* |
| 61 | +- **No SEs in the classical sense.** Inference is **permutation / placebo-based** (design-based, conditioning on the sample), **not** sampling-based. Rationale: small / single treated unit, no randomization, sample often = population. |
| 62 | +- **RMSPE-ratio permutation test (Equations 11–12):** |
| 63 | + |
| 64 | + (11) R_j(t1,t2) = ( (1/(t2−t1+1)) · Σ_{t=t1}^{t2} (Y_jt − Ŷ^N_jt)^2 )^{1/2} (RMSPE for unit j) |
| 65 | + (12) r_j = R_j(T0+1, T) / R_j(1, T0) (post/pre ratio) |
| 66 | + |
| 67 | + `Ŷ^N_jt` is the synthetic control built treating unit `j` as treated (other `J` units as donors). p-value: |
| 68 | + |
| 69 | + p = (1/(J+1)) · Σ_{j=1}^{J+1} 𝟙₊(r_j − r_1) (fraction of units with ratio ≥ the treated unit's r_1) |
| 70 | + |
| 71 | + Alternative: use the distribution of post-period `R_j(T0+1,T)` after discarding placebos with pre-period `R_j(1,T0)` ≫ `R_1(1,T0)`. |
| 72 | +- **Confidence intervals by test inversion** (Firpo & Possebom 2018 cited) — invert the permutation test over hypothesized effect values. |
| 73 | +- **One-sided tests** via positive/negative parts `(Y_jt − Ŷ^N_jt)^±` of the gap → power gain (treated-unit-contaminated placebos tend to produce opposite-sign effects). |
| 74 | +- **Visualize** the permutation distribution of `r_j` or of placebo gaps `Y_jt − Ŷ^N_jt` (conveys magnitude, not just a p-value). |
| 75 | +- **Surveyed alternatives (citations — see dedicated reviews):** Chernozhukov-Wüthrich-Zhu (2021) **conformal inference** (time-permutation of constrained-LS residuals under the null, valid under residual **exchangeability**, weights re-estimated under the null using all periods); CWZ (2019b) bias-corrected CIs (asymptotically pivotal t-stat + cross-fitting, large `T0` and `T−T0`); Cattaneo-Feng-Titiunik **predictive intervals** (estimation + irreducible-error uncertainty); Hahn-Shi / Andrews (2003) **end-of-sample instability** test. |
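Equations (11) and (12) and the permutation p-value can be computed directly from the gap series `Y_jt − Ŷ^N_jt` of the treated unit and of each placebo run. The sketch below assumes those gaps are already available. The function names and toy gaps are hypothetical, and a placebo with a perfect pre-period fit would divide by zero, which a real implementation would need to handle.

```python
from math import sqrt


def rmspe(gaps, t1, t2):
    """R_j(t1, t2): root mean squared gap over periods t1..t2 (1-indexed, inclusive)."""
    window = gaps[t1 - 1:t2]
    return sqrt(sum(g * g for g in window) / len(window))


def rmspe_ratio(gaps, T0):
    """r_j: post-period RMSPE divided by pre-period RMSPE, equation (12)."""
    return rmspe(gaps, T0 + 1, len(gaps)) / rmspe(gaps, 1, T0)


def permutation_p_value(gaps_by_unit, T0):
    """gaps_by_unit[0] is the treated unit; the rest are placebo runs."""
    ratios = [rmspe_ratio(g, T0) for g in gaps_by_unit]
    r1 = ratios[0]
    # Fraction of units (treated included) whose ratio is at least r_1.
    p = sum(r >= r1 for r in ratios) / len(ratios)
    return p, ratios


# Toy gaps (hypothetical): T0 = 2 pre-periods, 2 post-periods, J = 3 placebos.
gaps_by_unit = [
    [1.0, -1.0, 4.0, 4.0],   # treated unit
    [1.0, 1.0, 1.0, 1.0],
    [2.0, -2.0, 2.0, 2.0],
    [1.0, -1.0, 2.0, -2.0],
]
p, ratios = permutation_p_value(gaps_by_unit, T0=2)
```

With `J + 1 = 4` units the smallest attainable p-value is `1/4`, which shows how the donor-pool size caps the resolution of the test.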
| 76 | + |
| 77 | +*Edge cases / contextual requirements (Section 5 — the failure modes):* |
| 78 | +- **Effect size vs. volatility:** small effects are masked by volatile outcomes; high *unit-specific* volatility raises overfitting risk → consider de-noising/filtering (only unit-specific noise hurts; common-factor volatility is differenced out by the SC). |
| 79 | +- **No suitable comparison group:** exclude donors that (i) adopted a similar intervention, or (ii) suffered large idiosyncratic shocks not shared by the treated unit; restrict to comparable units (interpolation-bias control). |
| 80 | +- **Anticipation:** if agents react before formal implementation, **backdate** the intervention. Backdating does **not** mechanically bias the estimator because (1)/(3) allow time-varying effects (unlike constant-effect panel models). |
| 81 | +- **Interference / spillovers (SUTVA, Rubin 1980):** enforce in design (drop possibly-affected donors) or reason about the **sign of the bias** (e.g., negative spillover onto contributing donors → estimate is a *lower bound*). Sparsity + transparency of weights makes this feasible. |
| 82 | +- **Outcome transformations & a differencing pitfall:** level mismatch can be handled via differences, growth rates, or **demeaning** `Ȳ_jt = Y_jt − (1/T0)Σ_{h≤T0} Y_jh` (≡ Doudchenko-Imbens constant shift). **But** differencing inflates the noise variance when `ε_jt` is roughly independent in time → higher overfitting/bias; the differenced model retains the factor structure `ΔY^N_jt = Δδ_t + Δθ_t Z_j + Δλ_t μ_j + Δε_jt`. |
| 83 | +- **Short pre-period:** spurious (near-)perfect fit → unreliable counterfactual; mitigate with powerful non-outcome predictors (reduce residual variance). |
| 84 | +- **Structural breaks:** a long `T0` risks violating constant-factor-loadings; up-weight (`v_h`) the most recent predictors to alleviate. |
| 85 | +- **Time horizon:** effects may emerge slowly → need enough post-periods, or surrogate/leading indicators. |
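The demeaning transformation from the outcome-transformations item above, next to plain first differences for comparison. Both helper names are hypothetical and the series is toy data.

```python
def demean_by_pre_period(y, T0):
    """Subtract the unit's own pre-period mean from every period."""
    pre_mean = sum(y[:T0]) / T0
    return [value - pre_mean for value in y]


def first_difference(y):
    """Period-to-period changes; one observation shorter than the input."""
    return [later - earlier for earlier, later in zip(y, y[1:])]


# Toy series (hypothetical): T0 = 3 pre-periods, 1 post-period.
y = [2.0, 4.0, 6.0, 10.0]
demeaned = demean_by_pre_period(y, T0=3)
differenced = first_difference(y)
```

Demeaning removes the level mismatch while leaving each period's noise term as it was. Differencing combines two adjacent noise terms in every observation, which is the source of the variance inflation the same item warns about.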
| 86 | + |
| 87 | +*Sparsity (Section 4):* synthetic-control weights are **sparse** — when `X_1` is outside the donor convex hull and donors are in "general position," the solution is **unique with ≤ `k` nonzero weights** (projection of `X_1` onto the hull). Sparsity here is for **interpretability** (the identity/magnitude of nonzero weights matters), unlike lasso where sparsity is an anti-overfitting device. With many treated units inside the hull, weights may be non-unique (penalized SC restores uniqueness). |
| 88 | + |
| 89 | +**Reference implementation(s):** |
| 90 | +- Authors' `Synth` package for **R, MATLAB, and Stata** (Section 3.2 footnote; documented in Abadie, Diamond & Hainmueller 2011, *J. Stat. Software* 42(13)). |
| 91 | + |
| 92 | +**Requirements checklist (guidance this paper adds beyond 2010/2015):** |
| 93 | +- [ ] Convex-hull / "extreme treated unit" guard → warn / refuse when pre-period fit is poor or the treated unit is extreme. |
| 94 | +- [ ] `V`-selection: inverse-variance, nested-MSPE, and CV (with a documented `t0=T0/2`-style default + optional ridge `γΣv_h²` for non-uniqueness). |
| 95 | +- [ ] Encourage covariates in addition to lagged outcomes; allow pre-period-outcome summaries (sparsity). |
| 96 | +- [ ] Permutation inference: RMSPE-ratio p-value `(#{r_j≥r_1})/(J+1)`; one-sided variants; CI by test inversion; placebo-distribution visualization. |
| 97 | +- [ ] Weights computed from **pre-intervention data only** (design-phase guarantee). |
| 98 | +- [ ] Diagnostics: in-time placebo / backdating, leave-one-out, donor-pool & predictor robustness. |
| 99 | +- [ ] Warnings for the failure modes (volatility, contamination, anticipation, interference, differencing, short pre-period, structural breaks). |
| 100 | + |
| 101 | +--- |
| 102 | + |
| 103 | +## Implementation Notes |
| 104 | + |
| 105 | +### Data Structure Requirements |
| 106 | +- Aggregate panel (outcome + predictors) for the treated unit and a curated donor pool; **large pre-intervention window**; enough post-periods for the effect to manifest; balanced panel; single (or few) treated units with block timing. |
| 107 | + |
| 108 | +### Computational Considerations |
| 109 | +- Inner weight solve = constrained quadratic optimization over the simplex (Section 3.2 names it as such). |
| 110 | +- `V` selection adds an outer loop (nested-MSPE or CV-validation evaluation). Permutation inference re-runs estimation `J` more times, once per pseudo-treated donor, for `J+1` fits in total including the treated unit. |
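The cost of permutation inference comes from a loop like the one sketched below, which refits once per unit with the remaining `J` units as donors. The names `placebo_gaps` and `equal_weights` are hypothetical. `equal_weights` is only a stand-in for the real simplex-constrained solve, so the numbers illustrate the loop structure and not an actual synthetic-control fit.

```python
def placebo_gaps(Y, T0, fit):
    """Return one gap series per unit, treating each unit in turn as treated.

    Y[j][t] are outcomes with unit 0 the actual treated unit.
    fit(target_pre, donors_pre) must return one weight per donor and only
    ever sees pre-intervention data.
    """
    all_gaps = []
    for j in range(len(Y)):
        donors = [Y[i] for i in range(len(Y)) if i != j]
        weights = fit(Y[j][:T0], [d[:T0] for d in donors])
        gaps = [
            Y[j][t] - sum(w * d[t] for w, d in zip(weights, donors))
            for t in range(len(Y[j]))
        ]
        all_gaps.append(gaps)
    return all_gaps


def equal_weights(target_pre, donors_pre):
    """Placeholder fit: the equal-weights special case, ignoring the target."""
    return [1.0 / len(donors_pre)] * len(donors_pre)


# Toy panel (hypothetical): 4 units, T0 = 3 pre-periods, 1 post-period.
Y = [
    [1.0, 2.0, 3.0, 10.0],   # treated unit
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0, 6.0],
    [2.0, 3.0, 4.0, 5.0],
]
all_gaps = placebo_gaps(Y, T0=3, fit=equal_weights)
```

Passing the solver in as `fit` keeps the loop independent of how weights are computed, so the same structure would serve for leave-one-out and donor-pool robustness checks.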
| 111 | + |
| 112 | +### Tuning Parameters |
| 113 | + |
| 114 | +| Parameter | Type | Default guidance (this paper) | Selection Method | |
| 115 | +|-----------|------|-------------------------------|------------------| |
| 116 | +| `V` (predictor importance) | nonneg vector | data-driven | inverse-variance; nested pre-period MSPE; or CV (`t0=T0/2`); optional ridge `γΣv_h²` for non-uniqueness | |
| 117 | +| Predictors `X` | matrix | lagged outcomes + covariates | include real covariates; outcome summaries increase sparsity; data-driven via train/validation | |
| 118 | +| Donor pool | set | curated, similar units | exclude treated-like / shocked / dissimilar units; limit size (overfitting) | |
| 119 | +| Pre/post window | indices | as long a pre-window as structurally stable | backdate under anticipation; up-weight recent predictors under break risk | |
| 120 | + |
| 121 | +### Relation to Existing diff-diff Estimators |
| 122 | +- Same `SyntheticControl` estimator as the 2010/2015 reviews. This paper is the source for the **assumptions/warnings** and **edge-case** REGISTRY content and for the **formalized CV `V`-selection** (`t0=T0/2`) and the **CI-by-test-inversion / one-sided** inference refinements (relevant to PR-2/PR-3). |
| 123 | +- It positions **synthetic DiD (Arkhangelsky et al.)** — already implemented as `SyntheticDiD` — as "an SC that additionally weights pre-intervention time periods," confirming classic SCM is the unit-weights-only special case. |
| 124 | +- It positions **conformal inference (CWZ)** as the sampling-based complement to permutation inference — the basis for PR-3 (authoritative details in the CWZ review). |
| 125 | + |
| 126 | +--- |
| 127 | + |
| 128 | +## Gaps and Uncertainties |
| 129 | + |
| 130 | +- **No new estimator/algorithm numerics.** The inner solver, `V`-search routine, and starting values are not specified (referenced to AG 2003 / ADH 2010 and the `Synth` software). The CV `t0=T0/2` split is explicitly "heuristic." |
| 131 | +- **CV-weight non-uniqueness** is acknowledged (footnote 7) with a ridge remedy `γΣv_h²` but no default `γ`; an implementation must pick a deterministic tie-break. |
| 132 | +- **Surveyed inference methods are citation-level here.** The conformal recipe (CWZ), predictive intervals (Cattaneo et al.), and bias-corrected CIs (CWZ 2019b) are summarized but their exact algorithms/assumptions must come from the primary papers (CWZ 2021 is reviewed separately; the others are out of scope). |
| 133 | +- **Multiple treated units, penalized SC, bias correction, matrix completion** (Section 8) are surveyed (Eqs. 13–18 transcribed as Abadie presents them) but are **deferred** (augmented SC) or out of scope; not part of the classic-SCM implementation. |
| 134 | +- **Effect-size/volatility de-noising** (singular-value thresholding, Amjad-Shah-Shen) is mentioned as mitigation but not prescribed — a judgment call left to the analyst. |
| 135 | +- **"Extreme treated unit" / convex-hull check** is qualitative ("falls close to the convex hull") — a concrete numerical hull-distance or fit threshold for a warning must be chosen at implementation. |