Commit 4bf60fa

Merge pull request #501 from igerber/feature/synthetic-control
2 parents c6ca2bb + fca42fe commit 4bf60fa

20 files changed

Lines changed: 3856 additions & 1 deletion

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
+- **New estimator: `SyntheticControl` — classic Synthetic Control Method (Abadie, Diamond & Hainmueller 2010; Abadie & Gardeazabal 2003).** Standalone estimator (`diff_diff/synthetic_control.py`) + `SyntheticControlResults` (`diff_diff/synthetic_control_results.py`) + `synthetic_control()` convenience function, exported from `diff_diff`. Builds a single treated unit's counterfactual as a convex combination of never-treated donor units — **donor (unit) weights only**, no time weights or ridge, distinct from `SyntheticDiD`. The inner simplex-constrained weighted-LS solve `W*(V)` reuses `utils._sc_weight_fw` (folding `V^½` into the predictor matrix, `intercept=False`, `zeta=0`); the diagonal predictor-importance matrix `V` is selected data-driven by minimizing pre-period outcome MSPE (`v_method="nested"`, softmax-on-simplex multistart Nelder-Mead + Powell polish) or supplied by the user (`v_method="custom"`). Predictors are built from `predictors`/`predictor_window`/`predictors_op`, `special_predictors`, and per-period outcome lags (`pre_period_outcomes`), in the R `Synth::dataprep` row order; per-row standardization (SD over donors+treated, ddof=1) matches the R `Synth::synth` source. Reports the gap path (`α̂_1t = Y_1t − Σ_j w_j Y_jt`), `att` (mean post-period gap), `pre_rmspe`, donor weights, `v_weights`, and a predictor-balance table. **No analytical standard error** — `se`/`t_stat`/`p_value`/`conf_int` are NaN (in-space placebo permutation inference with the post/pre RMSPE-ratio statistic is planned for a follow-up release; `_placebo_gaps`/`_rmspe_ratio`/`_fit_snapshot` are reserved on the results object). Ten validation gates baked in: predictor-period leakage, absorbing post-period suffix + no-anticipation cross-check against the treatment column, post-period canonicalization, donor-pool filtering before period derivation, empty-window rejection, poor-pre-fit `UserWarning` (RMSPE > SD of treated pre-outcomes), duplicate-predictor-label rejection, inner-solve non-convergence warning, order-independent gap-path rebuild, and the `standardize="none"` deviation; plus fail-closed `custom_v` cross-field rules and degenerate single-donor / single-pre-period handling. **R-`Synth` parity** (`tests/test_methodology_synthetic_control.py`, fixtures generated by `benchmarks/R/generate_synth_basque_golden.R` into `tests/data/`): two-tier on the Basque Country study — Tier-1 feeds R's `solution.v` via `custom_v` and reproduces the published donor weights (region 10 Cataluña 0.851 + region 14 Madrid 0.149) to `atol=1e-3` deterministically; Tier-2 (`@pytest.mark.slow`) checks the data-driven nested fit lands in a tolerance band (the nested `V` legitimately differs because the outer objective uses all pre periods, not R's `time.optimize.ssr` window). Documented in `docs/methodology/REGISTRY.md` §SyntheticControl (with `**Deviation from R:** standardize="none"` and `**Note:**` labels for the standardization formula, objective window, softmax `V` parametrization, and 1×SD poor-fit threshold), `docs/api/synthetic_control.rst`, the LLM guides, and `README.md`.
- **ConleySpatialHAC methodology-review-tracker promotion: In Progress → Complete.** Closes the Conley (1999) *Journal of Econometrics* 92(1) primary-source review on the methodology-review tracker. The paper review on file at `docs/methodology/papers/conley-1999-review.md` was previously merged (2026-05-09); this PR is the F.L.I.P. consolidation — new `tests/test_methodology_conley.py` with paper-equation-numbered Verified Components walk-through (~1600 LoC; 10 classes; 60 tests, 5 of them `@pytest.mark.slow`). Coverage: Eq. 4.2 cross-sectional sandwich (pairwise-distance specialization; the project's paper review identifies Eq. 4.2 page 18 as the real-valued/pairwise form, with Eq. 3.13 reserved for the lattice-indexed form), Eq. 4.2 HC0 + rank-1 limits, Andrews (1991) HAC lag truncation matching `conleyreg::time_dist.cpp`, haversine convention with Earth radius 6371.01 km, Phase 2 panel block-decomposed sandwich at `atol=1e-12`, sparse k-d-tree dense-vs-sparse bit-identity (Wave A #120 numerical correctness), and R `conleyreg` v0.1.9 parity at `atol=1e-6` on 6 fixtures (3 cross-sectional + 3 panel) plus the sparse-forced and time-asymmetric kernel parity contracts. Three dedicated deviations-area classes: `TestConleyLibraryExtensions` (Wave A library extensions — combined spatial+cluster product kernel #119, callable conley_metric validation #123, sparse k-d-tree activation #120, indefiniteness guard), `TestConleyDeviationsFromR` (1-D radial Bartlett vs paper's 2-D separable Eq. 3.14, time-label normalization via `np.unique`, independent temporal kernel deferred), and `TestConleyDeferrals` (5 fail-closed `NotImplementedError`/`TypeError` contracts: LinearRegression + survey_design, DiD/MPD/TWFE + survey_design, Conley + weights, SyntheticDiD + Conley, wild_bootstrap + Conley). 
Methodology-anchored tests extracted from `tests/test_conley_vcov.py`: full classes `TestConleyDirectHelper`, `TestConleyReductions`, `TestConleyReductionsAddendum`, `TestConleyParityR`, `TestConleyParitySpacetime`, `TestConleyPanelHelper`, `TestConleySparseRParityForced`; plus methodology-anchored tests from `TestConleyKernels`, `TestConleyDistanceMetrics`, `TestConleySparse`. File drops 4248 → 3113 lines after extraction. Defensive surface preserved: input validation, NaN/inf guards, dispatch-level validity, estimator-level integration smoke tests, set_params atomicity, sparse-path activation thresholds + density-gate fallback. `METHODOLOGY_REVIEW.md` row L91 promoted to **Complete** with `Last Review = 2026-05-26`; detail block rewritten with Verified Components / Test Coverage / R Comparison Results inline table / Corrections Made / Deviations / Outstanding Concerns. Priority queue at L1386 pruned: PreTrendsPower removed (already Complete since 2026-05-19) and ConleySpatialHAC removed (this PR); substantive-review-blocked renumbered #2-#5 → #1-#4 and consolidation-pass-blocked renumbered #6-#8 → #5-#6.

### Added / Changed
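
A minimal sketch of the reported quantities named in the `SyntheticControl` changelog entry above: the gap path `α̂_1t = Y_1t − Σ_j w_j Y_jt`, `att` as the mean post-period gap, `pre_rmspe`, and the poor-pre-fit warning. The function name `gap_summary`, its arguments, and the toy numbers are hypothetical and are not taken from `diff_diff`. The sample SD (ddof=1) used for the warning threshold is also an assumption, since the entry only says "SD of treated pre-outcomes".

```python
import math
import statistics
import warnings


def gap_summary(treated, donors, weights, n_pre):
    """Gap path, ATT and pre-period RMSPE for one treated unit.

    treated : list of outcomes Y_1t, ordered by period
    donors  : dict donor name -> list of outcomes Y_jt (same period order)
    weights : dict donor name -> weight w_j (non-negative, summing to 1)
    n_pre   : number of leading periods that are pre-treatment
    """
    # Synthetic outcome per period: weighted sum over donors.
    synthetic = [
        sum(weights[j] * donors[j][t] for j in weights)
        for t in range(len(treated))
    ]
    gap = [y - s for y, s in zip(treated, synthetic)]
    pre_gap, post_gap = gap[:n_pre], gap[n_pre:]
    pre_rmspe = math.sqrt(sum(g * g for g in pre_gap) / len(pre_gap))
    att = sum(post_gap) / len(post_gap)
    # Poor-pre-fit gate (assumed form): RMSPE above the SD of treated pre-outcomes.
    if pre_rmspe > statistics.stdev(treated[:n_pre]):
        warnings.warn("poor pre-treatment fit", UserWarning)
    return gap, att, pre_rmspe


# Toy data (illustrative only): three pre periods, two post periods.
treated = [2.0, 3.0, 4.0, 6.5, 7.5]
donors = {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [5.0, 6.0, 7.0, 8.0, 9.0]}
weights = {"a": 0.75, "b": 0.25}
gap, att, pre_rmspe = gap_summary(treated, donors, weights, n_pre=3)
```

With these toy inputs the synthetic path matches the treated path in every pre period, so `pre_rmspe` is zero and `att` is the average of the two post-period gaps.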

README.md

Lines changed: 1 addition & 0 deletions
@@ -108,6 +108,7 @@ Full guide: `diff_diff.get_llm_guide("practitioner")`.
- [TwoStageDiD](https://diff-diff.readthedocs.io/en/stable/api/two_stage.html) - Gardner (2022) two-stage estimator with GMM sandwich variance
- [SpilloverDiD](https://diff-diff.readthedocs.io/en/stable/api/spillover.html) - Butts (2021) ring-indicator spillover-aware DiD identifying direct effect on treated + per-ring spillover on near-control units; handles non-staggered and staggered timing; supports survey-design variance under `survey_design=` for HC1 / CR1 (Wave E.1 Binder TSL) and Conley (Wave E.2 panel-aware stratified-Conley sandwich on per-period PSU totals; extended in Wave E.2 follow-up to `conley_lag_cutoff > 0` via panel-block composition with within-PSU serial Bartlett HAC — `lag>0` requires an effective PSU via explicit `survey_design.psu` or injected `cluster=<col>`); `SurveyDesign.subpopulation()` preserves full-design `n_psu` / `df_survey` via zero-padded scores (Wave E.3, R `svyrecvar(subset())` form)
- [SyntheticDiD](https://diff-diff.readthedocs.io/en/stable/api/estimators.html) - Synthetic DiD combining standard DiD and synthetic control for few treated units
+- [SyntheticControl](https://diff-diff.readthedocs.io/en/stable/api/synthetic_control.html) - Abadie, Diamond & Hainmueller (2010) classic synthetic control for a single treated unit (donor-weight counterfactual, nested/custom V; no inference in this release — permutation/placebo planned)
- [TripleDifference](https://diff-diff.readthedocs.io/en/stable/api/triple_diff.html) - triple difference (DDD) estimator for designs requiring two criteria for treatment eligibility
- [ContinuousDiD](https://diff-diff.readthedocs.io/en/stable/api/continuous_did.html) - Callaway, Goodman-Bacon & Sant'Anna (2024) continuous treatment DiD with dose-response curves
- [HeterogeneousAdoptionDiD](https://diff-diff.readthedocs.io/en/stable/api/had.html) - de Chaisemartin, Ciccia, D'Haultfœuille & Knau (2026) for designs where **no unit remains untreated**; local-linear estimator at the dose support boundary returning Weighted Average Slope (WAS) on Design 1' (`d̲ = 0` / QUG) or `WAS_{d̲}` on Design 1 (`d̲ > 0`, continuous-near-d̲ or mass-point), with a multi-period event-study extension (last-treatment cohort, pointwise CIs). **Panel-only** in this release - repeated cross-sections rejected by the validator. Alias `HAD`.
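
The "donor-weight counterfactual" in the SyntheticControl entry rests on a simplex-constrained weighted least-squares solve for the donor weights given a diagonal `V`. The sketch below illustrates that kind of solve with a plain Frank-Wolfe loop and exact line search, folding `V^½` into the predictor rows. It is an assumption-laden illustration: `simplex_weights` is a hypothetical name, and nothing here reproduces the internals of the library's `utils._sc_weight_fw`.

```python
import math


def simplex_weights(x1, X0, v, n_iter=2000):
    """Frank-Wolfe for min_w sum_k v_k (x1_k - sum_j X0[k][j] w_j)^2 on the simplex.

    x1 : list of K predictor values for the treated unit
    X0 : K x J nested list of donor predictor values
    v  : list of K non-negative predictor-importance weights (diagonal of V)
    """
    K, J = len(X0), len(X0[0])
    # Fold V^(1/2) into the rows so the problem becomes plain least squares.
    b = [math.sqrt(v[k]) * x1[k] for k in range(K)]
    A = [[math.sqrt(v[k]) * X0[k][j] for j in range(J)] for k in range(K)]
    w = [1.0 / J] * J  # start at the simplex centre
    for _ in range(n_iter):
        fitted = [sum(A[k][j] * w[j] for j in range(J)) for k in range(K)]
        resid = [fitted[k] - b[k] for k in range(K)]
        grad = [2.0 * sum(A[k][j] * resid[k] for k in range(K)) for j in range(J)]
        s = min(range(J), key=grad.__getitem__)  # vertex with the steepest descent
        # Direction d = e_s - w, expressed in predictor space as A d.
        Ad = [A[k][s] - fitted[k] for k in range(K)]
        denom = sum(a * a for a in Ad)
        if denom == 0.0:
            break
        # Exact line search for the quadratic objective, clipped to [0, 1].
        step = min(1.0, max(0.0, -sum(r * a for r, a in zip(resid, Ad)) / denom))
        if step == 0.0:
            break
        w = [(1.0 - step) * wj for wj in w]
        w[s] += step
    return w


# Toy check (illustrative numbers): x1 is exactly 0.75 * donor 0 + 0.25 * donor 1.
X0 = [[1.0, 3.0], [2.0, 0.0], [0.0, 4.0]]
x1 = [1.5, 1.5, 1.0]
w = simplex_weights(x1, X0, v=[0.5, 0.3, 0.2])
```

With two donors the feasible set is a line segment, so the exact line search lands on the minimizer after one step. With many donors Frank-Wolfe converges more slowly, and a production solver would need a convergence check.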

TODO.md

Lines changed: 1 addition & 0 deletions
@@ -84,6 +84,7 @@ Deferred items from PR reviews that were not addressed before merge.
| ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) |
| Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |
| Survey design resolution/collapse patterns are inconsistent across panel estimators — ContinuousDiD rebuilds unit-level design in SE code, EfficientDiD builds once in fit(), StackedDiD re-resolves on stacked data; extract shared helpers for panel-to-unit collapse, post-filter re-resolution, and metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
+| SyntheticControl: `SyntheticControlResults` not wired into the practitioner / DiagnosticReport / BusinessReport routing, so routing SCM results through those tools yields generic parallel-trends/HonestDiD guidance that doesn't fit SCM. Add SCM to the native-routed rejection sets (mirror SDiD/TROP) and surface SCM-native diagnostics (pre-fit / in-space placebo / in-time placebo / leave-one-out). Deferred to PR-2, where it pairs with the placebo-inference layer those reports would surface. | `practitioner.py`, `diagnostic_report.py`, `business_report.py` | SCM PR-1 → PR-2 | Medium |
| ContinuousDiD deferred CGBS 2024 extensions: (a) `covariates=` kwarg not implemented (matches R `contdid` v0.1.0); (b) discrete-treatment saturated regression deferred (integer-valued dose currently warned, not routed to per-level coefficients); (c) lowest-dose-as-control per CGBS 2024 Remark 3.1 (when `P(D=0) = 0`) not implemented — estimator requires never-treated controls. REGISTRY `## ContinuousDiD` → Implementation Checklist marks these as deferred `[ ]` items. | `diff_diff/continuous_did.py` || Low |
| Survey-weighted Silverman bandwidth in EfficientDiD conditional Omega*`_silverman_bandwidth()` uses unweighted mean/std for bandwidth selection; survey-weighted statistics would better reflect the population distribution but is a second-order refinement | `efficient_did_covariates.py` || Low |
| TROP: extend Wave 4's `_setup_trop_data` helper to also cover the duplicated bootstrap resampling loop in `_bootstrap_variance` / `_bootstrap_variance_global` (~40 LoC dedup; mirrors the data-setup helper pattern with a `fit_callable` parameter for the per-draw refit step). | `trop_local.py`, `trop_global.py` | follow-up | Low |
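
The SyntheticControl row above defers the in-space placebo diagnostics, and the changelog names a post/pre RMSPE-ratio statistic for that planned inference. As a rough sketch of the textbook form of that statistic, and not of anything shipped in this commit, the ratio and its permutation p-value could look like the following. The helper names `rmspe`, `rmspe_ratio`, and `placebo_p_value` are hypothetical, and the gap paths are made-up numbers.

```python
import math


def rmspe(gaps):
    """Root mean squared prediction error of a list of gaps."""
    return math.sqrt(sum(g * g for g in gaps) / len(gaps))


def rmspe_ratio(gap, n_pre):
    """Post/pre RMSPE ratio; the first n_pre entries of gap are pre-treatment."""
    return rmspe(gap[n_pre:]) / rmspe(gap[:n_pre])


def placebo_p_value(treated_gap, placebo_gaps, n_pre):
    """Share of units (treated included) whose ratio is at least the treated one."""
    treated_ratio = rmspe_ratio(treated_gap, n_pre)
    ratios = [treated_ratio] + [rmspe_ratio(g, n_pre) for g in placebo_gaps]
    return sum(r >= treated_ratio for r in ratios) / len(ratios)


# Toy gap paths (illustrative only): two pre periods, two post periods.
treated_gap = [0.1, -0.1, 1.0, 1.0]
placebo_gaps = [
    [0.2, -0.2, 0.2, -0.2],
    [0.1, 0.1, 0.3, 0.3],
    [0.5, -0.5, 0.1, 0.1],
]
p = placebo_p_value(treated_gap, placebo_gaps, n_pre=2)
```

Each placebo gap path would come from refitting the synthetic control with a donor treated as if it were the treated unit. A perfect pre-period fit makes the ratio undefined, so a real implementation would have to guard the division.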

benchmarks/R/generate_synth_basque_golden.R

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
+#!/usr/bin/env Rscript
+# Generate the Basque Country (Abadie & Gardeazabal 2003) R `Synth` golden fixture
+# for the SyntheticControl estimator's two-tier R-parity test.
+#
+# Run from the repo root:
+#   Rscript benchmarks/R/generate_synth_basque_golden.R
+#
+# Writes (into tests/data/ so the deterministic Tier-1 parity test runs in
+# isolated-install CI without R):
+#   tests/data/synth_basque_panel.csv    verbatim Synth::basque, regions != 1
+#                                        (Spain aggregate dropped), long format,
+#                                        plus an absorbing `treated` indicator.
+#   tests/data/synth_basque_golden.json  R Synth solution.v / solution.w, losses,
+#                                        the standardization divisor, X1/X0, and
+#                                        the treated/synthetic/gap paths.
+#
+# Provenance: the panel is a verbatim export of R `Synth::basque`; the V-selection
+# numerics (standardization divisor, optimizer) are pinned from the `Synth` source,
+# not from Abadie-Diamond-Hainmueller (2010) — see docs/methodology/REGISTRY.md.
+
+suppressMessages({
+  library(Synth)
+  library(jsonlite)
+})
+
+data(basque)
+
+predictors <- c(
+  "school.illit", "school.prim", "school.med",
+  "school.high", "school.post.high", "invest"
+)
+special <- list(
+  list("gdpcap", 1960:1969, "mean"),
+  list("sec.agriculture", seq(1961, 1969, 2), "mean"),
+  list("sec.energy", seq(1961, 1969, 2), "mean"),
+  list("sec.industry", seq(1961, 1969, 2), "mean"),
+  list("sec.construction", seq(1961, 1969, 2), "mean"),
+  list("sec.services.venta", seq(1961, 1969, 2), "mean"),
+  list("sec.services.nonventa", seq(1961, 1969, 2), "mean"),
+  list("popdens", 1969, "mean")
+)
+controls <- c(2:16, 18)
+
+invisible(capture.output({
+  dp <- dataprep(
+    foo = basque,
+    predictors = predictors,
+    predictors.op = "mean",
+    time.predictors.prior = 1964:1969,
+    special.predictors = special,
+    dependent = "gdpcap",
+    unit.variable = "regionno",
+    unit.names.variable = "regionname",
+    time.variable = "year",
+    treatment.identifier = 17,
+    controls.identifier = controls,
+    time.optimize.ssr = 1960:1969,
+    time.plot = 1955:1997
+  )
+  so <- synth(dp)
+}))
+
+# Standardization divisor exactly as computed inside synth():
+# divisor <- sqrt(apply(cbind(X0, X1), 1, var))
+big <- cbind(dp$X0, dp$X1)
+divisor <- sqrt(apply(big, 1, var))
+
+pred_names <- rownames(dp$X1)
+v <- as.numeric(so$solution.v)
+w <- as.numeric(so$solution.w)
+
+# X0 as predictor -> {control -> value} so Python can verify matrix construction.
+X0_list <- setNames(
+  lapply(seq_len(nrow(dp$X0)), function(i) as.list(setNames(dp$X0[i, ], colnames(dp$X0)))),
+  pred_names
+)
+
+synthetic_path <- as.numeric(dp$Y0plot %*% so$solution.w)
+treated_path <- as.numeric(dp$Y1plot)
+years <- as.integer(rownames(dp$Y1plot))
+
+golden <- list(
+  config = list(
+    treated_regionno = 17,
+    controls = controls,
+    treatment_year = 1970,
+    predictors = predictors,
+    predictors_op = "mean",
+    predictor_window = 1964:1969,
+    special = lapply(special, function(s) {
+      list(var = s[[1]], periods = s[[2]], op = s[[3]])
+    }),
+    time_optimize_ssr = 1960:1969,
+    time_plot = c(1955, 1997)
+  ),
+  predictor_names = pred_names,
+  solution_v = setNames(v, pred_names),
+  solution_w = as.list(setNames(w, colnames(dp$X0))),
+  loss_v = as.numeric(so$loss.v),
+  loss_w = as.numeric(so$loss.w),
+  divisor = setNames(as.numeric(divisor), pred_names),
+  X1 = setNames(as.numeric(dp$X1), pred_names),
+  X0 = X0_list,
+  years = years,
+  treated_path = treated_path,
+  synthetic_path = synthetic_path,
+  gap = treated_path - synthetic_path
+)
+
+dir.create("tests/data", showWarnings = FALSE, recursive = TRUE)
+write_json(
+  golden, "tests/data/synth_basque_golden.json",
+  auto_unbox = TRUE, digits = 12, pretty = TRUE
+)
+
+# Panel CSV: drop region 1 (Spain aggregate); long format + absorbing treated.
+panel <- basque[basque$regionno != 1, ]
+panel$treated <- as.integer(panel$regionno == 17 & panel$year >= 1970)
+stopifnot(!any(is.na(panel$gdpcap))) # outcome must be complete (balanced panel)
+write.csv(panel, "tests/data/synth_basque_panel.csv", row.names = FALSE)
+
+cat("Wrote tests/data/synth_basque_golden.json and synth_basque_panel.csv\n")
+cat("nvarsV:", length(v), " n_controls:", length(w), "\n")
+cat("loss.v:", format(so$loss.v, digits = 6), " loss.w:", format(so$loss.w, digits = 6), "\n")
+nz <- setNames(round(w, 4), colnames(dp$X0))
+cat("solution.w (nonzero):\n")
+print(nz[nz > 1e-4])
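
The script above pins the standardization divisor as `sqrt(apply(cbind(X0, X1), 1, var))`, a per-row sample SD (R's `var` divides by n - 1) taken over the donor columns plus the treated column. A small Python sketch of the same computation follows, assuming predictors are held as nested lists. `standardize_rows` is a hypothetical helper and the numbers are invented, so it does not reproduce the Basque fixture.

```python
import statistics


def standardize_rows(X0, x1):
    """Divide each predictor row by its sample SD over donors plus the treated unit.

    X0 : K x J nested list (predictor rows, donor columns)
    x1 : list of K treated-unit predictor values
    statistics.stdev uses n - 1 in the denominator, like R's var().
    """
    # One divisor per predictor row, computed over J donors + 1 treated value.
    divisor = [statistics.stdev(row + [x]) for row, x in zip(X0, x1)]
    X0_std = [[val / d for val in row] for row, d in zip(X0, divisor)]
    x1_std = [x / d for x, d in zip(x1, divisor)]
    return divisor, X0_std, x1_std


# Two predictors on very different scales (illustrative numbers only).
X0 = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
x1 = [4.0, 40.0]
divisor, X0_std, x1_std = standardize_rows(X0, x1)
```

After the division both rows sit on the same scale, which keeps a single large-magnitude predictor from dominating the weighted distance. A row that is constant across all units has zero SD and would need separate handling before dividing.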

benchmarks/R/requirements.R

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ required_packages <- c(
   "DIDHAD", # de Chaisemartin et al. (2025) HAD estimator (HAD Phase 4 R-parity)
   "YatchewTest", # Yatchew (1997) linearity test (HAD yatchew R-parity)
   "nprobust", # Calonico-Cattaneo-Farrell local-linear (DIDHAD dependency)
+  "Synth", # Abadie-Diamond-Hainmueller (2010) synthetic control (SyntheticControl R-parity; ships data(basque))

   # Utilities
   "jsonlite", # JSON output for Python interop

diff_diff/__init__.py

Lines changed: 8 additions & 0 deletions
@@ -222,6 +222,11 @@
     TROPResults,
     trop,
 )
+from diff_diff.synthetic_control import (
+    SyntheticControl,
+    synthetic_control,
+)
+from diff_diff.synthetic_control_results import SyntheticControlResults
 from diff_diff.wooldridge import WooldridgeDiD
 from diff_diff.wooldridge_results import WooldridgeDiDResults
 from diff_diff.utils import (
@@ -309,6 +314,7 @@
     "SpilloverDiD",
     "TripleDifference",
     "TROP",
+    "SyntheticControl",
     "StackedDiD",
     # Estimator aliases (short names)
     "DiD",
@@ -355,6 +361,8 @@
     "StaggeredTripleDiffResults",
     "TROPResults",
     "trop",
+    "SyntheticControlResults",
+    "synthetic_control",
     "StackedDiDResults",
     "stacked_did",
     # EfficientDiD

diff_diff/estimators.py

Lines changed: 4 additions & 0 deletions
@@ -8,6 +8,7 @@
 Additional estimators are in separate modules:
 - TwoWayFixedEffects: See diff_diff.twfe
 - SyntheticDiD: See diff_diff.synthetic_did
+- SyntheticControl: See diff_diff.synthetic_control

 For backward compatibility, all estimators are re-exported from this module.
 """
@@ -2042,6 +2043,8 @@ def summary(self) -> str:
 # These can also be imported directly from their respective modules:
 # - from diff_diff.twfe import TwoWayFixedEffects
 # - from diff_diff.synthetic_did import SyntheticDiD
+# - from diff_diff.synthetic_control import SyntheticControl
+from diff_diff.synthetic_control import SyntheticControl # noqa: E402
 from diff_diff.synthetic_did import SyntheticDiD # noqa: E402
 from diff_diff.twfe import TwoWayFixedEffects # noqa: E402

@@ -2050,4 +2053,5 @@ def summary(self) -> str:
     "MultiPeriodDiD",
     "TwoWayFixedEffects",
     "SyntheticDiD",
+    "SyntheticControl",
 ]
