igerber
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 1 addition & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 1 addition & 0 deletions
diff --git a/TODO.md b/TODO.md
Lines changed: 1 addition & 0 deletions
diff --git a/benchmarks/R/generate_synth_basque_golden.R b/benchmarks/R/generate_synth_basque_golden.R
Lines changed: 127 additions & 0 deletions
diff --git a/benchmarks/R/requirements.R b/benchmarks/R/requirements.R
Lines changed: 1 addition & 0 deletions
diff --git a/diff_diff/__init__.py b/diff_diff/__init__.py
Lines changed: 8 additions & 0 deletions
diff --git a/diff_diff/estimators.py b/diff_diff/estimators.py
Lines changed: 4 additions & 0 deletions
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
 
 ### Added
+- **New estimator: `SyntheticControl` — classic Synthetic Control Method (Abadie, Diamond & Hainmueller 2010; Abadie & Gardeazabal 2003).** Standalone estimator (`diff_diff/synthetic_control.py`) + `SyntheticControlResults` (`diff_diff/synthetic_control_results.py`) + `synthetic_control()` convenience function, exported from `diff_diff`. Builds a single treated unit's counterfactual as a convex combination of never-treated donor units — **donor (unit) weights only**, no time weights or ridge, distinct from `SyntheticDiD`. The inner simplex-constrained weighted-LS solve `W*(V)` reuses `utils._sc_weight_fw` (folding `V^½` into the predictor matrix, `intercept=False`, `zeta=0`); the diagonal predictor-importance matrix `V` is selected data-driven by minimizing pre-period outcome MSPE (`v_method="nested"`, softmax-on-simplex multistart Nelder-Mead + Powell polish) or supplied by the user (`v_method="custom"`). Predictors are built from `predictors`/`predictor_window`/`predictors_op`, `special_predictors`, and per-period outcome lags (`pre_period_outcomes`), in the R `Synth::dataprep` row order; per-row standardization (SD over donors+treated, ddof=1) matches the R `Synth::synth` source. Reports the gap path (`α̂_1t = Y_1t − Σ_j w_j Y_jt`), `att` (mean post-period gap), `pre_rmspe`, donor weights, `v_weights`, and a predictor-balance table. **No analytical standard error** — `se`/`t_stat`/`p_value`/`conf_int` are NaN (in-space placebo permutation inference with the post/pre RMSPE-ratio statistic is planned for a follow-up release; `_placebo_gaps`/`_rmspe_ratio`/`_fit_snapshot` are reserved on the results object). 
Ten validation gates baked in: predictor-period leakage, absorbing post-period suffix + no-anticipation cross-check against the treatment column, post-period canonicalization, donor-pool filtering before period derivation, empty-window rejection, poor-pre-fit `UserWarning` (RMSPE > SD of treated pre-outcomes), duplicate-predictor-label rejection, inner-solve non-convergence warning, order-independent gap-path rebuild, and the `standardize="none"` deviation; plus fail-closed `custom_v` cross-field rules and degenerate single-donor / single-pre-period handling. **R-`Synth` parity** (`tests/test_methodology_synthetic_control.py`, fixtures generated by `benchmarks/R/generate_synth_basque_golden.R` into `tests/data/`): two-tier on the Basque Country study — Tier-1 feeds R's `solution.v` via `custom_v` and reproduces the published donor weights (region 10 Cataluña 0.851 + region 14 Madrid 0.149) to `atol=1e-3` deterministically; Tier-2 (`@pytest.mark.slow`) checks the data-driven nested fit lands in a tolerance band (the nested `V` legitimately differs because the outer objective uses all pre periods, not R's `time.optimize.ssr` window). Documented in `docs/methodology/REGISTRY.md` §SyntheticControl (with `**Deviation from R:** standardize="none"` and `**Note:**` labels for the standardization formula, objective window, softmax `V` parametrization, and 1×SD poor-fit threshold), `docs/api/synthetic_control.rst`, the LLM guides, and `README.md`.
 - **ConleySpatialHAC methodology-review-tracker promotion: In Progress → Complete.** Closes the Conley (1999) *Journal of Econometrics* 92(1) primary-source review on the methodology-review tracker. The paper review on file at `docs/methodology/papers/conley-1999-review.md` was previously merged (2026-05-09); this PR is the F.L.I.P. consolidation — new `tests/test_methodology_conley.py` with paper-equation-numbered Verified Components walk-through (~1600 LoC; 10 classes; 60 tests, 5 of them `@pytest.mark.slow`). Coverage: Eq. 4.2 cross-sectional sandwich (pairwise-distance specialization; the project's paper review identifies Eq. 4.2 page 18 as the real-valued/pairwise form, with Eq. 3.13 reserved for the lattice-indexed form), Eq. 4.2 HC0 + rank-1 limits, Andrews (1991) HAC lag truncation matching `conleyreg::time_dist.cpp`, haversine convention with Earth radius 6371.01 km, Phase 2 panel block-decomposed sandwich at `atol=1e-12`, sparse k-d-tree dense-vs-sparse bit-identity (Wave A #120 numerical correctness), and R `conleyreg` v0.1.9 parity at `atol=1e-6` on 6 fixtures (3 cross-sectional + 3 panel) plus the sparse-forced and time-asymmetric kernel parity contracts. Three dedicated deviations-area classes: `TestConleyLibraryExtensions` (Wave A library extensions — combined spatial+cluster product kernel #119, callable conley_metric validation #123, sparse k-d-tree activation #120, indefiniteness guard), `TestConleyDeviationsFromR` (1-D radial Bartlett vs paper's 2-D separable Eq. 3.14, time-label normalization via `np.unique`, independent temporal kernel deferred), and `TestConleyDeferrals` (5 fail-closed `NotImplementedError`/`TypeError` contracts: LinearRegression + survey_design, DiD/MPD/TWFE + survey_design, Conley + weights, SyntheticDiD + Conley, wild_bootstrap + Conley). 
Methodology-anchored tests extracted from `tests/test_conley_vcov.py`: full classes `TestConleyDirectHelper`, `TestConleyReductions`, `TestConleyReductionsAddendum`, `TestConleyParityR`, `TestConleyParitySpacetime`, `TestConleyPanelHelper`, `TestConleySparseRParityForced`; plus methodology-anchored tests from `TestConleyKernels`, `TestConleyDistanceMetrics`, `TestConleySparse`. File drops 4248 → 3113 lines after extraction. Defensive surface preserved: input validation, NaN/inf guards, dispatch-level validity, estimator-level integration smoke tests, set_params atomicity, sparse-path activation thresholds + density-gate fallback. `METHODOLOGY_REVIEW.md` row L91 promoted to **Complete** with `Last Review = 2026-05-26`; detail block rewritten with Verified Components / Test Coverage / R Comparison Results inline table / Corrections Made / Deviations / Outstanding Concerns. Priority queue at L1386 pruned: PreTrendsPower removed (already Complete since 2026-05-19) and ConleySpatialHAC removed (this PR); substantive-review-blocked renumbered #2-#5 → #1-#4 and consolidation-pass-blocked renumbered #6-#8 → #5-#6.
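
The `SyntheticControl` entry above describes an inner simplex-constrained weighted least-squares solve for the donor weights, followed by the gap path, `att`, and `pre_rmspe`. A minimal NumPy sketch of those quantities on toy data follows. It is illustrative only: `project_to_simplex` and `donor_weights` are hypothetical names, the data are made up, and the solver is a plain projected-gradient loop swapped in for the library's `utils._sc_weight_fw`, whose internals are not shown in this diff.

```python
import numpy as np


def project_to_simplex(u):
    """Euclidean projection of u onto {w : w >= 0, sum(w) = 1}."""
    s = np.sort(u)[::-1]                      # sort descending
    css = np.cumsum(s) - 1.0
    k = np.arange(1, u.size + 1)
    rho = np.nonzero(s - css / k > 0)[0][-1]  # last index with a positive margin
    theta = css[rho] / (rho + 1)
    return np.maximum(u - theta, 0.0)


def donor_weights(x1, X0, v, n_iter=2000):
    """Minimize (x1 - X0 w)' diag(v) (x1 - X0 w) over the simplex.

    Hypothetical helper: projected gradient descent, not the library's solver.
    """
    root_v = np.sqrt(v)
    A = root_v[:, None] * X0                  # fold V^(1/2) into the predictor matrix
    b = root_v * x1
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    w = np.full(X0.shape[1], 1.0 / X0.shape[1])     # start from uniform weights
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ w - b)
        w = project_to_simplex(w - step * grad)
    return w


# Toy predictors: 4 predictor rows x 3 donors; the treated unit is an exact
# convex combination of donors 0 and 1 (made-up numbers, not the Basque data).
X0 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0],
               [1.0, 1.0, 1.0]])
w_true = np.array([0.7, 0.3, 0.0])
x1 = X0 @ w_true
v = np.array([0.4, 0.3, 0.2, 0.1])            # diagonal of V, supplied by hand here
w = donor_weights(x1, X0, v)

# Toy outcomes: 6 periods, the first 4 are pre-treatment; a constant effect
# of 2.0 is added to the treated unit in the post periods.
Y0 = np.arange(18, dtype=float).reshape(6, 3)
n_pre = 4
y1 = Y0 @ w_true
y1[n_pre:] += 2.0

gap = y1 - Y0 @ w                             # gap path: Y_1t - sum_j w_j Y_jt
att = gap[n_pre:].mean()                      # mean post-period gap
pre_rmspe = np.sqrt(np.mean(gap[:n_pre] ** 2))
```

Because the toy treated unit lies inside the donors' convex hull, the recovered weights match `w_true` and the pre-period fit is essentially exact; on real data neither holds in general, which is why the entry reports `pre_rmspe` and warns on poor pre-fit.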
 
 ### Added / Changed
 
@@ -108,6 +108,7 @@ Full guide: `diff_diff.get_llm_guide("practitioner")`.
 - [TwoStageDiD](https://diff-diff.readthedocs.io/en/stable/api/two_stage.html) - Gardner (2022) two-stage estimator with GMM sandwich variance
 - [SpilloverDiD](https://diff-diff.readthedocs.io/en/stable/api/spillover.html) - Butts (2021) ring-indicator spillover-aware DiD identifying direct effect on treated + per-ring spillover on near-control units; handles non-staggered and staggered timing; supports survey-design variance under `survey_design=` for HC1 / CR1 (Wave E.1 Binder TSL) and Conley (Wave E.2 panel-aware stratified-Conley sandwich on per-period PSU totals; extended in Wave E.2 follow-up to `conley_lag_cutoff > 0` via panel-block composition with within-PSU serial Bartlett HAC — `lag>0` requires an effective PSU via explicit `survey_design.psu` or injected `cluster=<col>`); `SurveyDesign.subpopulation()` preserves full-design `n_psu` / `df_survey` via zero-padded scores (Wave E.3, R `svyrecvar(subset())` form)
 - [SyntheticDiD](https://diff-diff.readthedocs.io/en/stable/api/estimators.html) - Synthetic DiD combining standard DiD and synthetic control for few treated units
+- [SyntheticControl](https://diff-diff.readthedocs.io/en/stable/api/synthetic_control.html) - Abadie, Diamond & Hainmueller (2010) classic synthetic control for a single treated unit (donor-weight counterfactual, nested/custom V; no inference in this release — permutation/placebo planned)
 - [TripleDifference](https://diff-diff.readthedocs.io/en/stable/api/triple_diff.html) - triple difference (DDD) estimator for designs requiring two criteria for treatment eligibility
 - [ContinuousDiD](https://diff-diff.readthedocs.io/en/stable/api/continuous_did.html) - Callaway, Goodman-Bacon & Sant'Anna (2024) continuous treatment DiD with dose-response curves
 - [HeterogeneousAdoptionDiD](https://diff-diff.readthedocs.io/en/stable/api/had.html) - de Chaisemartin, Ciccia, D'Haultfœuille & Knau (2026) for designs where **no unit remains untreated**; local-linear estimator at the dose support boundary returning Weighted Average Slope (WAS) on Design 1' (`d̲ = 0` / QUG) or `WAS_{d̲}` on Design 1 (`d̲ > 0`, continuous-near-d̲ or mass-point), with a multi-period event-study extension (last-treatment cohort, pointwise CIs). **Panel-only** in this release - repeated cross-sections rejected by the validator. Alias `HAD`.
 
@@ -84,6 +84,7 @@ Deferred items from PR reviews that were not addressed before merge.
 | ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) |
 | Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |
 | Survey design resolution/collapse patterns are inconsistent across panel estimators — ContinuousDiD rebuilds unit-level design in SE code, EfficientDiD builds once in fit(), StackedDiD re-resolves on stacked data; extract shared helpers for panel-to-unit collapse, post-filter re-resolution, and metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
+| SyntheticControl: `SyntheticControlResults` not wired into the practitioner / DiagnosticReport / BusinessReport routing, so routing SCM results through those tools yields generic parallel-trends/HonestDiD guidance that doesn't fit SCM. Add SCM to the native-routed rejection sets (mirror SDiD/TROP) and surface SCM-native diagnostics (pre-fit / in-space placebo / in-time placebo / leave-one-out). Deferred to PR-2, where it pairs with the placebo-inference layer those reports would surface. | `practitioner.py`, `diagnostic_report.py`, `business_report.py` | SCM PR-1 → PR-2 | Medium |
 | ContinuousDiD deferred CGBS 2024 extensions: (a) `covariates=` kwarg not implemented (matches R `contdid` v0.1.0); (b) discrete-treatment saturated regression deferred (integer-valued dose currently warned, not routed to per-level coefficients); (c) lowest-dose-as-control per CGBS 2024 Remark 3.1 (when `P(D=0) = 0`) not implemented — estimator requires never-treated controls. REGISTRY `## ContinuousDiD` → Implementation Checklist marks these as deferred `[ ]` items. | `diff_diff/continuous_did.py` | — | Low |
 | Survey-weighted Silverman bandwidth in EfficientDiD conditional Omega* — `_silverman_bandwidth()` uses unweighted mean/std for bandwidth selection; survey-weighted statistics would better reflect the population distribution but is a second-order refinement | `efficient_did_covariates.py` | — | Low |
 | TROP: extend Wave 4's `_setup_trop_data` helper to also cover the duplicated bootstrap resampling loop in `_bootstrap_variance` / `_bootstrap_variance_global` (~40 LoC dedup; mirrors the data-setup helper pattern with a `fit_callable` parameter for the per-draw refit step). | `trop_local.py`, `trop_global.py` | follow-up | Low |
 
@@ -0,0 +1,127 @@
+#!/usr/bin/env Rscript
+# Generate the Basque Country (Abadie & Gardeazabal 2003) R `Synth` golden fixture
+# for the SyntheticControl estimator's two-tier R-parity test.
+#
+# Run from the repo root:
+#   Rscript benchmarks/R/generate_synth_basque_golden.R
+#
+# Writes (into tests/data/ so the deterministic Tier-1 parity test runs in
+# isolated-install CI without R):
+#   tests/data/synth_basque_panel.csv   verbatim Synth::basque, regions != 1
+#                                        (Spain aggregate dropped), long format,
+#                                        plus an absorbing `treated` indicator.
+#   tests/data/synth_basque_golden.json  R Synth solution.v / solution.w, losses,
+#                                        the standardization divisor, X1/X0, and
+#                                        the treated/synthetic/gap paths.
+#
+# Provenance: the panel is a verbatim export of R `Synth::basque`; the V-selection
+# numerics (standardization divisor, optimizer) are pinned from the `Synth` source,
+# not from Abadie-Diamond-Hainmueller (2010) — see docs/methodology/REGISTRY.md.
+
+suppressMessages({
+  library(Synth)
+  library(jsonlite)
+})
+
+data(basque)
+
+predictors <- c(
+  "school.illit", "school.prim", "school.med",
+  "school.high", "school.post.high", "invest"
+)
+special <- list(
+  list("gdpcap", 1960:1969, "mean"),
+  list("sec.agriculture", seq(1961, 1969, 2), "mean"),
+  list("sec.energy", seq(1961, 1969, 2), "mean"),
+  list("sec.industry", seq(1961, 1969, 2), "mean"),
+  list("sec.construction", seq(1961, 1969, 2), "mean"),
+  list("sec.services.venta", seq(1961, 1969, 2), "mean"),
+  list("sec.services.nonventa", seq(1961, 1969, 2), "mean"),
+  list("popdens", 1969, "mean")
+)
+controls <- c(2:16, 18)
+
+invisible(capture.output({
+  dp <- dataprep(
+    foo = basque,
+    predictors = predictors,
+    predictors.op = "mean",
+    time.predictors.prior = 1964:1969,
+    special.predictors = special,
+    dependent = "gdpcap",
+    unit.variable = "regionno",
+    unit.names.variable = "regionname",
+    time.variable = "year",
+    treatment.identifier = 17,
+    controls.identifier = controls,
+    time.optimize.ssr = 1960:1969,
+    time.plot = 1955:1997
+  )
+  so <- synth(dp)
+}))
+
+# Standardization divisor exactly as computed inside synth():
+#   divisor <- sqrt(apply(cbind(X0, X1), 1, var))
+big <- cbind(dp$X0, dp$X1)
+divisor <- sqrt(apply(big, 1, var))
+
+pred_names <- rownames(dp$X1)
+v <- as.numeric(so$solution.v)
+w <- as.numeric(so$solution.w)
+
+# X0 as predictor -> {control -> value} so Python can verify matrix construction.
+X0_list <- setNames(
+  lapply(seq_len(nrow(dp$X0)), function(i) as.list(setNames(dp$X0[i, ], colnames(dp$X0)))),
+  pred_names
+)
+
+synthetic_path <- as.numeric(dp$Y0plot %*% so$solution.w)
+treated_path <- as.numeric(dp$Y1plot)
+years <- as.integer(rownames(dp$Y1plot))
+
+golden <- list(
+  config = list(
+    treated_regionno = 17,
+    controls = controls,
+    treatment_year = 1970,
+    predictors = predictors,
+    predictors_op = "mean",
+    predictor_window = 1964:1969,
+    special = lapply(special, function(s) {
+      list(var = s[[1]], periods = s[[2]], op = s[[3]])
+    }),
+    time_optimize_ssr = 1960:1969,
+    time_plot = c(1955, 1997)
+  ),
+  predictor_names = pred_names,
+  solution_v = setNames(v, pred_names),
+  solution_w = as.list(setNames(w, colnames(dp$X0))),
+  loss_v = as.numeric(so$loss.v),
+  loss_w = as.numeric(so$loss.w),
+  divisor = setNames(as.numeric(divisor), pred_names),
+  X1 = setNames(as.numeric(dp$X1), pred_names),
+  X0 = X0_list,
+  years = years,
+  treated_path = treated_path,
+  synthetic_path = synthetic_path,
+  gap = treated_path - synthetic_path
+)
+
+dir.create("tests/data", showWarnings = FALSE, recursive = TRUE)
+write_json(
+  golden, "tests/data/synth_basque_golden.json",
+  auto_unbox = TRUE, digits = 12, pretty = TRUE
+)
+
+# Panel CSV: drop region 1 (Spain aggregate); long format + absorbing treated.
+panel <- basque[basque$regionno != 1, ]
+panel$treated <- as.integer(panel$regionno == 17 & panel$year >= 1970)
+stopifnot(!any(is.na(panel$gdpcap)))  # outcome must be complete (balanced panel)
+write.csv(panel, "tests/data/synth_basque_panel.csv", row.names = FALSE)
+
+cat("Wrote tests/data/synth_basque_golden.json and synth_basque_panel.csv\n")
+cat("nvarsV:", length(v), "  n_controls:", length(w), "\n")
+cat("loss.v:", format(so$loss.v, digits = 6), "  loss.w:", format(so$loss.w, digits = 6), "\n")
+nz <- setNames(round(w, 4), colnames(dp$X0))
+cat("solution.w (nonzero):\n")
+print(nz[nz > 1e-4])
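
On the Python side, the standardization divisor that the script pins from `synth()` (`sqrt(apply(cbind(X0, X1), 1, var))`) is a per-row sample standard deviation over donors plus the treated unit. A minimal sketch of rebuilding a predictor matrix from a nested predictor -> {control -> value} mapping and computing that divisor could look like the following. The dictionary below is a made-up stand-in rather than the real fixture, and the variable names are assumptions for illustration.

```python
import numpy as np

# Toy stand-in for the nested X0 mapping: predictor -> {control -> value}.
X0_nested = {
    "pred_a": {"2": 1.0, "3": 2.0, "4": 3.0},
    "pred_b": {"2": 10.0, "3": 20.0, "4": 30.0},
}
predictor_names = ["pred_a", "pred_b"]        # fixes the row order
controls = ["2", "3", "4"]                    # fixes the column order
x1 = np.array([4.0, 40.0])                    # treated-unit values, same row order

# Rebuild the predictors x donors matrix in a deterministic order.
X0 = np.array([[X0_nested[p][c] for c in controls] for p in predictor_names])

# R's var() divides by n - 1, so ddof=1 reproduces sqrt(apply(big, 1, var)).
big = np.column_stack([X0, x1])
divisor = big.std(axis=1, ddof=1)

# Row-wise standardization applied to donors and treated unit alike.
X0_scaled = X0 / divisor[:, None]
x1_scaled = x1 / divisor
```

Indexing by explicit `predictor_names` and `controls` lists keeps the matrix layout independent of dictionary ordering, which matters when comparing against values exported in R's row order.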
@@ -17,6 +17,7 @@ required_packages <- c(
   "DIDHAD",        # de Chaisemartin et al. (2025) HAD estimator (HAD Phase 4 R-parity)
   "YatchewTest",   # Yatchew (1997) linearity test (HAD yatchew R-parity)
   "nprobust",      # Calonico-Cattaneo-Farrell local-linear (DIDHAD dependency)
+  "Synth",         # Abadie-Diamond-Hainmueller (2010) synthetic control (SyntheticControl R-parity; ships data(basque))
 
   # Utilities
   "jsonlite",      # JSON output for Python interop
 
@@ -222,6 +222,11 @@
     TROPResults,
     trop,
 )
+from diff_diff.synthetic_control import (
+    SyntheticControl,
+    synthetic_control,
+)
+from diff_diff.synthetic_control_results import SyntheticControlResults
 from diff_diff.wooldridge import WooldridgeDiD
 from diff_diff.wooldridge_results import WooldridgeDiDResults
 from diff_diff.utils import (
@@ -309,6 +314,7 @@
     "SpilloverDiD",
     "TripleDifference",
     "TROP",
+    "SyntheticControl",
     "StackedDiD",
     # Estimator aliases (short names)
     "DiD",
@@ -355,6 +361,8 @@
     "StaggeredTripleDiffResults",
     "TROPResults",
     "trop",
+    "SyntheticControlResults",
+    "synthetic_control",
     "StackedDiDResults",
     "stacked_did",
     # EfficientDiD
 
@@ -8,6 +8,7 @@
 Additional estimators are in separate modules:
 - TwoWayFixedEffects: See diff_diff.twfe
 - SyntheticDiD: See diff_diff.synthetic_did
+- SyntheticControl: See diff_diff.synthetic_control
 
 For backward compatibility, all estimators are re-exported from this module.
 """
@@ -2042,6 +2043,8 @@ def summary(self) -> str:
 # These can also be imported directly from their respective modules:
 # - from diff_diff.twfe import TwoWayFixedEffects
 # - from diff_diff.synthetic_did import SyntheticDiD
+# - from diff_diff.synthetic_control import SyntheticControl
+from diff_diff.synthetic_control import SyntheticControl  # noqa: E402
 from diff_diff.synthetic_did import SyntheticDiD  # noqa: E402
 from diff_diff.twfe import TwoWayFixedEffects  # noqa: E402
 
@@ -2050,4 +2053,5 @@ def summary(self) -> str:
     "MultiPeriodDiD",
     "TwoWayFixedEffects",
     "SyntheticDiD",
+    "SyntheticControl",
 ]