synthetic-control: address CI codex R2 — poor-fit warning fires on flat pre-path (P2)

igerber · claude · igerber · commit 95cd4f9efa00 · 2026-05-30T18:10:59.000-04:00
The poor-fit warning was gated by `pre_sd > 0`, so a FLAT treated pre-period path
(SD == 0) with nonzero pre-RMSPE never warned even though the synthetic clearly fails
to reproduce a constant series. Change the gate to the literal REGISTRY contract
(warn when pre_rmspe > pre_sd), including the SD == 0 case, with a scale-aware absolute
floor (1e-8 * max(|Z1|, 1)) so a near-perfect flat fit (RMSPE ~ roundoff) does not
spuriously warn. REGISTRY poor-fit Note updated to document the flat-path behavior
(slightly broader than SyntheticDiD's SD>0-gated form). Regression:
test_poor_fit_warning_flat_treated_pre_path.
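
For reviewers, the new gate can be sketched in isolation as below. This is a
minimal standalone illustration, not the code in `fit()`: the helper name
`should_warn_poor_fit` and its argument layout are assumptions made for this
example, while the `pre_sd` and tolerance expressions follow the diff.

```python
import numpy as np


def should_warn_poor_fit(Z1, pre_gaps):
    """Hypothetical helper mirroring the gate in this commit.

    Z1       -- treated unit's pre-period outcomes
    pre_gaps -- treated minus synthetic, over the pre period
    """
    Z1 = np.asarray(Z1, dtype=float)
    pre_gaps = np.asarray(pre_gaps, dtype=float)

    pre_rmspe = float(np.sqrt(np.mean(pre_gaps**2)))
    # Sample SD (ddof=1); defined as 0.0 when there is a single pre period.
    pre_sd = float(np.std(Z1, ddof=1)) if Z1.size > 1 else 0.0
    # Scale-aware absolute floor: suppresses roundoff-level RMSPE on a flat path.
    fit_tol = 1e-8 * max(float(np.max(np.abs(Z1))) if Z1.size else 0.0, 1.0)

    # Literal contract (RMSPE exceeds SD), plus the roundoff floor.
    return pre_rmspe > pre_sd and pre_rmspe > fit_tol


flat = [5.0] * 7
# Flat path, synthetic stuck near 10: the old `pre_sd > 0` gate stayed silent here.
print(should_warn_poor_fit(flat, [-5.0] * 7))
# Flat path reproduced up to roundoff: the floor keeps this quiet.
print(should_warn_poor_fit(flat, [1e-12] * 7))
```

The floor scales with max(|Z1|, 1), so the same relative roundoff is tolerated
whether the outcome is measured in units or in millions.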

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
diff --git a/diff_diff/synthetic_control.py b/diff_diff/synthetic_control.py
@@ -424,9 +424,14 @@ def fit(
         pre_rmspe = float(np.sqrt(np.mean(pre_gaps**2)))
         att = float(np.mean(post_gaps))
 
-        # Poor-fit warning (REGISTRY contract; mirrors synthetic_did.py).
+        # Poor-fit warning (REGISTRY contract: warn when pre-RMSPE exceeds the SD of
+        # the treated unit's pre-period outcomes). This includes a FLAT treated pre-path
+        # (pre_sd == 0): any non-trivial RMSPE then means the synthetic cannot reproduce
+        # a constant series. A scale-aware absolute floor (`_fit_tol`) guards against a
+        # spurious warning on a near-perfect flat fit (RMSPE ~ roundoff).
         pre_sd = float(np.std(Z1, ddof=1)) if Z1.size > 1 else 0.0
-        if pre_sd > 0 and pre_rmspe > pre_sd:
+        _fit_tol = 1e-8 * max(float(np.max(np.abs(Z1))) if Z1.size else 0.0, 1.0)
+        if pre_rmspe > pre_sd and pre_rmspe > _fit_tol:
             warnings.warn(
                 f"Pre-treatment fit is poor: RMSPE ({pre_rmspe:.4f}) exceeds the "
                 f"standard deviation of treated pre-treatment outcomes ({pre_sd:.4f}). "
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -1979,7 +1979,7 @@ Classic synthetic control (donor/unit weights only) for a single treated unit, d
 - **Note:** The standardization divisor `divisor = sqrt(apply(cbind(X0,X1), 1, var))` (per-predictor SD over donors+treated, ddof=1) and the inner/outer optimizer are **not specified in ADH 2010** (which defers these numerics to Abadie & Gardeazabal 2003 App. B / the `Synth` software). The divisor is pinned from the R `Synth::synth` source; `solution.v` lives in this scaled predictor space, so the deterministic R-parity test feeds `custom_v` in the same scaled space.
 - **Note:** The outer objective minimizes the pre-period outcome MSPE over **all** pre periods, whereas R `Synth` uses a `time.optimize.ssr` window (1960–1969 in the Basque example). The nested `V` therefore differs from R by an efficiency-only choice (the paper notes inferential validity holds for *any* `V`), so end-to-end nested parity is a tolerance band, not equality.
 - **Note:** `V` is parametrized on the unit simplex via a softmax of an unconstrained vector (trace-normalization is identification-fixing, not a constraint loss); the multistart Nelder-Mead + derivative-free Powell polish approximates R's best-of-`optimx` behavior over the non-smooth outer objective.
-- **Note:** The 1×SD poor-fit threshold is a defensive implementation choice matching the `SyntheticDiD` convention; ADH 2010 gives only the qualitative guidance "do not use SCM when the fit is poor" (no numeric cutoff).
+- **Note:** The 1×SD poor-fit threshold is a defensive implementation choice in the spirit of the `SyntheticDiD` convention; ADH 2010 gives only the qualitative guidance "do not use SCM when the fit is poor" (no numeric cutoff). The warning fires whenever pre-period RMSPE exceeds the SD of the treated unit's pre-period outcomes — **including a flat treated pre-path** (`SD = 0`) with non-trivial RMSPE (a scale-aware roundoff floor suppresses the warning on a near-perfect flat fit). This is slightly broader than `SyntheticDiD`'s `SD > 0`-gated form, matching the literal RMSPE-exceeds-SD contract above.
 - **Deviation from R:** `standardize="none"` disables predictor standardization entirely; R `Synth` always scales by the predictor SD. Provided for diagnostics; changes the geometry of the `V` objective.
 - **Note:** predictor rows support only **equal-weight** linear combinations of pre-period values — `mean` (`k_s = 1/T0`), `sum` (`k_s = 1`), and per-period outcome lags (identity, a single `k_s = 1`). ADH (2010) §2.3 defines the general form `Ȳ_i^{K_m} = Σ_s k_s Y_is` with *arbitrary* weights `k_s`; this release does NOT accept user-supplied non-uniform `K_m` weight vectors (and `median` and other non-linear aggregations are intentionally excluded). The supported set still spans the standard `Synth::dataprep` `predictors.op` + `special.predictors` usage; arbitrary-weight `K_m` is a deferred extension.
 - **Deviation from R:** predictor/outcome **aggregation fails closed on any non-finite (NaN/inf) cell**, whereas R `Synth::dataprep` hardwires `na.rm=TRUE` (aggregating over the observed cells of a partially-missing window). The fail-closed contract is deliberate: na-dropping silently aggregates different period subsets across units, yielding incomparable predictors with no warning. The analyst must restrict `predictor_window` / `special_predictors` / `pre_period_outcomes` periods (and the outcome panel) to where each variable is observed; both partially- and fully-missing windows raise `ValueError`. Only the row *ordering* matches `dataprep`, not the missing-data handling.
diff --git a/tests/test_methodology_synthetic_control.py b/tests/test_methodology_synthetic_control.py
@@ -416,6 +416,25 @@ def test_poor_fit_warning():
         synthetic_control(df, "y", "treated", "unit", "year", seed=0)
 
 
+def test_poor_fit_warning_flat_treated_pre_path():
+    # Flat treated pre-path (SD == 0) that donors near 10 cannot reproduce: RMSPE > 0
+    # must still warn (the former `pre_sd > 0` gate suppressed this case).
+    rng = np.random.default_rng(2)
+    years = list(range(2000, 2010))
+    T0 = 7
+    rows = []
+    for j in range(4):
+        for yr in years:
+            rows.append({"unit": f"d{j}", "year": yr, "y": 10 + rng.normal(0, 0.1), "treated": 0})
+    for i, yr in enumerate(years):
+        rows.append(
+            {"unit": "treated", "year": yr, "y": (5.0 if i < T0 else 8.0), "treated": int(i >= T0)}
+        )
+    df = pd.DataFrame(rows)
+    with pytest.warns(UserWarning, match="Pre-treatment fit is poor"):
+        synthetic_control(df, "y", "treated", "unit", "year", seed=0)
+
+
 # ---------------------------------------------------------------------------
 # Validation 7: duplicate predictor labels rejected
 # ---------------------------------------------------------------------------