From b04bf2bf73f907657db2125f5e2a7dca15a18c86 Mon Sep 17 00:00:00 2001
From: igerber <isaac.gerber@gmail.com>
Date: Mon, 1 Jun 2026 19:49:33 -0400
Subject: [PATCH] HAD fit(): extensive-margin warning + covariates=
 NotImplementedError

Two fit-time UX additions to HeterogeneousAdoptionDiD.fit() with NO change to
any estimate or standard error (TODO L73 + L74):

- Overall path emits a UserWarning when >=10% of units have an exactly-zero
  post-period dose (library-convention cutoff _HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC).
  A substantial untreated mass suggests a genuine extensive margin where a
  standard DiD may be preferable (de Chaisemartin et al. 2026, Section 2 /
  Assumption 3). The paper retains small untreated shares (Garrett et al.
  12/2954 ~ 0.4%), so the cutoff sits ~25x above that. Overall-path-only: the
  event-study path REQUIRES never-treated units per Appendix B.2. Closes
  paper-review checklist L191.

- fit(covariates=...) raises NotImplementedError via an explicit keyword-only
  param, pointing to the deferred Appendix B.1 / Theorem 6 covariate-adjusted
  extension, instead of a bare TypeError from an unknown kwarg.

REGISTRY HeterogeneousAdoptionDiD documents both as library Notes + ticks the
covariates implementation-checklist item; the new covariates param is
propagated to the llms-full.txt guide signature block (pinned by
tests/test_guides.py). New behavioral tests in test_had.py + deviation locks
in TestHADDeviations. Retires the two TODO rows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
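Reviewer note (below the `---`, so not part of the commit message): a minimal
sketch of the fraction-only rule described above, assuming a plain Python list
in place of the library's per-unit dose array. `warn_if_extensive_margin` and
`ZERO_DOSE_FRAC_CUTOFF` are hypothetical names used only for illustration; the
0.10 value mirrors `_HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC` from this patch.

```python
import warnings

# Hypothetical stand-in for the library constant; 0.10 is the cutoff in this patch.
ZERO_DOSE_FRAC_CUTOFF = 0.10


def warn_if_extensive_margin(doses):
    """Return the exactly-zero dose fraction, warning at or above the cutoff."""
    n_units = len(doses)
    # Exact comparison with 0.0: only genuinely untreated units are counted.
    n_zero = sum(1 for d in doses if d == 0.0)
    frac_zero = n_zero / n_units if n_units else 0.0
    # Fraction-only rule (no absolute floor on the number of zero-dose units).
    if n_zero and frac_zero >= ZERO_DOSE_FRAC_CUTOFF:
        warnings.warn(
            f"{n_zero}/{n_units} units ({frac_zero:.0%}) have exactly-zero "
            "post-period dose; a standard DiD may be more appropriate.",
            UserWarning,
            stacklevel=2,
        )
    return frac_zero
```

Under this sketch the paper's 12/2954 share (about 0.4%) stays silent, while one
zero-dose unit out of ten sits exactly at the cutoff and warns.
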
 CHANGELOG.md                                  |   1 +
 TODO.md                                       |   2 -
 diff_diff/guides/llms-full.txt                |   1 +
 diff_diff/had.py                              |  77 ++++++++++
 docs/methodology/REGISTRY.md                  |   4 +-
 .../papers/dechaisemartin-2026-review.md      |   2 +-
 tests/test_had.py                             | 133 ++++++++++++++++++
 tests/test_methodology_had.py                 |  64 +++++++++
 8 files changed, 280 insertions(+), 4 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 09f669db..27c7aeb9 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Added
 - **`SyntheticControl` cross-validation + inverse-variance `V`-selection (ADH 2015 §; Abadie 2021 §3.2(a), Eq. 9).** Two new `v_method` values complete the ADH-2015/Abadie-2021 `V`-selection menu (joining `"nested"` / `"custom"`), each threaded through the in-space / leave-one-out / in-time placebo refits so a diagnostic uses the **same** estimator as the headline fit. **`v_method="cv"`** selects the diagonal predictor-importance `V` by out-of-sample cross-validation: the pre-period is split positionally at `v_cv_t0` (new constructor param; default `len(pre)//2`, Abadie 2021's `t0 = T0/2`) into a training and a validation window, `V` is chosen to minimize the validation-window outcome MSPE of the training-fit weights (`mspe_v` now reports this validation MSPE under cv), and the final reported weights are re-estimated on the validation-window predictors (ADH 2015 step 4). Each predictor spec is **re-aggregated** over each window (its mean/sum/identity recomputed over only the periods that fall in that window — a separate `dataprep` per window, exactly as ADH 2015's CV does, since R `Synth` has no built-in CV function), so the V-search is genuinely out-of-sample for every predictor type and the same `V*` drives both fits with no zeroed coordinate (`v_weights` reproduce `donor_weights` on the validation-window predictors, and `predictor_balance` is reported on that validation-window basis). **Fully-spanning precondition (fail-closed):** re-aggregating a predictor on each window requires it to be observed in **both** windows, so `cv` **requires every predictor to span both the training and validation windows** and raises `ValueError` otherwise — satisfied by ADH 2015's shared covariate / multi-period `special_predictors` (which span the windows) but NOT by the default per-period outcome lags (each is single-period and lives in one window only), so `cv` with the bare default predictors is rejected with guidance to pass spanning predictors. In-time-placebo truncation that breaks the fully-spanning precondition (a kept spec stops spanning both windows at the truncated split) marks that date `infeasible`. A second fail-closed gate covers windows that span but carry **no cross-donor variation** (every re-aggregated predictor constant across the donors, so `X0·W` is constant in `W` → a flat, unidentified weight solve that would otherwise return arbitrary "converged" weights — even when the treated unit differs, since donor distinguishability, not treated-vs-donor variation, identifies `W`): the headline fit raises `ValueError`, in-space placebo refits whose donor pool is indistinguishable in a window are dropped from the reference set, and such in-time-truncated dates are marked `infeasible`. Abadie 2021 footnote 7's CV non-uniqueness is handled by a **deterministic tie-break** (prefer the `V` closest to uniform among ties), making the selected `V*` among equally-good optima independent of the multistart evaluation order. The cv fit is reproducible for a fixed `seed` (like `nested`) but is not seed-independent — the multistart fills any slots beyond the distinct heuristic starts with seed-dependent random Dirichlet draws, so the tie-break removes start-order dependence among ties, not seed dependence. The tie-break is convergence-aware (a non-converged optimizer candidate cannot displace a converged incumbent on an objective tie). If the training-window solve that defines `mspe_v` truncates (e.g. `inner_max_iter` too small), the fit fails closed — `mspe_v=NaN` and the fit is marked non-converged — rather than reporting an invalid Eq. 9 criterion. **`v_method="inverse_variance"`** uses the closed form `v_h = 1/Var(X_h)` (variance over donors+treated on the unstandardized predictors), applied to the **raw** predictors so the effective objective is the unit-variance-rescaled `Σ_h diff_h²/Var_h` (Abadie 2021 §3.2(a)); the `standardize` pre-scaling is intentionally bypassed on this branch (inverse-variance weighting *is* the unit-variance rescaling — applying it on already-standardized rows would double-rescale to `Σ_h diff_h²/Var_h²`), so it is equivalent to uniform `V` on standardized predictors. No search (`mspe_v=None`); a zero-variance row gets 0 weight and an all-zero-variance panel falls back to uniform `V` with a warning. `custom_v` is rejected (fail-closed) for both methods and `v_cv_t0` is rejected unless `v_method="cv"`. On the degenerate **single-donor** path (`J=1` forces `w=[1]`) `V` is unidentified — every `V` yields the same synthetic — so `v_weights` is **uniform** and `mspe_v=None` for ALL `v_method`s (cv / inverse_variance included; their selected / closed-form `V` would be inert), with a `UserWarning`; the donor weights / gap / ATT are unaffected. An explicitly pinned `v_cv_t0` that no longer fits the truncated pre-fake window is nulled to the `//2` default for the placebo refit (a pinned value that still fits the truncated window is kept). **Validation:** R `Synth` has no built-in CV function (ADH 2015's CV is a manual `dataprep`+`synth` re-run), so cv is anchored by deterministic equivalence to the R-anchored `custom_v` path (the step-3 validation MSPE of the training-window fit and the step-4 validation-window weights each match a `custom_v=V*` fit on the correspondingly re-aggregated predictors) plus cv self-consistency (`in_time_placebo` under cv == a fresh cv fit on the backdated panel to 1e-7); inverse-variance is anchored bit-for-bit to a `custom_v=1/Var(X)` fit. Documented in `docs/methodology/REGISTRY.md` §SyntheticControl (new `**Note:**` labels for the per-window re-aggregation convention, the flat-MSPE tie-break, and inverse-variance), `docs/api/synthetic_control.rst`, the LLM guides, and `README.md`. The remaining ADH-2015 items (`W^reg` extrapolation diagnostic, sparse-SC subset search) stay tracked in `TODO.md`.
 - **Firpo & Possebom (2018) SCM inference paper review on file (PR-A).** Added `docs/methodology/papers/firpo-possebom-2018-review.md`, a faithful, paper-sourced fidelity review of Firpo & Possebom (2018, *Journal of Causal Inference* 6(2), DOI 10.1515/jci-2016-0026) — the Step-1 artifact for the forthcoming SCM **confidence-set / CI-by-test-inversion** track (PR-B) layered on the existing `SyntheticControl` estimator (classic SCM has no analytical SE; `se`/`p_value`/`conf_int` are NaN). Transcribes (paper-sourced only, no code-deviation verdicts) the benchmark RMSPE-ratio permutation test (Eqs. 4–6), the sensitivity-analysis parametric p-value weights with worst/best-case `φ̲`/`φ̄` (Eqs. 7–9), the sharp-null `RMSPE^f` test (Eqs. 10–13), the **confidence sets by test inversion** (Eq. 14) with the operational constant-effect CI (Eqs. 15–16) and linear-effect CS (Eqs. 17–18), the general test-statistic framework + Monte Carlo size/power of five statistics (Eq. 19, Section 5), and the multiple-outcome FWER (Eqs. 23–24) and multiple-treated-unit pooled (Eqs. 25–26) extensions; the requirements checklist flags the PR-B target (sharp-null test + constant/linear CI + benchmark + one-sided) versus the deferred sensitivity-analysis and multi-outcome/treated extensions. Docs-only; no code change. Registered in `docs/references.rst` (Synthetic Control Method section) and `docs/doc-deps.yaml`; REGISTRY `## SyntheticControl` gains a `firpo-possebom-2018-review.md` reviews-on-file pointer.
+- **`HeterogeneousAdoptionDiD.fit()` fit-time extensive-margin warning + `covariates=` not-implemented pointer.** Two UX additions to the HAD `fit()` surface, with **no change to any estimate or standard error**. (1) The **overall** path now emits a `UserWarning` when a non-trivial fraction (`>= 10%`, a library-convention cutoff in `_HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC`) of units have an exactly-zero post-period dose — a genuine untreated mass for which a standard DiD using those units as controls may be more appropriate (de Chaisemartin et al. 2026, Section 2 / Assumption 3). The paper retains *small* untreated shares (e.g. 12/2954 in Garrett et al., with close-to-nominal coverage), so the 10% cutoff sits ~25× above that; the warning is **overall-path-only** because the event-study path *requires* never-treated units per Appendix B.2. Previously the recommendation surfaced only via `qug_test()`'s zero-dose warning when the user ran the pre-tests. (2) `HeterogeneousAdoptionDiD.fit(covariates=...)` now raises `NotImplementedError` with a pointer to the deferred Appendix B.1 / Theorem 6 covariate-adjusted extension (via an explicit keyword-only `covariates=` param) instead of a bare `TypeError` from an unknown kwarg; pre-residualize the outcome on the covariates as a workaround. Documented in `docs/methodology/REGISTRY.md` §HeterogeneousAdoptionDiD; new tests in `tests/test_had.py` and `tests/test_methodology_had.py`.
 
 ### Fixed
 - **Covariate names that collide with reserved structural terms now raise `ValueError` instead of silently corrupting the coefficient dict (`DifferenceInDifferences`, `MultiPeriodDiD`, `TwoWayFixedEffects`).** These estimators build their `coefficients` dict by zipping a variable-name list -- structural term names PLUS the user covariate column names appended verbatim -- with the fitted coefficient vector. A covariate whose name equaled a reserved structural name (`const`; the treatment/time column names; the `{treatment}:{time}` interaction; MultiPeriodDiD `period_{p}` dummies and `{treatment}:period_{p}` interactions; `TwoWayFixedEffects` `ATT`; fixed-effect / unit / time dummy names; or an internal `_`-prefixed working column such as `_treat_time` / `_did_treatment` / `_treatment_post`) silently **overwrote** that structural coefficient via Python dict last-write-wins -- e.g. a covariate named `const` dropped the intercept -- with no error or warning. A new shared `validate_covariate_names` helper (`diff_diff/utils.py`) is now called in each of the three `fit()` methods before the design matrix is built; it raises `ValueError` on a collision (the comparison is case-sensitive, so e.g. `Const` is still allowed) **and** on duplicate names within `covariates` (which collapse to a single dict entry the same way). Fixed-effect/unit/time dummy reserved names are taken from the same `pd.get_dummies(..., drop_first=True)` call used to build them, so they match exactly (including for pandas `Categorical` columns with a non-default category order). For `TwoWayFixedEffects` the guard fires on **all** variance paths: the default within-transform path returns only `{"ATT": att}` (no covariate is a dict key there), but a covariate named `_treatment_post` would still clobber the internal interaction column, so guarding both paths is uniform and forward-compatible. **Potentially breaking:** a fit that previously *succeeded* with a colliding (or duplicated) covariate name -- silently returning a corrupted coefficient dict -- now raises; rename the covariate column(s). The staggered / influence-function estimators (CallawaySantAnna, SunAbraham, StaggeredTripleDifference, EfficientDiD, TwoStageDiD, ImputationDiD, WooldridgeDiD, dCDH, StackedDiD) key results by `(g, t)` tuples / relative-time indices, never covariate names, and `TripleDifference` / `SyntheticControl` / `SyntheticDiD` do not expose covariates by name, so none are affected. New tests in `tests/test_utils.py`, `tests/test_estimators.py`, and `tests/test_estimators_vcov_type.py`.
diff --git a/TODO.md b/TODO.md
index bba75ee9..c2fb6662 100644
--- a/TODO.md
+++ b/TODO.md
@@ -139,8 +139,6 @@ Deferred items from PR reviews that were not addressed before merge.
 | `HeterogeneousAdoptionDiD` Phase 3 R-parity: Phase 3 ships coverage-rate validation on synthetic DGPs (not tight point parity against `chaisemartin::stute_test` / `yatchew_test`). Tight numerical parity requires aligning bootstrap seed semantics and `B` across numpy/R and is deferred. | `tests/test_had_pretests.py` | Phase 3 | Low |
 | `HeterogeneousAdoptionDiD` Phase 3 nprobust bandwidth for Stute: some Stute variants on continuous regressors use nprobust-style optimal bandwidth selection. Phase 3 uses OLS residuals from a 2-parameter linear fit (no bandwidth selection). nprobust integration is a future enhancement; not in paper scope. | `diff_diff/had_pretests.py::stute_test` | Phase 3 | Low |
 | `HeterogeneousAdoptionDiD` Phase 4: Pierce-Schott (2016) replication harness; reproduce paper Figure 2 values and Table 1 coverage rates. **Waived in tracker-promotion PR (2026-05-20):** R parity at `atol=1e-8` on the same 3 DGPs (`tests/test_did_had_parity.py`) is a strictly stronger correctness anchor than reproducing Figure 2's pointwise CIs on the LBD-restricted PNTR panel; paper Section 5.2 self-acknowledges NP estimators too noisy to be informative there. Table 1 coverage-rate MC would re-verify the CCF asymptotic coverage already pinned by R parity (Python ≡ R ≡ paper). See REGISTRY HAD Deviations Notes #3 / #4 for full scope-caveat statements. Re-open if user demand emerges for an empirical-application replication harness. | `benchmarks/`, `tests/` | Phase 2a | Low |
-| `HeterogeneousAdoptionDiD` `covariates=` kwarg with Theorem 6 multivariate-covariate extension: current behavior is a Python `TypeError` (the `covariates=` kwarg is absent from `HAD.fit()` signature) — fail-closed, but doesn't surface the Theorem 6 future-work pointer to the user. Add an explicit `**kwargs`-trap with `NotImplementedError` and a Theorem 6 / `nprobust` multivariate-NP-regression pointer. ~10 LoC follow-up. | `diff_diff/had.py::HeterogeneousAdoptionDiD.fit` | follow-up | Low |
-| `HeterogeneousAdoptionDiD` extensive-margin / positive-mass-of-untreated warning on the main `fit()` path. Paper recommends warning users with positive zero-dose mass that standard DiD may be more appropriate. Currently surfaced via the `qug_test()` zero-dose `UserWarning` (which only fires when the user runs pre-tests). Add a fit-time `UserWarning` when the panel's post-period dose contains a non-trivial fraction at exactly zero, with a "consider running standard DiD" pointer. Paper-review checklist L191 in `dechaisemartin-2026-review.md` left unchecked pending this addition. | `diff_diff/had.py::HeterogeneousAdoptionDiD.fit` | follow-up | Low |
 | `HeterogeneousAdoptionDiD` time-varying dose on event study: Phase 2b REJECTS panels where `D_{g,t}` varies within a unit for `t >= F` (the aggregation uses `D_{g, F}` as the single regressor for all horizons, paper Appendix B.2 constant-dose convention). A follow-up PR could add a time-varying-dose estimator for these panels; current behavior is front-door rejection with a redirect to `ChaisemartinDHaultfoeuille`. | `diff_diff/had.py::_validate_had_panel_event_study` | Phase 2b | Low |
 | `HeterogeneousAdoptionDiD` repeated-cross-section support: paper Section 2 defines HAD on panel OR repeated cross-section, but Phase 2a is panel-only. RCS inputs (disjoint unit IDs between periods) are rejected by the balanced-panel validator with the generic "unit(s) do not appear in both periods" error. A follow-up PR will add an RCS identification path based on pre/post cell means (rather than unit-level first differences), with its own validator and a distinct `data_mode` / API surface. | `diff_diff/had.py::_validate_had_panel`, `diff_diff/had.py::_aggregate_first_difference` | Phase 2a | Medium |
 | SyntheticDiD: bootstrap cross-language parity anchor against R's default `synthdid::vcov(method="bootstrap")` (refit; rebinds `opts` per draw) or Julia `Synthdid.jl::src/vcov.jl::bootstrap_se` (refit by construction). Same-library validation (placebo-SE tracking, AER §6.3 MC truth) is in place; a cross-language anchor is desirable to bolster the methodology contract. Julia is the cleanest target — minimal wrapping work and refit-native vcov. Tolerance target: 1e-6 on Monte Carlo samples (different BLAS + RNG paths preclude 1e-10). The R-parity fixture from the previous release was deleted because it pinned the now-removed fixed-weight path. | `benchmarks/R/`, `benchmarks/julia/`, `tests/` | follow-up | Low |
diff --git a/diff_diff/guides/llms-full.txt b/diff_diff/guides/llms-full.txt
index 6ccf4a04..b4a89596 100644
--- a/diff_diff/guides/llms-full.txt
+++ b/diff_diff/guides/llms-full.txt
@@ -763,6 +763,7 @@ had.fit(
     *,
     survey_design: SurveyDesign | None = None, # Canonical survey-design kwarg (weights, strata, PSU, FPC)
     trends_lin: bool = False,                  # Eq 17 linear-trend detrending. Requires aggregate="event_study"; needs F>=3 (pre-period depth) for the regression; rejects ALL weighting entry paths (survey_design= / survey= / weights= all raise NotImplementedError under trends_lin).
+    covariates: Any | None = None,             # NOT IMPLEMENTED — non-None raises NotImplementedError (deferred Appendix B.1 / Theorem 6 covariate-adjusted extension; pre-residualize the outcome on covariates as a workaround)
 ) -> HeterogeneousAdoptionDiDResults | HeterogeneousAdoptionDiDEventStudyResults
 ```
 
diff --git a/diff_diff/had.py b/diff_diff/had.py
index a3881a8f..47b63c62 100644
--- a/diff_diff/had.py
+++ b/diff_diff/had.py
@@ -121,6 +121,17 @@
 _MASS_POINT_VCOV_SUPPORTED = ("classical", "hc1")
 _MASS_POINT_VCOV_UNSUPPORTED = ("hc2", "hc2_bm")
 
+# Extensive-margin / positive-untreated-mass warning (TODO L74). The paper (de
+# Chaisemartin et al. 2026, Section 2 / Assumption 3) defines HAD for the case
+# where no genuine untreated group exists, and recommends users with a real
+# untreated mass consider a standard DiD instead. The paper prescribes "warn"
+# but NO numeric cutoff, and explicitly RETAINS small untreated shares (Garrett
+# et al.: 12/2954 ~ 0.4%, with close-to-nominal coverage), so this fit-time
+# UserWarning fires only at/above a library-convention fraction of EXACTLY-zero
+# post-period doses. Overall path ONLY: the event-study path requires never-treated
+# units per Appendix B.2, so an untreated mass is expected there, not a misuse signal.
+_HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC = 0.10
+
 # Target-parameter label per design. Design 1' targets the WAS (Assumption 3);
 # Design 1 targets WAS_{d_lower} (Assumption 5 or 6), which also applies to
 # the mass-point path (paper Section 3.2.4).
@@ -2844,6 +2855,7 @@ def fit(
         *,
         survey_design: Any = None,
         trends_lin: bool = False,
+        covariates: Any = None,
     ) -> Union[HeterogeneousAdoptionDiDResults, HeterogeneousAdoptionDiDEventStudyResults]:
         """Fit the HAD estimator.
 
@@ -2973,6 +2985,14 @@ def fit(
             ``survey`` / ``weights``); raises ``NotImplementedError``
             if combined. Default ``False`` preserves bit-exact
             backcompat with all pre-PR fits.
+        covariates : array-like or None, default None, keyword-only
+            NOT YET IMPLEMENTED. Reserved for the covariate-adjusted HAD
+            identification of de Chaisemartin et al. (2026), Appendix B.1 /
+            Theorem 6 (the multivariate-covariate extension). A non-None
+            value raises ``NotImplementedError`` with a pointer to that
+            extension; pre-residualize the outcome on the covariates before
+            calling ``fit()``, or omit ``covariates=`` for the unconditional
+            WAS estimand.
 
         Returns
         -------
@@ -2984,6 +3004,15 @@ def fit(
             staggered panels auto-filters to the last cohort plus
             never-treated): per-event-time WAS estimates with per-
             horizon arrays.
+
+        Notes
+        -----
+        On the ``aggregate="overall"`` path, ``fit()`` emits a ``UserWarning``
+        when a non-trivial fraction (``>= 10%``, a library convention) of
+        units have exactly-zero post-period dose — a genuine untreated mass
+        for which a standard DiD may be more appropriate (de Chaisemartin
+        et al. 2026, Section 2). The event-study path does not warn: it
+        *requires* never-treated units per Appendix B.2.
         """
         # ---- aggregate / survey_design / survey / weights validation ----
         if aggregate not in _VALID_AGGREGATES:
@@ -2995,6 +3024,26 @@ def fit(
         if n_set > 1:
             raise ValueError(HAD_DUAL_KNOB_MUTEX_MSG_DATA_IN)
 
+        # ---- covariates= future-work trap (TODO L73) ----
+        # Covariate-adjusted HAD identification (de Chaisemartin et al. 2026,
+        # Appendix B.1 / Theorem 6 — the multivariate-covariate extension) is not
+        # implemented. An explicit param + NotImplementedError surfaces the roadmap
+        # (vs the bare TypeError an unexpected kwarg would raise) while leaving
+        # unknown-kwarg typos a normal TypeError. Placed after the survey/weights
+        # mutex and before the event-study dispatch so the single raise covers BOTH
+        # aggregate="overall" and aggregate="event_study".
+        if covariates is not None:
+            raise NotImplementedError(
+                "HeterogeneousAdoptionDiD.fit(covariates=...) is not yet "
+                "implemented. Covariate-adjusted HAD identification (de "
+                "Chaisemartin et al. 2026, Appendix B.1 / Theorem 6 — the "
+                "multivariate-covariate extension) requires a multivariate "
+                "nonparametric regression of dY on (D, X) at the dose boundary, "
+                "which is not derived here. Pre-residualize the outcome on the "
+                "covariates before calling fit(), or omit covariates= for the "
+                "unconditional WAS estimand."
+            )
+
         # ---- trends_lin scope gates (PR #389 / Phase 4 R-parity).
         # `trends_lin=True` implements paper Eq 17 linear-trend detrending
         # (per-group slope from Y[F-1]-Y[F-2], applied to per-event-time
@@ -3129,6 +3178,34 @@ def fit(
             None,
         )
 
+        # ---- Extensive-margin / positive-untreated-mass warning (TODO L74) ----
+        # d_arr is the per-unit post-period dose D_{g,2} (D_{g,1}=0, so dD = D_2);
+        # exactly-zero entries are genuinely untreated units. The `== 0.0` test
+        # mirrors the qug_test `d == 0` convention (had_pretests.py). Fraction-only
+        # (no absolute floor); fires at/above the library-convention cutoff. See the
+        # _HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC definition for the paper rationale.
+        # This runs on the overall path only: the event-study dispatch returns
+        # above, and the event-study path requires never-treated units (App. B.2).
+        n_zero = int((d_arr == 0.0).sum())
+        if n_zero and n_zero / d_arr.shape[0] >= _HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC:
+            frac_zero = n_zero / d_arr.shape[0]
+            warnings.warn(
+                f"{n_zero}/{d_arr.shape[0]} units ({frac_zero:.0%}) have exactly-"
+                f"zero post-period dose. HeterogeneousAdoptionDiD targets a "
+                f"Weighted Average Slope under the assumption that all units "
+                f"receive a positive, heterogeneous dose with no genuine control "
+                f"group (de Chaisemartin et al. 2026, Section 2 / Assumption 3). A "
+                f"substantial untreated mass suggests a genuine extensive margin, "
+                f"where a standard DiD using the untreated units as controls may be "
+                f"more appropriate. (The paper retains small untreated shares, "
+                f"e.g. 12/2954 in Garrett et al., with close-to-nominal coverage; "
+                f"this warning fires only at/above a "
+                f"{_HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC:.0%} library-convention "
+                f"cutoff.)",
+                UserWarning,
+                stacklevel=2,
+            )
+
         # Resolve survey/weights into per-unit weights + optional
         # ResolvedSurveyDesign (for PSU/strata/FPC composition).
         # - `weights=<array>` → per-row array, no PSU/strata composition.
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
index 859541fc..8da65b5a 100644
--- a/docs/methodology/REGISTRY.md
+++ b/docs/methodology/REGISTRY.md
@@ -2929,6 +2929,8 @@ Shipped in `diff_diff/had_pretests.py` as `stute_joint_pretest()` (residuals-in
 - **Note:** Pierce-Schott (2016) Figure 2 replication harness deferred. The paper's empirical application self-acknowledges (Section 5.2; mirrored in `dechaisemartin-2026-review.md:321`) that "NP estimators are too noisy to be informative" on the LBD-restricted PNTR panel. R parity at `atol=1e-8` on 3 DGPs × 5 method combos via `tests/test_did_had_parity.py` (bit-exact, `rtol=0`) is a stronger correctness anchor than reproducing pointwise CIs on LBD-restricted data. **Scope caveat:** R parity locks point estimate, SE, and CI bounds bit-exactly to R's bounds — it does NOT independently verify the asymptotic-coverage properties of the bias-corrected CI in small samples. Paper Table 1 documents under-coverage at small G (89% at G=100 on DGP 1, 93% at G=500, 95% at G=2500); this is inherited from the CCF asymptotic theory itself, and Python is exact-parity with R at the limit-law machinery.
 - **Note:** Table 1 coverage-rate reproduction deferred. Paper Section 3.1.5 reports 2,000-iter Monte Carlo coverage rates at `G ∈ {100, 500, 2500}` on DGPs 1/2/3. The existing `tests/test_did_had_parity.py` R parity at `atol=1e-8` on the same 3 DGPs reproduces the exact point estimate and SE algorithm to bit-exact tolerance; coverage-rate MC would re-verify the CCF asymptotic coverage already pinned by R parity (Python ≡ R ≡ paper) at the sample-mean level. **Scope caveat (mirrors above):** R parity does NOT re-prove asymptotic-coverage at small G; paper Table 1's 89% / 93% / 95% under-coverage band is valid for both R and Python.
 - **Library extension:** Staggered-timing fail-closed. Paper Appendix B.2 prescribes "Warn" when staggered treatment timing is detected; library raises `ValueError` at `diff_diff/had.py:1511` when multiple first-treat cohorts are detected without `first_treat_col`. Library extension toward stricter safety: `UserWarning` would let the silent-misuse bug class through (HAD's Appendix B.2 only identifies the LAST cohort under staggered timing); fail-closed forces the user to either supply `first_treat_col` (which activates auto-filter to last-cohort + never-treated per Appendix B.2) or redirect to `ChaisemartinDHaultfoeuille` (`did_multiplegt_dyn`). Lock in `tests/test_methodology_had.py::TestHADDeviations`.
+- **Note:** Extensive-margin / positive-untreated-mass fit-time warning (library convention). The paper (de Chaisemartin et al. 2026, Section 2 / Assumption 3) defines HAD for the case where no genuine untreated group exists and recommends (Section 4 practitioner checklist) that a user with a positive mass of untreated units consider a standard DiD instead — but it prescribes only "warn" with NO numeric cutoff, and explicitly RETAINS small untreated shares (the Garrett et al. bonus-depreciation application keeps 12 untreated counties out of 2,954 ≈ 0.4%, with simulations showing close-to-nominal coverage even at `f_{D_2}(0) = 0`). The library therefore emits a `UserWarning` at `HeterogeneousAdoptionDiD.fit()` time only when the fraction of units with EXACTLY-zero post-period dose is `>= 0.10` (`_HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC` in `diff_diff/had.py`) — a 10% library-convention cutoff chosen to sit ~25× above the paper's kept 0.4% example, so valid small-share fits are not nagged while a substantial untreated mass is flagged. **Overall path only:** the warning is emitted after the `aggregate="event_study"` dispatch returns, because the event-study path REQUIRES never-treated (zero-dose) units per Appendix B.2 (the last-cohort filter retains them), so an untreated mass is expected there, not a misuse signal. Surfaces the recommendation at fit time rather than only via `qug_test()`'s zero-dose `UserWarning` (which fires only when the user runs the pretests). Lock in `tests/test_methodology_had.py::TestHADDeviations::test_extensive_margin_warning_is_10pct_library_convention`.
+- **Note:** `covariates=` is reserved but NOT implemented. `HeterogeneousAdoptionDiD.fit(covariates=...)` raises `NotImplementedError` — an explicit keyword-only param, so the message points to the deferred extension instead of letting an unknown kwarg surface as a bare `TypeError`. Covariate-adjusted HAD identification is the paper's Appendix B.1 / Theorem 6 multivariate-covariate extension (a multivariate nonparametric regression of ΔY on (D, X) at the dose boundary), which is not derived in the library. Workaround: pre-residualize the outcome on the covariates before calling `fit()`, or omit `covariates=` for the unconditional WAS estimand. Lock in `tests/test_methodology_had.py::TestHADDeviations::test_covariates_not_implemented_is_documented`.
 
 **Requirements checklist (tracks implementation phase completion):**
 - [x] Phase 1a: Epanechnikov / triangular / uniform kernels with closed-form `κ_k` constants (`diff_diff/local_linear.py`).
@@ -2978,7 +2980,7 @@ Shipped in `diff_diff/had_pretests.py` as `stute_joint_pretest()` (residuals-in
 - [x] Phase 5 (wave 2 second slice): T22 weighted/survey HAD tutorial (`docs/tutorials/22_had_survey_design.ipynb`) - shipped as the follow-up to PR #432. End-to-end walkthrough of `HeterogeneousAdoptionDiD` + `did_had_pretest_workflow` under `SurveyDesign(weights, strata, psu, fpc)` on a BRFSS-shape state-rollout panel (5 strata x 6 PSUs/stratum x 2 states/PSU = 60 states; post-stratification raking weights with CV ~ 0.30; FPC = 30 PSUs/stratum). Companion drift-test file `tests/test_t22_had_survey_design_drift.py` (32 tests pinning panel composition, naive-vs-survey SE inflation direction, design auto-detection, event-study cband-vs-pointwise width ordering, `_QUG_DEFERRED_SUFFIX` substring on `report.verdict` for both overall and event-study paths, the distinct `report.summary()` QUG-skip note on the event-study path, deterministic Yatchew sigma2_*, bootstrap p-value anchored windows of total width 0.30 (± 0.15 around seeded centers) per `feedback_strata_bootstrap_path_divergence`, workflow-surface separation between overall and event-study paths, and the weighted point-estimation contract via the `_fit_continuous` algebraic identity).
 - [x] Documentation of non-testability of Assumptions 5 and 6. **Closed 2026-05-20:** `HeterogeneousAdoptionDiD` class docstring carries a "Non-testable assumptions (paper Section 3.1.2)" Notes block; `qug_test` / `stute_test` / `yatchew_hr_test` / `did_had_pretest_workflow` Notes sections carry "Scope (what this test does NOT cover)" clauses explicitly stating they verify ADJACENT identifying conditions (QUG: support-infimum null `d_lower = 0`; Stute / Yatchew: Assumption 8 linearity; `joint_pretrends_test`: Assumption 7 mean-independence) and CANNOT test Assumptions 5 or 6. The composite workflow verdict string does NOT mention Assumptions 5 or 6 — it only flags the Assumption 7 step-2 gap on the two-period `aggregate="overall"` path. The Assumption 5/6 non-testability caveat is surfaced separately by (a) `HAD.fit()`'s fit-time `UserWarning` in `diff_diff/had.py` (search for "---- Assumption 5/6 warning on Design 1 paths ----") which fires whenever the resolved design is Design 1 family (`continuous_near_d_lower` or `mass_point`), and (b) T21 (HAD pretest workflow tutorial) tutorial prose.
 - [x] Warnings for staggered treatment timing (redirect to `ChaisemartinDHaultfoeuille`). **Closed 2026-05-20:** fail-closed `ValueError` at `diff_diff/had.py:1511` (see Deviations § "Library extension: Staggered-timing fail-closed" for the rationale on raising vs warning).
-- [ ] `NotImplementedError` phase pointer when `covariates=` is passed (Theorem 6 future work). **Status 2026-05-20:** current behavior is a Python `TypeError` (the `covariates=` kwarg is not in the `HAD.fit()` signature). Adding an explicit `**kwargs`-trap with `NotImplementedError` and a Theorem 6 pointer is a follow-up PR; tracked in `TODO.md` as Low priority — the existing TypeError is fail-closed.
+- [x] `NotImplementedError` phase pointer when `covariates=` is passed (Theorem 6 future work). **Closed 2026-06-01:** `HAD.fit()` now takes an explicit keyword-only `covariates=None` param and raises `NotImplementedError` (with the Appendix B.1 / Theorem 6 multivariate-covariate-extension pointer + a pre-residualization workaround) when it is not None, replacing the prior bare `TypeError` from the absent kwarg. See the `- **Note:**` ("`covariates=` is reserved but NOT implemented") above and `diff_diff/had.py::HeterogeneousAdoptionDiD.fit`; locked by `tests/test_methodology_had.py::TestHADDeviations::test_covariates_not_implemented_is_documented`.
 
 ---
 
diff --git a/docs/methodology/papers/dechaisemartin-2026-review.md b/docs/methodology/papers/dechaisemartin-2026-review.md
index c943e382..375a3e1a 100644
--- a/docs/methodology/papers/dechaisemartin-2026-review.md
+++ b/docs/methodology/papers/dechaisemartin-2026-review.md
@@ -188,7 +188,7 @@ Alternative to Stute when `G` is large or heteroskedasticity is suspected.
 - [x] Yatchew heteroskedasticity-robust linearity test. **Phase 3 implementation (2026-04):** `yatchew_hr_test()` in `diff_diff/had_pretests.py`. Test statistic `T_hr = sqrt(G)·(σ²_lin - σ²_diff)/σ²_W` from paper Equation 29. `σ²_diff` normalizes by `2G` (paper-literal), NOT `2(G-1)` (finite-sample equivalent but tests pin the paper-literal form). Standard-normal critical value, one-sided.
 - [x] Composite workflow `did_had_pretest_workflow()` (paper Section 4.2-4.3). **Phase 3 implementation (2026-04):** `aggregate="overall"` (default, two-period) runs QUG + Stute + Yatchew on a two-period panel; step 2 is NOT run on this path because a two-period panel has no pre-period placebo horizon. **Phase 3 follow-up (2026-04):** `aggregate="event_study"` (multi-period) runs QUG at F + joint pre-trends Stute + joint homogeneity-linearity Stute; closes the paper step-2 gap.
 - [x] Warnings for staggered treatment timing (direct users to existing `ChaisemartinDHaultfoeuille` in diff-diff). **Phase 4 closure (2026-05-20):** fail-closed `ValueError` at `diff_diff/had.py:1511` when multiple first-treat cohorts are detected without `first_treat_col`; the error message directs the user to either supply `first_treat_col` (which activates the last-cohort + never-treated auto-filter per Appendix B.2) or to use `ChaisemartinDHaultfoeuille` (`did_multiplegt_dyn`) for full staggered support. The fail-closed choice (over `UserWarning`) is documented in REGISTRY Deviations § "Staggered-timing fail-closed" as a library extension toward stricter safety than the paper's "Warn" prescription.
-- [ ] Warnings for extensive-margin effects / positive mass of untreated (not fatal; suggests running existing DiD). **Status 2026-05-20 (partial):** `qug_test()` filters zero-dose observations upfront with a `UserWarning` naming the exclusion count — surfaces the *presence* of extensive-margin / positive-mass-of-untreated units to users running pre-tests. The paper-language "suggests running existing DiD" recommendation is NOT a separate fit-time warning on the main `HeterogeneousAdoptionDiD.fit()` path; this item remains open as a Low-priority follow-up tracked in `TODO.md`.
+- [x] Warnings for extensive-margin effects / positive mass of untreated (not fatal; suggests running existing DiD). **Closed 2026-06-01:** `HeterogeneousAdoptionDiD.fit()` now emits a fit-time `UserWarning` on the **overall** path when `>= 10%` of units have an exactly-zero post-period dose — pointing the user to a standard DiD per the Section 4 recommendation. The 10% cutoff is a library convention (the paper prescribes "warn" with NO numeric threshold and explicitly retains small untreated shares, e.g. Garrett et al.'s 12/2954 ≈ 0.4% with close-to-nominal coverage), chosen ~25× above that kept example. Overall-path-only because the event-study path *requires* never-treated units per Appendix B.2 (so an untreated mass is expected there, not a misuse signal). This complements the pre-existing `qug_test()` zero-dose `UserWarning`, which surfaces the *presence* of extensive-margin / positive-mass-of-untreated units only when the user runs the pre-tests. Documented in REGISTRY § HeterogeneousAdoptionDiD ("Note (Extensive-margin / positive-untreated-mass fit-time warning)"); locked by `tests/test_methodology_had.py::TestHADDeviations::test_extensive_margin_warning_is_10pct_library_convention`.
 - [x] Documentation of non-testability of Assumptions 5 and 6. **Phase 4 closure (2026-05-20):** `HeterogeneousAdoptionDiD.fit()` emits a `UserWarning` at fit time when `resolved_design ∈ {continuous_near_d_lower, mass_point}` (Design 1 family) explicitly flagging that point identification of `WAS_{d_lower}` requires Assumption 6, sign identification requires Assumption 5, and NEITHER is testable via pre-trends (`diff_diff/had.py`, search for "---- Assumption 5/6 warning on Design 1 paths ----"). The `HeterogeneousAdoptionDiD` class docstring + `qug_test` / `stute_test` / `yatchew_hr_test` / `did_had_pretest_workflow` Notes sections cross-reference this and explicitly state that the available pre-tests verify ADJACENT identifying conditions: QUG tests the Theorem 4 / Design 1' support-infimum null `d_lower = 0` — adjacent evidence on the `d_lower = 0` clause of Assumption 4 only, NOT a test of full Assumption 4's boundary-density / conditional-mean smoothness / variance regularity statement; the raw `stute_test` / `yatchew_hr_test` helpers test Assumption 8 linearity (residuals from `dy ~ 1 + d`); `joint_pretrends_test` tests Assumption 7 mean-independence (intercept-only residuals via `null_form="mean_independence"`). None of these test Assumptions 5 or 6 directly. The composite workflow verdict string does NOT mention Assumptions 5 or 6 — it only flags the Assumption 7 step-2 gap on the two-period `aggregate="overall"` path. The Assumption 5/6 caveat is surfaced separately by the Design 1 fit-time `UserWarning` and by T21 tutorial prose.
 - [x] Multi-period event-study extension (Appendix B.2). **Phase 2b implementation (2026-04):** `aggregate="event_study"` returns per-event-time WAS estimates using uniform `F-1` anchor. Staggered-timing contract (see L190 closure for full statement): when `first_treat_col` is supplied, the panel auto-filters to last-cohort + never-treated units with a `UserWarning` per Appendix B.2 prescription; when omitted on a multi-cohort panel, the estimator raises `ValueError` (fail-closed, see REGISTRY § "Library extension: Staggered-timing fail-closed"). Pointwise CIs per horizon (no joint cross-horizon covariance; matches paper's Pierce-Schott Figure 2). Pre-period placebos at `e <= -2`; the anchor `e = -1` is skipped since `ΔY = 0` there by construction.
 - [x] Joint Stute tests (paper Section 4.2 step 2 + Section 4.3 joint extension, pages 23-25 + 32). **Phase 3 follow-up (2026-04):** `stute_joint_pretest()` (residuals-in core) + `joint_pretrends_test()` (mean-independence null) + `joint_homogeneity_test()` (linearity null) in `diff_diff/had_pretests.py`. Sum-of-CvMs aggregation, shared-η Mammen wild bootstrap across horizons (Delgado-Manteiga 2001), per-horizon exact-linear short-circuit. **Eq (18) linear-trend detrending variant SHIPPED (PR #389):** the `trends_lin: bool = False` keyword-only kwarg on `HeterogeneousAdoptionDiD.fit(aggregate="event_study")`, `joint_pretrends_test`, and `joint_homogeneity_test` applies the per-group linear-trend slope `Y[g, F-1] - Y[g, F-2]` adjustment. R parity validated against `DIDHAD::did_had(..., trends_lin=TRUE)` v2.0.0 (`Credible-Answers/did_had`) — see REGISTRY § "Note (Phase 4 — Eq 17 / Eq 18 linear-trend detrending shipped)". The Pierce-Schott (2016) NUMERICAL REPLICATION against the published p=0.51 anchor on the LBD-restricted panel is waived per REGISTRY Deviations Note #3.
diff --git a/tests/test_had.py b/tests/test_had.py
index 2e2273c7..e9e187c2 100644
--- a/tests/test_had.py
+++ b/tests/test_had.py
@@ -5635,3 +5635,136 @@ def test_mass_point_default_vcov_robust_true_survey_allowed(self):
             r = est.fit(panel, "outcome", "dose", "period", "unit", survey=sd)
         assert r.vcov_type == "hc1"
         assert r.variance_formula == "survey_binder_tsl_2sls"
+
+
+# =============================================================================
+# TODO L74: extensive-margin / positive-untreated-mass fit-time warning
+# =============================================================================
+
+_EXTENSIVE_MARGIN_SUBSTR = "exactly-zero post-period dose"
+
+
+def _panel_with_zero_fraction(G, n_zero, seed=0):
+    """continuous_at_zero 2-period panel with EXACTLY ``n_zero`` zero post doses.
+
+    The positive interior is drawn from Uniform(0.2, 1.0) so no accidental
+    zeros sneak in — the exactly-zero fraction is precisely ``n_zero / G``.
+    """
+    rng = np.random.default_rng(seed)
+    d = rng.uniform(0.2, 1.0, G)
+    d[:n_zero] = 0.0
+    dy = 0.3 * d + 0.1 * rng.standard_normal(G)
+    return _make_panel(d, dy)
+
+
+class TestExtensiveMarginWarning:
+    """The overall ``fit()`` path warns above a 10% exactly-zero-dose cutoff.
+
+    Locks the TODO L74 fit-time UserWarning: HAD targets a WAS assuming no
+    genuine untreated group, so a substantial exactly-zero (untreated) mass
+    suggests a real extensive margin where standard DiD may be preferable.
+    """
+
+    def test_fires_above_threshold(self):
+        # 40/200 = 20% exactly-zero -> warning fires.
+        panel = _panel_with_zero_fraction(200, 40, seed=0)
+        with pytest.warns(UserWarning, match=_EXTENSIVE_MARGIN_SUBSTR):
+            HeterogeneousAdoptionDiD().fit(panel, "outcome", "dose", "period", "unit")
+
+    def test_fires_exactly_at_threshold(self):
+        # 20/200 = 10% exactly -> the >= cutoff fires at the boundary.
+        panel = _panel_with_zero_fraction(200, 20, seed=1)
+        with pytest.warns(UserWarning, match=_EXTENSIVE_MARGIN_SUBSTR):
+            HeterogeneousAdoptionDiD().fit(panel, "outcome", "dose", "period", "unit")
+
+    def test_message_names_count_and_pct(self):
+        panel = _panel_with_zero_fraction(200, 40, seed=0)
+        with pytest.warns(UserWarning) as rec:
+            HeterogeneousAdoptionDiD().fit(panel, "outcome", "dose", "period", "unit")
+        msgs = [str(w.message) for w in rec if _EXTENSIVE_MARGIN_SUBSTR in str(w.message)]
+        assert len(msgs) == 1
+        # Names the count/total and percentage, and points to standard DiD.
+        assert "40/200" in msgs[0]
+        assert "20%" in msgs[0]
+        assert "standard DiD" in msgs[0]
+
+    def test_no_fire_all_positive(self):
+        # No exactly-zero units -> no extensive-margin warning.
+        panel = _panel_with_zero_fraction(200, 0, seed=2)
+        with warnings.catch_warnings(record=True) as rec:
+            warnings.simplefilter("always")
+            HeterogeneousAdoptionDiD().fit(panel, "outcome", "dose", "period", "unit")
+        assert not any(_EXTENSIVE_MARGIN_SUBSTR in str(w.message) for w in rec)
+
+    def test_no_fire_just_below_threshold(self):
+        # 19/200 = 9.5% < 10% -> no warning (boundary no-fire).
+        panel = _panel_with_zero_fraction(200, 19, seed=3)
+        with warnings.catch_warnings(record=True) as rec:
+            warnings.simplefilter("always")
+            HeterogeneousAdoptionDiD().fit(panel, "outcome", "dose", "period", "unit")
+        assert not any(_EXTENSIVE_MARGIN_SUBSTR in str(w.message) for w in rec)
+
+    def test_event_study_with_never_treated_does_not_warn(self):
+        # Scope lock: the event-study path REQUIRES never-treated units
+        # (Appendix B.2), so a 20% never-treated mass must NOT trip the
+        # overall-path extensive-margin warning. The warning code sits after
+        # the event-study dispatch returns, so it is structurally unreachable
+        # here; this test guards against a future relocation regressing that.
+        rng = np.random.default_rng(4)
+        d_at_F = rng.uniform(0.2, 1.0, 200)
+        d_at_F[:40] = 0.0  # 20% never-treated (dose 0 at every period)
+        panel = _make_multi_period_panel(d_at_F, n_periods=5, F=3, seed=4)
+        with warnings.catch_warnings(record=True) as rec:
+            warnings.simplefilter("always")
+            _fit_es(
+                HeterogeneousAdoptionDiD(),
+                panel,
+                "outcome",
+                "dose",
+                "period",
+                "unit",
+            )
+        assert not any(_EXTENSIVE_MARGIN_SUBSTR in str(w.message) for w in rec)
+
+
+class TestCovariatesTrap:
+    """TODO L73: ``fit(covariates=...)`` raises NotImplementedError.
+
+    Covariate-adjusted HAD (de Chaisemartin et al. 2026, Appendix B.1 /
+    Theorem 6) is not implemented; the explicit param surfaces the roadmap
+    instead of a bare ``TypeError`` from an unknown kwarg.
+    """
+
+    def test_covariates_raises_overall(self):
+        d, dy = _dgp_continuous_at_zero(200, seed=0)
+        panel = _make_panel(d, dy)
+        with pytest.raises(NotImplementedError, match="Appendix B.1"):
+            HeterogeneousAdoptionDiD().fit(
+                panel, "outcome", "dose", "period", "unit", covariates=["x"]
+            )
+
+    def test_covariates_raises_event_study(self):
+        # Raises before the event-study dispatch, so any panel suffices.
+        d, dy = _dgp_continuous_at_zero(200, seed=0)
+        panel = _make_panel(d, dy)
+        with pytest.raises(NotImplementedError, match="multivariate"):
+            HeterogeneousAdoptionDiD().fit(
+                panel,
+                "outcome",
+                "dose",
+                "period",
+                "unit",
+                aggregate="event_study",
+                covariates=["x"],
+            )
+
+    def test_covariates_none_default_does_not_raise(self):
+        # The default covariates=None preserves the pre-PR fit path.
+        d, dy = _dgp_continuous_at_zero(400, seed=0)
+        panel = _make_panel(d, dy)
+        with warnings.catch_warnings():
+            warnings.simplefilter("ignore", UserWarning)
+            r = HeterogeneousAdoptionDiD().fit(
+                panel, "outcome", "dose", "period", "unit", covariates=None
+            )
+        assert np.isfinite(r.att)
diff --git a/tests/test_methodology_had.py b/tests/test_methodology_had.py
index 93da68df..baa4ba02 100644
--- a/tests/test_methodology_had.py
+++ b/tests/test_methodology_had.py
@@ -1273,3 +1273,67 @@ def _make_event_study_panel(rng: np.random.Generator, G: int) -> pd.DataFrame:
                     }
                 )
         return pd.DataFrame(rows)
+
+    @staticmethod
+    def _zero_fraction_panel(n_zero: int, seed: int, G: int = 200) -> pd.DataFrame:
+        """continuous_at_zero 2-period panel with EXACTLY ``n_zero`` zero doses."""
+        rng = np.random.default_rng(seed)
+        d = rng.uniform(0.2, 1.0, G)
+        d[:n_zero] = 0.0
+        dy = 0.3 * d + 0.1 * rng.standard_normal(G)
+        units = np.repeat(np.arange(G), 2)
+        periods = np.tile([1, 2], G)
+        dose = np.column_stack([np.zeros(G), d]).ravel()
+        outcome = np.column_stack([np.zeros(G), dy]).ravel()
+        return pd.DataFrame({"unit": units, "period": periods, "dose": dose, "outcome": outcome})
+
+    def test_extensive_margin_warning_is_10pct_library_convention(self) -> None:
+        """Locks TODO L74: the extensive-margin warning is a 10% library convention.
+
+        The paper (de Chaisemartin et al. 2026, Section 2 / Assumption 3)
+        prescribes warning users when there is a positive untreated mass but
+        gives NO numeric cutoff, and explicitly RETAINS small untreated shares
+        (Garrett et al. 12/2954 ~ 0.4%, close-to-nominal coverage). The library picks a 10%
+        exactly-zero-dose fraction as the fire threshold — documented in
+        REGISTRY § HeterogeneousAdoptionDiD. This pins both the constant and
+        the fire/no-fire boundary so the convention cannot drift silently.
+        """
+        from diff_diff.had import _HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC
+
+        assert _HAD_EXTENSIVE_MARGIN_ZERO_DOSE_FRAC == 0.10
+
+        substr = "exactly-zero post-period dose"
+        # At/above 10% (20/200) -> fires.
+        with pytest.warns(UserWarning, match=substr):
+            HeterogeneousAdoptionDiD().fit(
+                self._zero_fraction_panel(20, seed=_BASE_SEED_DEVIATIONS + 10),
+                "outcome",
+                "dose",
+                "period",
+                "unit",
+            )
+        # Just below 10% (19/200 = 9.5%) -> does not fire.
+        with warnings.catch_warnings(record=True) as rec:
+            warnings.simplefilter("always")
+            HeterogeneousAdoptionDiD().fit(
+                self._zero_fraction_panel(19, seed=_BASE_SEED_DEVIATIONS + 11),
+                "outcome",
+                "dose",
+                "period",
+                "unit",
+            )
+        assert not any(substr in str(w.message) for w in rec)
+
+    def test_covariates_not_implemented_is_documented(self) -> None:
+        """Locks TODO L73: fit(covariates=...) raises NotImplementedError.
+
+        Covariate-adjusted HAD identification (de Chaisemartin et al. 2026,
+        Appendix B.1 / Theorem 6) is deferred; the explicit ``covariates=``
+        param raises NotImplementedError with the paper pointer rather than a
+        bare TypeError. Documented in REGISTRY § HeterogeneousAdoptionDiD.
+        """
+        panel = self._zero_fraction_panel(1, seed=_BASE_SEED_DEVIATIONS + 12)
+        with pytest.raises(NotImplementedError, match="Theorem 6"):
+            HeterogeneousAdoptionDiD().fit(
+                panel, "outcome", "dose", "period", "unit", covariates=["x1"]
+            )