Skip to content

Commit 75ff0e2

Browse files
igerberclaude
andcommitted
Address PR #456 R7 review (1 P2 perf + 1 P2 docs)
P2 perf: extend the trigger_onset_per_row precomputation to the non- staggered event-study branch. Previously only the staggered fit() call populated `trigger_onset_per_row_cached`; the non-staggered branch left it as None, so `_compute_event_time_per_row` fell back to its dense cohort loop AGAIN, defeating the sparse cutoff in `_compute_nearest_treated_distance_static` and adding avoidable cost on large non-staggered panels. Fix: on the non-staggered branch, derive trigger_onset_per_row directly from `unit_to_d` (already built from the static distance result) and the single shared effective onset. In the non-staggered case the trigger collapses to a constant: any unit within `d_bar` triggers at `shared_effective_onset`; far-away units have NaN trigger. No second pairwise pass needed. P2 docs: remove the unwired `diagnostic_report.event_study_diagnostics` consumability claim from CHANGELOG, llms-full.txt, results.py docstring, REGISTRY.md, and api/spillover.rst. SpilloverDiDResults is NOT registered in DiagnosticReport's `_APPLICABILITY` / `_PT_METHOD` tables, so `DiagnosticReport(spillover_result)` does not route to event-study diagnostics. `plot_event_study` integration IS wired and keeps its claim. Added a TODO row tracking the deferred DiagnosticReport routing. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent cfb822e commit 75ff0e2

7 files changed

Lines changed: 32 additions & 6 deletions

File tree

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
Large diffs are not rendered by default.

TODO.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -136,6 +136,7 @@ Deferred items from PR reviews that were not addressed before merge.
136136
| `SpilloverDiD` T22 TVA tutorial (`docs/tutorials/22_spillover_did.ipynb`): synthetic TVA-style DGP reproducing Butts (2021) Section 4 Table 1 Panel A bias-correction direction (~40% understatement). Split from the methodology PR per user-confirmed scope split (2026-05-15). | `docs/tutorials/`, `tests/test_t22_*_drift.py` | follow-up (Wave B) | Medium |
137137
| Extend `TwoStageDiD` with Conley vcov as a first-class feature (mirrors Wave A's TWFE/MPD/DiD extension). Currently `TwoStageDiD.__init__` lacks `vcov_type` / `conley_*` kwargs; `SpilloverDiD` works around this by threading Conley directly via `solve_ols` at stage 2. Promoting Conley to TwoStageDiD's API removes the workaround and lets non-spillover users access Conley + Gardner two-stage. | `diff_diff/two_stage.py` | follow-up | Medium |
138138
| `SpilloverDiD` sparse cKDTree path for the staggered nearest-treated-distance helper (mirrors the static helper's sparse branch). Currently `_compute_nearest_treated_distance_staggered` always builds dense `(n_units, n_treated_by_onset)` pairwise distance matrices per cohort; on large staggered panels with many cohorts this is avoidable memory/runtime. Add a sparse k-d-tree branch analogous to `_compute_nearest_treated_distance_sparse`, gated on `n > _CONLEY_SPARSE_N_THRESHOLD`. | `spillover.py::_compute_nearest_treated_distance_staggered` | follow-up (Wave B) | Low |
139+
| `SpilloverDiDResults` in `DiagnosticReport` dispatch tables. Wave C event-study emits a TwoStageDiD-compatible `event_study_effects: Dict[int, Dict]` alias that `plot_event_study` consumes via the new `reference_period` attribute fallback in `_extract_plot_data`, but `SpilloverDiDResults` is NOT registered in `DiagnosticReport`'s `_APPLICABILITY` / `_PT_METHOD` tables — so `DiagnosticReport(spillover_result)` doesn't currently route to event-study diagnostics. Registering requires (a) deciding which diagnostics apply (parallel trends, pre-trends power, heterogeneity, design-effect) AND (b) adding an end-to-end test. | `diff_diff/diagnostic_report.py::_APPLICABILITY`, `_PT_METHOD` | follow-up (Wave C) | Low |
139140

140141
#### Performance
141142

diff_diff/guides/llms-full.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -502,7 +502,7 @@ sp.fit(
502502

503503
- `covariates=` raises `NotImplementedError`. Gardner two-stage requires covariate effects estimated on the untreated-and-unexposed Omega_0 subsample at stage 1; appending raw covariates only at stage 2 silently biases `tau_total` / `delta_j` on panels with time-varying covariates. Planned follow-up.
504504
- `survey_design=` raises `NotImplementedError` (planned: SurveyDesign integration)
505-
- `event_study=True` SHIPPED (Wave C): emits per-event-time `tau_k` and per-(ring, event-time) `delta_jk` as `att_dynamic: pd.DataFrame` (indexed by event-time `k`) plus MultiIndex `spillover_effects: pd.DataFrame` (levels `(ring_label, event_time)`). TwoStageDiD-compatible `event_study_effects: Dict[int, Dict]` alias also emitted for `plot_event_study` / `diagnostic_report.event_study_diagnostics` consumption (schema: `{k: {"effect", "se", "n_obs", "t_stat", "p_value", "conf_int": (low, high)}}` mirroring `two_stage.py:1355-1389`). Reference period `ref_period = -1 - anticipation` (TwoStageDiD `two_stage.py:486` convention); reference row uses `coef=0.0, se=0.0, n_obs=0, conf_int=(0.0, 0.0)`. Scalar `att` field becomes a sample-share-weighted average of post-treatment `tau_k` (`att = sum_{k>=0} w_k * tau_k` with `w_k = n_treated_at_k / total`) with SE from linear-combination inference `Var(att) = w' V_subset w` on the post-treatment vcov block — no separate fit. **Two-clock K_it:** direct-effect clock is `K_direct = t - effective_first_treat(i)` for ever-treated rows; spillover clock is `K_spill = t - earliest-in-range-cohort-onset(i)` (running min across activated cohorts, NaN pre-trigger). `K_spill >= 0` structurally; negative-k spillover cells are rectangularly emitted with `coef = NaN, n_obs = 0`. **`horizon_max` semantics:** bins event-times outside `[-H, +H]` into endpoint pools (no observations dropped — divergence from TwoStageDiD which filters; intentional, per `feedback_no_silent_failures`). With `horizon_max=None`, auto-detects bin set from observed K. **Validation:** `horizon_max < 0` raises `ValueError`; `ref_period < -horizon_max` (i.e., `anticipation > horizon_max - 1`) raises `ValueError` — silently floor-shifting the reference would change identification. **Reduce-to-aggregate:** under constant-tau DGP with `horizon_max=None`, the share-weighted scalar `att` reproduces Wave B's aggregate bit-identically. **Note:** `horizon_max=0` does NOT reduce to Wave B (binning collapses pre-treatment K values to `k=0`, making `D^0 = D_i` ever-treated indicator rather than `D_it`). Per-event-time SEs share the same Wave B Gardner-GMM caveat (biased downward by a few percent; Wave D follow-up).
505+
- `event_study=True` SHIPPED (Wave C): emits per-event-time `tau_k` and per-(ring, event-time) `delta_jk` as `att_dynamic: pd.DataFrame` (indexed by event-time `k`) plus MultiIndex `spillover_effects: pd.DataFrame` (levels `(ring_label, event_time)`). TwoStageDiD-compatible `event_study_effects: Dict[int, Dict]` alias also emitted for `plot_event_study` consumption — `_extract_plot_data` prefers the new `reference_period` attribute over the legacy `n_obs==0` heuristic. (DiagnosticReport integration: NOT yet wired; queued as a follow-up.) (schema: `{k: {"effect", "se", "n_obs", "t_stat", "p_value", "conf_int": (low, high)}}` mirroring `two_stage.py:1355-1389`). Reference period `ref_period = -1 - anticipation` (TwoStageDiD `two_stage.py:486` convention); reference row uses `coef=0.0, se=0.0, n_obs=0, conf_int=(0.0, 0.0)`. Scalar `att` field becomes a sample-share-weighted average of post-treatment `tau_k` (`att = sum_{k>=0} w_k * tau_k` with `w_k = n_treated_at_k / total`) with SE from linear-combination inference `Var(att) = w' V_subset w` on the post-treatment vcov block — no separate fit. **Two-clock K_it:** direct-effect clock is `K_direct = t - effective_first_treat(i)` for ever-treated rows; spillover clock is `K_spill = t - earliest-in-range-cohort-onset(i)` (running min across activated cohorts, NaN pre-trigger). `K_spill >= 0` structurally; negative-k spillover cells are rectangularly emitted with `coef = NaN, n_obs = 0`. **`horizon_max` semantics:** bins event-times outside `[-H, +H]` into endpoint pools (no observations dropped — divergence from TwoStageDiD which filters; intentional, per `feedback_no_silent_failures`). With `horizon_max=None`, auto-detects bin set from observed K. **Validation:** `horizon_max < 0` raises `ValueError`; `ref_period < -horizon_max` (i.e., `anticipation > horizon_max - 1`) raises `ValueError` — silently floor-shifting the reference would change identification. **Reduce-to-aggregate:** under constant-tau DGP with `horizon_max=None`, the share-weighted scalar `att` reproduces Wave B's aggregate bit-identically. **Note:** `horizon_max=0` does NOT reduce to Wave B (binning collapses pre-treatment K values to `k=0`, making `D^0 = D_i` ever-treated indicator rather than `D_it`). Per-event-time SEs share the same Wave B Gardner-GMM caveat (biased downward by a few percent; Wave D follow-up).
506506
- Stage-2 variance is `solve_ols` HC1 / Conley / cluster — Gardner GMM first-stage uncertainty correction NOT applied (planned follow-up; SE is biased downward / too small, CIs too narrow, p-values too small — treat reported significance conservatively until the GMM correction lands)
507507
- Only nearest-treated rings supported; `ring_method="count"` (count of treated neighbors in ring) not yet exposed
508508

diff_diff/results.py

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -415,7 +415,11 @@ class SpilloverDiDResults(DiDResults):
415415
# Includes the reference period row with coef=0.0, se=0.0, n_obs=0.
416416
event_study_effects: Optional[Dict[int, Dict[str, Any]]] = field(default=None)
417417
# TwoStageDiD-compatible alias for ``att_dynamic`` consumable by
418-
# ``plot_event_study`` and ``diagnostic_report.event_study_diagnostics``.
418+
# ``plot_event_study`` (wired in Wave C via the ``reference_period``
419+
# attribute fallback in ``_extract_plot_data``). ``DiagnosticReport``
420+
# routing is NOT yet wired — registering ``SpilloverDiDResults`` in
421+
# ``DiagnosticReport``'s applicability/method tables is a planned
422+
# follow-up (see TODO.md).
419423
# Schema mirrors ``two_stage.py:1355-1389``:
420424
# {k: {"effect", "se", "n_obs", "t_stat", "p_value", "conf_int": (low, high)}}
421425
# Reference row uses ``conf_int = (0.0, 0.0)`` (TwoStageDiD parity).

diff_diff/spillover.py

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2272,6 +2272,25 @@ def fit(
22722272
np.inf,
22732273
d_it_per_row,
22742274
)
2275+
# PR #456 R7 perf fix: derive trigger_onset_per_row directly
2276+
# from the static distance result for the event-study path. In
2277+
# the non-staggered case there's only one cohort onset, so the
2278+
# trigger collapses to the shared effective onset for any unit
2279+
# within d_bar (NaN for far-away units). Avoids hitting
2280+
# `_compute_event_time_per_row`'s dense-fallback cohort loop.
2281+
if self.event_study:
2282+
d_per_unit_inrange = np.array(
2283+
[
2284+
(
2285+
shared_effective_onset
2286+
if unit_to_d.get(u, np.inf) <= self._effective_d_bar
2287+
else np.nan
2288+
)
2289+
for u in unit_vals
2290+
],
2291+
dtype=np.float64,
2292+
)
2293+
trigger_onset_per_row_cached = d_per_unit_inrange
22752294

22762295
# Step 6: build ring indicators per row (Butts Eq 6 time-varying form).
22772296
ring_masks = _build_ring_indicators(d_it_per_row, list(self.rings))

docs/api/spillover.rst

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -190,8 +190,10 @@ planned as follow-up enhancements:
190190
event-time) spillover effects ``delta_jk`` as a ``att_dynamic``
191191
DataFrame plus MultiIndex ``spillover_effects``. The
192192
``event_study_effects: Dict[int, Dict]`` alias mirrors
193-
``TwoStageDiD``'s schema for ``plot_event_study`` /
194-
``diagnostic_report.event_study_diagnostics`` consumption. Reference
193+
``TwoStageDiD``'s schema for ``plot_event_study`` consumption (the
194+
plotter prefers the new ``reference_period`` attribute over the
195+
legacy ``n_obs==0`` heuristic). ``DiagnosticReport`` routing for
196+
``SpilloverDiDResults`` is queued as a follow-up. Reference
195197
period ``-1 - anticipation`` (TwoStageDiD parity). ``horizon_max``
196198
bins event-times into endpoint pools (no row drop — divergence
197199
from TwoStageDiD's filtering semantic, intentional per

docs/methodology/REGISTRY.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2996,7 +2996,7 @@ where `D_it` is the treatment indicator (1 if unit `i` is treated by time `t`) a
29962996

29972997
### Event-study mode (Wave C, `event_study=True`)
29982998

2999-
Replaces the aggregate spec with the per-event-time × ring decomposition from Butts Section 5 / Table 2. Direct effects `tau_k` and per-(ring, event-time) spillover effects `delta_jk` are emitted in `att_dynamic` and a MultiIndex `spillover_effects`. A TwoStageDiD-compatible `event_study_effects: Dict[int, Dict]` alias (mirroring `two_stage.py:1355-1389` schema with `conf_int = (low, high)` tuple) is also emitted for consumption by `plot_event_study` and `diagnostic_report.event_study_diagnostics`.
2999+
Replaces the aggregate spec with the per-event-time × ring decomposition from Butts Section 5 / Table 2. Direct effects `tau_k` and per-(ring, event-time) spillover effects `delta_jk` are emitted in `att_dynamic` and a MultiIndex `spillover_effects`. A TwoStageDiD-compatible `event_study_effects: Dict[int, Dict]` alias (mirroring `two_stage.py:1355-1389` schema with `conf_int = (low, high)` tuple) is also emitted for `plot_event_study` consumption `_extract_plot_data` prefers the new `reference_period` attribute over the legacy `n_obs==0` heuristic. (`DiagnosticReport` routing is NOT yet wired; queued as a follow-up.)
30003000

30013001
**Note (two-clock K_it):** Butts Section 5 uses one symbol `K_it`, but operationally there are TWO event-time clocks. The **direct-effect clock** is `K_direct_{it} = t - effective_first_treat(i)` for ever-treated unit rows (NaN for never-treated). The **spillover-exposure clock** is `K_spill_{it} = t - min{ effective_first_treat(j) : d(i, j) ≤ d_bar AND effective_first_treat(j) ≤ t }` — time since `i` first became exposed to ANY treated neighbor (running min across activated cohorts). `K_spill` is structurally non-negative (pre-trigger rows are NaN and contribute to stage 1 only); spillover placebos at `k < 0` are therefore NOT meaningful and rectangular emission of those cells uses NaN coef + `n_obs = 0`. The trigger cohort for unit `i` is the earliest activated cohort whose treated members fall within `d_bar` of `i` — NOT necessarily the geographically nearest cohort.
30023002

0 commit comments

Comments
 (0)