Skip to content

Commit 4d02223

Browse files
igerberclaude
andcommitted
Clean up completed items from TODO.md and ROADMAP.md
Remove strikethrough/done sections from TODO.md (NaN handling, safe_inference migration, 10 fixed tech debt rows, Fix Plan subsection, 5 completed module splits). Update ROADMAP.md version to v2.4.1, remove shipped Two-Stage DiD section, bump Near-Term milestone to v2.5. Fix factual error in CHANGELOG.md line 16 (was backwards: actual change was @ -> np.dot, not np.dot -> @). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent b1e0237 commit 4d02223

3 files changed

Lines changed: 4 additions & 43 deletions

File tree

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
1313
### Changed
1414
- Module splits for large files: ImputationDiD, TwoStageDiD, and TROP each split into separate results and bootstrap submodules
1515
- Migrated remaining inline inference computations to `safe_inference()` utility
16-
- Replaced `np.dot()` calls with `@` operator across codebase
16+
- Replaced `@` operator with `np.dot()` at observation-dimension sites to avoid Apple M4 BLAS warnings
1717
- Updated TODO.md and ROADMAP.md for accuracy post-v2.4.0
1818

1919
### Fixed

ROADMAP.md

Lines changed: 2 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ For past changes and release history, see [CHANGELOG.md](CHANGELOG.md).
88

99
## Current Status
1010

11-
diff-diff v2.4.0 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis:
11+
diff-diff v2.4.1 is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` + `synthdid` ecosystem for core DiD analysis:
1212

1313
- **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Borusyak-Jaravel-Spiess Imputation, Synthetic DiD, Triple Difference (DDD), TROP, Two-Stage DiD (Gardner 2022)
1414
- **Valid inference**: Robust SEs, cluster SEs, wild bootstrap, multiplier bootstrap, placebo-based variance
@@ -20,20 +20,10 @@ diff-diff v2.4.0 is a **production-ready** DiD library with feature parity with
2020

2121
---
2222

23-
## Near-Term Enhancements (v2.4)
23+
## Near-Term Enhancements (v2.5)
2424

2525
High-value additions building on our existing foundation.
2626

27-
### Gardner's Two-Stage DiD (did2s) -- IMPLEMENTED (v2.4)
28-
29-
Two-stage approach gaining traction in applied work. First residualizes outcomes, then estimates effects.
30-
31-
- Stage 1: Estimate unit and time FEs using only untreated observations
32-
- Stage 2: Regress residualized outcomes on treatment indicators
33-
- Clean separation of identification and estimation
34-
35-
**Reference**: Gardner (2022). *Working Paper*.
36-
3727
### Stacked Difference-in-Differences
3828

3929
An intuitive approach that explicitly constructs sub-experiments for each treatment cohort, avoiding forbidden comparisons.

TODO.md

Lines changed: 1 addition & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -23,11 +23,6 @@ Target: < 1000 lines per module for maintainability.
2323

2424
| File | Lines | Action |
2525
|------|-------|--------|
26-
| ~~`staggered.py`~~ | ~~2301~~ 1120 | ✅ Split into staggered.py, staggered_bootstrap.py, staggered_aggregation.py, staggered_results.py |
27-
| ~~`prep.py`~~ | ~~1993~~ 1241 | ✅ Split: DGP functions moved to `prep_dgp.py` (777 lines) |
28-
| ~~`trop.py`~~ | ~~2904~~ ~2560 | ✅ Partially split: results extracted to `trop_results.py` (~340 lines) |
29-
| ~~`imputation.py`~~ | ~~2480~~ ~1740 | ✅ Split into imputation.py, imputation_results.py, imputation_bootstrap.py |
30-
| ~~`two_stage.py`~~ | ~~2209~~ ~1490 | ✅ Split into two_stage.py, two_stage_results.py, two_stage_bootstrap.py |
3126
| `utils.py` | 1780 | Monitor -- legacy placebo function removed |
3227
| `visualization.py` | 1678 | Monitor -- growing but cohesive |
3328
| `linalg.py` | 1537 | Monitor -- unified backend, splitting would hurt cohesion |
@@ -38,16 +33,6 @@ Target: < 1000 lines per module for maintainability.
3833
| `estimators.py` | 1161 | Acceptable |
3934
| `pretrends.py` | 1104 | Acceptable |
4035

41-
### ~~NaN Handling for Undefined t-statistics~~ -- DONE
42-
43-
All 7 t_stat locations fixed (diagnostics.py, sun_abraham.py, triple_diff.py) -- all now use `np.nan` or `np.isfinite()` guards. Fixed in PR #118 and follow-up PRs.
44-
45-
~~**Remaining nuance**: `diagnostics.py:785` SE = 0.0~~ — ✅ Fixed: SE now returns `np.nan` when undefined, and all downstream inference uses `safe_inference()`.
46-
47-
### ~~Migrate Existing Inference Call Sites to `safe_inference()`~~ -- DONE
48-
49-
✅ All ~32 inline inference call sites migrated to `safe_inference()` across 11 source files: `estimators.py`, `sun_abraham.py`, `staggered.py`, `staggered_aggregation.py`, `triple_diff.py`, `imputation.py`, `two_stage.py`, `diagnostics.py`, `synthetic_did.py`, `trop.py`, `utils.py`. Two sites left as-is with comments: `diagnostics.py:665` (permutation-based p_value) and `linalg.py:1310` (deliberately uses ±inf for zero-SE).
50-
5136
---
5237

5338
### Tech Debt from Code Reviews
@@ -58,31 +43,21 @@ Deferred items from PR reviews that were not addressed before merge.
5843

5944
| Issue | Location | PR | Priority |
6045
|-------|----------|----|----------|
61-
| ~~TwoStageDiD & ImputationDiD bootstrap hardcodes Rademacher only; no `bootstrap_weights` parameter unlike CallawaySantAnna~~ | ~~`two_stage_bootstrap.py`, `imputation_bootstrap.py`~~ | ~~#156, #141~~ | ✅ Fixed: Added `bootstrap_weights` parameter to both estimators |
62-
| ~~TwoStageDiD GMM score logic duplicated between analytic/bootstrap with inconsistent NaN/overflow handling~~ | ~~`two_stage.py`, `two_stage_bootstrap.py`~~ | ~~#156~~ | ✅ Fixed: Unified via `_compute_gmm_scores()` static method |
63-
| ~~ImputationDiD weight construction duplicated between aggregation and bootstrap (drift risk)~~ | ~~`imputation.py`, `imputation_bootstrap.py`~~ | ~~#141~~ | ✅ Fixed: Extracted `_compute_target_weights()` helper in `imputation_bootstrap.py` |
6446
| ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails; fixing requires sparse least-squares alternatives) |
6547

6648
#### Performance
6749

6850
| Issue | Location | PR | Priority |
6951
|-------|----------|----|----------|
70-
| ~~TwoStageDiD per-column `.toarray()` in loop for cluster scores~~ | ~~`two_stage_bootstrap.py`~~ | ~~#156~~ | ✅ Fixed: Single `.toarray()` call replaces per-column loop |
7152
| ImputationDiD event-study SEs recompute full conservative variance per horizon (should cache A0/A1 factorization) | `imputation.py` | #141 | Low |
72-
| ~~Legacy `compute_placebo_effects` uses deprecated projected-gradient weights~~ | ~~`utils.py:1689-1691`~~ | ~~#145~~ | ✅ Fixed: Removed function entirely |
7353
| Rust faer SVD ndarray-to-faer conversion overhead (minimal vs SVD cost) | `rust/src/linalg.rs:67` | #115 | Low |
7454

7555
#### Testing/Docs
7656

7757
| Issue | Location | PR | Priority |
7858
|-------|----------|----|----------|
7959
| Tutorial notebooks not executed in CI | `docs/tutorials/*.ipynb` | #159 | Low |
80-
| ~~TwoStageDiD `test_nan_propagation` is a no-op~~ | ~~`tests/test_two_stage.py:643-652`~~ | ~~#156~~ | ✅ Fixed |
81-
| ~~ImputationDiD bootstrap + covariate path untested~~ | ~~`tests/test_imputation.py`~~ | ~~#141~~ | ✅ Fixed: Added `test_bootstrap_with_covariates` |
82-
| ~~TROP `n_bootstrap >= 2` validation missing (can yield 0/NaN SE silently)~~ | ~~`trop.py:462`~~ | ~~#124~~ | ✅ Fixed: Added `ValueError` for `n_bootstrap < 2` |
83-
| ~~SunAbraham deprecated `min_pre_periods`/`min_post_periods` still in `fit()` docstring~~ | ~~`sun_abraham.py:458-487`~~ | ~~#153~~ | ✅ Fixed: Removed deprecated params from `fit()` |
8460
| R comparison tests spawn separate `Rscript` per test (slow CI) | `tests/test_methodology_twfe.py:294` | #139 | Low |
85-
| ~~Rust TROP bootstrap SE returns 0.0 instead of NaN for <2 samples~~ | ~~`rust/src/trop.rs:1038-1054`~~ | ~~#115~~ | ✅ Already fixed: Returns `f64::NAN` at `rust/src/trop.rs:1034` |
8661

8762
---
8863

@@ -171,11 +146,7 @@ Spurious RuntimeWarnings ("divide by zero", "overflow", "invalid value") are emi
171146
- Occurs in IPW and DR estimation methods with covariates
172147
- Related to logistic regression overflow in edge cases (separate from BLAS bug)
173148

174-
### ~~Fix Plan (follow-up PR)~~ -- DONE
175-
176-
✅ Replaced `@` operator with `np.dot()` at all 19 affected call sites across 6 files: `linalg.py` (5), `staggered.py` (5), `triple_diff.py` (3), `utils.py` (1), `imputation.py` (4), `trop.py` (1). Regression test added in `test_linalg.py::TestNoDotRuntimeWarnings`.
177-
178-
**Long-term:** Revert to `@` operator when numpy ≥ 2.3 becomes the minimum supported version.
149+
- **Long-term:** Revert to `@` operator when numpy ≥ 2.3 becomes the minimum supported version.
179150

180151
---
181152

0 commit comments

Comments
 (0)