docs: add retrospective paper reviews for TROP and Wooldridge ETWFE #443
Both reviews follow the existing template under docs/methodology/papers/ and back already-shipped estimators (diff_diff/trop.py, diff_diff/wooldridge.py).

- athey-2025-review.md: Athey, Imbens, Qu, Viviano (2025) "Triply Robust Panel Estimators" (arXiv:2508.21536)
- wooldridge-2023-review.md: Wooldridge (2023) "Simple approaches to nonlinear difference-in-differences with panel data" (Econometrics Journal 26(3), doi:10.1093/ectj/utad016)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Overall Assessment
✅ Looks good
This is a docs-only PR, so there are no unmitigated P0/P1 findings. The main issues are P2/P3 documentation accuracy problems where the new paper-review files drift from the shipped registry/code.
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
athey-2025-review.md:
- Replace contributor-local absolute PDF path with arXiv URL
- Note shipped TROP requires absorbing treatment (paper Eq. 13 generalization is out of scope for the current implementation)
- Rename "twostep"/"joint" to "local"/"global" and correct the global-method description to residual-based treated-cell effects averaged into ATT

wooldridge-2023-review.md:
- Split delta_2 interpretation by link function (exponential = log diff / proportional effect; logistic = change in log-odds)
- Update control_group default to "not_yet_treated" (matches wooldridge.py:305)
- Update implementation note: solve_poisson exists at linalg.py:3431 and is used in the Poisson path
- Add aggregation deviation note linking to REGISTRY and TODO entries

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
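The link-function split for delta_2 can be illustrated numerically. This is a minimal sketch with a made-up coefficient; the helper names are hypothetical and nothing here is taken from wooldridge.py. Under an exponential mean the coefficient is a difference in logs, so exp(delta) - 1 is the proportional effect. Under a logistic mean the same coefficient shifts the log-odds, and the implied change in probability depends on the baseline.

```python
import math


def proportional_effect(delta):
    """Exponential mean: the ratio of treated to untreated means is
    exp(delta), so the proportional change is exp(delta) - 1."""
    return math.exp(delta) - 1.0


def logistic_probability_change(baseline_index, delta):
    """Logistic mean: delta is a shift in log-odds; the change in the
    probability depends on where the baseline index sits."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    return sigmoid(baseline_index + delta) - sigmoid(baseline_index)


delta = 0.10  # hypothetical coefficient, for illustration only

effect_exp = proportional_effect(delta)
change_at_half = logistic_probability_change(0.0, delta)  # baseline p = 0.5
change_at_tail = logistic_probability_change(3.0, delta)  # baseline p near 0.95
```

The same log-odds shift moves the probability less at a baseline near 0.95 than at 0.5, which is why the logistic case is not described as a proportional effect.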
/ai-review
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Executive Summary
athey-2025-review.md:
- Correct "7 real datasets" to "6 real datasets / 7 simulation applications" (CPS is used for both logwage and urate outcomes; paper Table 1 / Section 3 names 6 source datasets)
- Rewrite Equation 13 nuclear-norm gap note as a neutral source-based check (remove authoring artifact)

wooldridge-2023-review.md:
- Surface shipped covariate API (exovar / xtvar / xgvar, incl. time-varying via xtvar with demean_covariates default) in Data Structure Requirements and Tuning Parameters table; cross-link to wooldridge.py:394-411 and REGISTRY.md "Covariates"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Executive Summary
wooldridge-2023-review.md:
- Aggregation note: stop attributing "Eqs. 7.2-7.4" to the 2023 paper (the 2023 paper describes aggregation only conceptually in Section 3.1; the formal cohort-share equations are from W2025 per REGISTRY.md)
- Implementation Notes: separate paper notation from shipped API. Users provide cohort/first_treat; W_it is constructed internally from cohort + time via _build_interaction_matrix (wooldridge.py:165-189), not passed as a column
- Standard errors: add shipped-API restriction note: n_bootstrap > 0 is OLS-only (wooldridge.py:432-437) and rejected with survey_design (wooldridge.py:441-444)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
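To make the "W_it is constructed internally from cohort + time" point concrete, here is a minimal sketch of how cohort-by-time treatment indicators can be derived from a cohort column and a time column. The function name, the coding of never-treated units as cohort 0, and the toy panel are all assumptions for illustration; this is not the body of _build_interaction_matrix.

```python
def build_cohort_time_indicators(cohort, time, never_treated=0):
    """Return {(g, t): column} where column[k] is 1 when row k belongs to
    cohort g, is observed in period t, and t >= g (a treated cell)."""
    treated_cells = sorted(
        {(g, t) for g, t in zip(cohort, time) if g != never_treated and t >= g}
    )
    return {
        (g, t): [1 if (c == g and s == t) else 0 for c, s in zip(cohort, time)]
        for (g, t) in treated_cells
    }


# Hypothetical long-format panel: unit A is first treated in period 2,
# unit B is never treated (assumed to be coded as cohort 0).
cohort = [2, 2, 2, 0, 0, 0]
time = [1, 2, 3, 1, 2, 3]

columns = build_cohort_time_indicators(cohort, time)

# The paper's W_it is recovered as the row-wise sum of the indicators.
w_it = [sum(col[k] for col in columns.values()) for k in range(len(time))]
```

In this sketch the user supplies only cohort and time; the treatment indicator never has to exist as an input column.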
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Executive Summary
athey-2025-review.md:
- Relabel balanced-panel bullet as "Paper assumption" and add an adjacent shipped-implementation note that diff_diff/trop.py supports unbalanced panels with structural gaps (matches the later "Unbalanced panels" entry under Gaps and Uncertainties and the corresponding REGISTRY section)

wooldridge-2023-review.md:
- Treatment exit (Section 7.2) bullet: mark as extension and carry over the paper's additional restriction that future shocks to untreated potential outcomes cannot drive exit
- Multiple treatment levels (Section 7.4) bullet: mark as extension; note the paper describes it as relatively straightforward but not fully general, leaving the precise multi-level estimand to future work

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
Executive Summary
athey-2025-review.md:
- Reframe Methodology Registry Entry intro from "copy into REGISTRY" ready-to-promote to a working-draft framing that explicitly defers promotion until two source-ambiguity items (weight normalization, Eq. 13 penalty form) are resolved against the source
- Pull the weight-normalization line out of the Requirements Checklist (it was framed as a settled requirement); restate it as an open source-ambiguity cross-referencing Gap #5, with the current shipped implementation pinned to the Equation 2 (unnormalized) interpretation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Executive Summary
athey-2025-review.md:
- Mark Equation 13 nuclear-norm penalty as resolved (paper text confirms the same unsquared form as Equation 2)
- Tighten draft-framing intro: weight normalization is the only remaining open source-ambiguity item

wooldridge-2023-review.md:
- Replace nonexistent local PDF path with the Econometrics Journal DOI URL so the provenance trail is reproducible from the repo

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Executive Summary
athey-2025-review.md:
- Replace generic arXiv abstract URL with a version-pinned v2 link so the reviewed artifact resolves to v2 (the current arXiv record now resolves to v3)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment
✅ Looks good
Executive Summary
Closes the Athey, Imbens, Qu & Viviano (2025) Triply Robust Panel Estimators (arXiv:2508.21536) primary-source review on the methodology tracker.

PR-A (paper review on file at docs/methodology/papers/athey-2025-review.md) was previously merged as igerber#443; this PR is the F.L.I.P. consolidation: new tests/test_methodology_trop.py with a paper-equation-numbered Verified Components walk-through (10 classes, 36 tests) covering:
- Eq. 2 nuclear-norm prox / FISTA / weighted-prox
- Eq. 3 unit + time weights
- Eqs. 4-5 + Algorithm 1 LOOCV with two-stage cycling
- Corollary 1 three-condition unbiasedness
- Theorem 5.1 MC-ranking realisation
- Section 2.2 DID + MC reductions
- Eq. 13 + Algorithm 2 per-(i, t) estimation
- Algorithm 3 stratified pairs bootstrap
- Section 3 / Eq. 6 factor-DGP recovery
- a TestTROPDeviations class locking 11 documented library deviations

Migrated from tests/test_trop.py:
- TestMethodologyVerification (5 tests -> TestTROPEquation6FactorDGPRecovery)
- four paper-conformance tests + one weighted-solver convergence test from TestPaperConformanceFixes (distributed across the new equation-numbered classes)
- three prox / FISTA / weighted-objective tests from TestTROPNuclearNormSolver (-> TestTROPNuclearNormProx)
- a cycling-convergence test from TestCyclingSearch and the factor-DGP smoke from TestTROPvsSDID

The TestPaperConformanceFixes and TestTROPvsSDID shells are deleted; TestTROPNuclearNormSolver retains its single defensive test_zero_weights_no_division_error.

METHODOLOGY_REVIEW.md TROP row promoted In Progress -> Complete (paper method="local") with merge date 2026-05-24, full Verified Components / Test Coverage / Deviations / Outstanding Concerns / R Parity structure mirroring HAD (PR igerber#473), ContinuousDiD (PR igerber#476), DCDH (PR igerber#481), WooldridgeDiD (PR igerber#486).
The methodology promotion is scoped to the paper-aligned method="local" path (paper Algorithm 2); method="global" is a library-side efficiency adaptation per REGISTRY and stays defensively covered in tests/test_trop.py::TestTROPGlobalMethod.

Documented deviations:
- Gap #5 (unnormalised weights match Eq. 2, not Section 5 sum-to-one): locked by a direct kernel-weight inspection test against TROP._compute_observation_weights
- Gap #9 (control / pre-treatment cell drops supported beyond paper's balanced-panel assumption)
- rank selection is implicit via nuclear-norm soft-thresholding (no discrete rank_selection constructor parameter; this corrects an earlier REGISTRY overclaim that listed cv / ic / elbow methods)
- lambda_nn=inf -> 1e10 internal sentinel with original-value storage on results

Outstanding Concerns (deferred):
- Equation 14 covariate extension (TROP.fit() has no `covariates` kwarg; non-support locked by TestTROPDeviations::test_covariates_not_supported via inspect.signature to guard against future **kwargs) and Theorem 8.1, deferred until use cases motivate
- SC / SDID reductions are paper-claimed under "specific (omega, theta) weight choices" that are not provided in the paper text; cross-language anchor deferred until paper-author code clarifies the weight map
- Eq. 10 direct numerical reconstruction deferred; it requires exposing the internal per-(i, t) theta / omega weight vectors
- R parity deferred ("forthcoming" per the paper)

Methodology sign-off scope: paper-aligned identification ingredients (Eq. 2 prox + Eq. 3 weights + Eqs. 4-5 LOOCV + Algorithms 1-3 + Corollary 1 single-draw sanity checks + Eq. 6 simulation recovery + DID reduction + documented deviations) are directly locked. Theorem 5.1 is verified as a simulation sanity check (TROP RMSE < DID RMSE under LOOCV-tuned weights), NOT a direct fixed-weight conditional-bias-bound lock.
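The inspect.signature guard described above can be sketched as follows. Both estimator classes and the accepts_keyword helper are hypothetical stand-ins written for this illustration; they are not the TROP class or the body of test_covariates_not_supported. The point is that a plain "is the name in the signature" check is not enough, because a later **kwargs catch-all would silently accept the argument.

```python
import inspect


class StrictEstimator:
    """Hypothetical estimator whose fit() declares every argument."""

    def fit(self, data, outcome, treatment, unit, time):
        return self


class LooseEstimator:
    """Hypothetical estimator whose fit() swallows unknown keywords."""

    def fit(self, data, outcome, **kwargs):
        return self


def accepts_keyword(func, name):
    """True if `name` is a declared parameter of `func`, or if a
    **kwargs catch-all would accept it without raising."""
    params = inspect.signature(func).parameters
    has_var_keyword = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    return name in params or has_var_keyword


strict_accepts = accepts_keyword(StrictEstimator.fit, "covariates")
loose_accepts = accepts_keyword(LooseEstimator.fit, "covariates")
```

A test built this way fails as soon as someone adds either a `covariates` parameter or a **kwargs catch-all, which forces the non-support note to be revisited.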
The Matrix Completion reduction is verified as code-path activation (effective_rank > 0 + beats DID baseline), NOT equivalence against an independent MC reference. Plain (non-accelerated) prox-gradient objective monotonicity is locked; the shipped accelerated FISTA outer loop does NOT guarantee per-step monotonicity (Nesterov momentum gives O(1/k^2) but not monotonicity) and is not directly tested.

REGISTRY.md ## TROP section gains a Verified Components expansion: 10 ticked requirements + four **Note:** / **Note (paper resolution):** / **Note (deferral):** annotations consolidating the deviation surface.

No source-code changes to diff_diff/trop*.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
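The monotonicity claim for plain prox-gradient can be checked on a toy problem. The sketch below is not the shipped solver: the objective (elementwise-weighted squared error plus an unsquared nuclear-norm penalty), the random data, and all names are assumptions chosen for illustration. It uses singular-value soft-thresholding as the prox step and a step size of 1 / max(weights), the inverse Lipschitz constant of the smooth gradient, which is the standard condition under which non-accelerated prox-gradient does not increase the objective.

```python
import numpy as np


def svt(matrix, threshold):
    """Prox of threshold * nuclear norm: soft-threshold the singular values.
    Returns the shrunk matrix and the number of surviving singular values."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - threshold, 0.0)
    return (u * s_shrunk) @ vt, int(np.count_nonzero(s_shrunk))


def objective(low_rank, y, weights, lam):
    """Weighted squared error plus lam times the nuclear norm."""
    smooth = 0.5 * np.sum(weights * (y - low_rank) ** 2)
    return smooth + lam * np.linalg.svd(low_rank, compute_uv=False).sum()


rng = np.random.default_rng(0)
# Toy rank-2 signal plus noise; purely synthetic.
y = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(8, 6))
weights = rng.uniform(0.2, 1.0, size=y.shape)
lam = 0.5
step = 1.0 / weights.max()  # inverse Lipschitz constant of the smooth part

low_rank = np.zeros_like(y)
history = [objective(low_rank, y, weights, lam)]
for _ in range(50):
    gradient = weights * (low_rank - y)
    low_rank, rank = svt(low_rank - step * gradient, step * lam)
    history.append(objective(low_rank, y, weights, lam))

is_monotone = all(b <= a + 1e-9 for a, b in zip(history, history[1:]))
```

The rank returned by svt is whatever survives the threshold, which is the sense in which rank selection is implicit rather than a discrete parameter. Adding Nesterov momentum to this loop would speed up convergence but would remove the per-step guarantee that is_monotone checks.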
Summary
Adds two paper-review markdown files under `docs/methodology/papers/`, following the existing template. Both reviews are retrospective documentation for estimators already shipped in the library.
Methodology references (required if estimator / math changes)
Validation
Security / privacy