utils: fix wild_bootstrap NaN propagation on rank-deficient designs
CI review (R5) identified a P1 bug in wild_bootstrap_se() that was
newly reachable via the TWFE HC2/HC2-BM full-dummy path:
Before this fix, wild_bootstrap_se built each draw's pseudo-outcome
as `y_star = X @ beta_restricted`. When solve_ols dropped a rank-
deficient nuisance column (e.g. a time-invariant covariate collinear
with the unit FE on the full-dummy design), beta_restricted contained
NaN on the dropped slot, and X @ beta_restricted propagated NaN
through every observation. The ATT was analytically identified but
the bootstrap crashed because y_star was all-NaN.
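The propagation described above can be reproduced with plain NumPy. This
is a minimal sketch, not the library code: the toy matrix and
coefficient values are made up for illustration, and only the name
beta_restricted comes from this commit.

```python
import numpy as np

# Toy design: intercept, one regressor, and one nuisance column that the
# solver is assumed to have dropped (illustrative values only).
X = np.array([
    [1.0, 0.0, 2.0],
    [1.0, 1.0, 2.0],
    [1.0, 0.0, 5.0],
    [1.0, 1.0, 5.0],
])

# NaN marks the dropped column's coefficient slot.
beta_restricted = np.array([0.5, 1.5, np.nan])

# Every row's dot product includes a term multiplied by NaN, so every
# fitted value becomes NaN, even though the other coefficients are fine.
fitted_naive = X @ beta_restricted
all_nan = bool(np.isnan(fitted_naive).all())
print(fitted_naive, all_nan)
```

Note that a zero entry in the dropped column would not help, because
0 * NaN is still NaN under IEEE 754 arithmetic.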
Pre-PR this was unreachable on TWFE (the within-transform absorbed
time-invariant covariates before they entered X), but the new full-
dummy HC2/HC2-BM branch keeps unit/time dummies explicit alongside
covariates, exposing the bug.
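To illustrate why the full-dummy design is rank-deficient while the
within-transform is not affected, here is a small NumPy sketch. The
panel dimensions and covariate values are assumptions chosen for the
example; only x_invariant is a name used in this commit.

```python
import numpy as np

n_units, n_periods = 4, 3
unit = np.repeat(np.arange(n_units), n_periods)    # 0,0,0,1,1,1,...
period = np.tile(np.arange(n_periods), n_units)    # 0,1,2,0,1,2,...

# Full-dummy design: intercept plus unit and time dummies, with the
# first level of each dropped as the reference category.
intercept = np.ones((unit.size, 1))
unit_dummies = (unit[:, None] == np.arange(1, n_units)).astype(float)
time_dummies = (period[:, None] == np.arange(1, n_periods)).astype(float)

# Time-invariant covariate: one value per unit, repeated every period.
x_by_unit = np.array([0.3, -1.2, 2.0, 0.7])
x_invariant = x_by_unit[unit][:, None]

X_fe = np.hstack([intercept, unit_dummies, time_dummies])
X_full = np.hstack([X_fe, x_invariant])

# x_invariant is a linear combination of the intercept and unit dummies,
# so appending it adds a column but not rank.
rank_fe = np.linalg.matrix_rank(X_fe)
rank_full = np.linalg.matrix_rank(X_full)
print(X_fe.shape[1], rank_fe, X_full.shape[1], rank_full)

# Within-transform: demeaning by unit wipes the covariate out entirely,
# which is why it never reached X on the old path.
unit_means = np.array([x_invariant[unit == u, 0].mean() for u in range(n_units)])
x_within = x_invariant[:, 0] - unit_means[unit]
print(np.allclose(x_within, 0.0))
```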
Two fixes in wild_bootstrap_se (diff_diff/utils.py):
1. Use solve_ols(return_fitted=True) to get NaN-safe fitted values
from the kept columns; build y_star = fitted_restricted +
residuals_restricted * obs_weights instead of X @ beta_restricted.
fitted_restricted is computed from the kept columns by solve_ols,
so dropped nuisance NaN doesn't propagate.
2. Replace bootstrap_t_stats[b] = 0.0 fallback for singular draws
with np.nan + a finite_mask filter at the p-value step. Setting
t* = 0 biased the p-value downward: |0| < |t_original| counts as a
non-exceedance, so each singular draw inflated the denominator
without adding to the numerator. Those draws are invalid, not
non-exceedances.
The same nan-safe filter applies to bootstrap_coefs for the SE
and percentile CI.
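Both fixes can be sketched in a few lines of NumPy. This is an
illustration under stated assumptions, not the code in
diff_diff/utils.py: fit_kept_columns and bootstrap_p_value are
hypothetical helpers standing in for solve_ols(return_fitted=True) and
the p-value step, the set of kept columns is hard-coded, and the data
are simulated.

```python
import numpy as np

def fit_kept_columns(X, y, kept):
    """Hypothetical stand-in for a rank-aware solver: NaN coefficients
    for dropped columns, fitted values from the kept columns only."""
    beta = np.full(X.shape[1], np.nan)
    beta_kept, *_ = np.linalg.lstsq(X[:, kept], y, rcond=None)
    beta[kept] = beta_kept
    fitted = X[:, kept] @ beta_kept
    return beta, fitted

rng = np.random.default_rng(0)
n = 40
x1 = rng.normal(size=n)
# Restricted design (treatment column excluded under the null); the
# third column is collinear with the intercept and treated as dropped.
X_restricted = np.column_stack([np.ones(n), x1, 2.0 * np.ones(n)])
y = 1.0 + 0.5 * x1 + rng.normal(size=n)
kept = np.array([0, 1])

beta_restricted, fitted_restricted = fit_kept_columns(X_restricted, y, kept)
residuals_restricted = y - fitted_restricted

# Fix 1: build the pseudo-outcome from NaN-safe fitted values rather
# than from X @ beta_restricted, which would be all-NaN here.
obs_weights = rng.choice([-1.0, 1.0], size=n)  # Rademacher weights (assumed)
y_star = fitted_restricted + residuals_restricted * obs_weights

# Fix 2: singular draws are recorded as NaN and filtered out, so they
# do not enter either the numerator or the denominator.
def bootstrap_p_value(bootstrap_t_stats, t_original):
    finite_mask = np.isfinite(bootstrap_t_stats)
    if not finite_mask.any():
        return float("nan")
    valid = np.abs(bootstrap_t_stats[finite_mask])
    return float(np.mean(valid >= abs(t_original)))

t_star = np.array([2.5, -3.0, 0.4, np.nan, np.nan])  # two singular draws
p_filtered = bootstrap_p_value(t_star, t_original=2.0)
p_zero_fallback = bootstrap_p_value(np.nan_to_num(t_star, nan=0.0), t_original=2.0)
print(p_filtered, p_zero_fallback)
```

With the old zero fallback the two singular draws count as
non-exceedances, so the p-value is 2/5; with the filter it is 2/3,
computed over the three valid draws only.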
New regression test
`test_twfe_hc2_wild_bootstrap_survives_rank_deficient_full_dummy`
fits TwoWayFixedEffects(vcov_type='hc2', inference='wild_bootstrap',
covariates=['x_invariant']) on a panel where x_invariant is time-
invariant (collinear with unit FE on the full-dummy design); asserts
finite ATT, SE, p-value, and CI. Pre-fix this test crashed with
all-NaN y_star.
No regression in the existing 53 wild_bootstrap tests across
test_wild_bootstrap, test_methodology_did, test_methodology_twfe,
test_conley_vcov, test_estimators_vcov_type, test_business_report,
test_replicate_weight_expansion, test_survey.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>