Guard non-finite original_effect in compute_effect_bootstrap_stats

igerber · claude · igerber · commit 5449bbb361c8 · 2026-02-22T09:40:33.000-05:00
Return all-NaN inference when the point estimate is NaN/Inf, so that a
valid bootstrap distribution cannot produce a finite SE, CI, or p-value
for an estimate that is itself undefined. Adds parametrized regression
tests and updates the REGISTRY.md bootstrap notes in both the
CallawaySantAnna and SunAbraham sections.
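
Illustration only: a minimal sketch of the guard's effect, assuming a
simplified stand-in for the real function. The name
bootstrap_stats_sketch, the alpha parameter, the percentile CI, and the
p-value formula are assumptions made for this example; only the early
return mirrors the change in this commit.

```python
import numpy as np


def bootstrap_stats_sketch(original_effect, boot_dist, alpha=0.05):
    """Hypothetical stand-in, not the library's implementation."""
    # Guard added by this commit: an undefined point estimate yields
    # all-NaN inference, whatever the bootstrap distribution looks like.
    if not np.isfinite(original_effect):
        return np.nan, (np.nan, np.nan), np.nan

    # Everything below is an assumed, simplified remainder.
    valid = boot_dist[np.isfinite(boot_dist)]
    se = float(np.std(valid, ddof=1))
    lower, upper = np.percentile(
        valid, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    # Two-sided percentile p-value, floored at 1 / (n_valid + 1).
    tail = min(np.mean(valid <= 0), np.mean(valid >= 0))
    p_value = float(min(1.0, max(2 * tail, 1 / (len(valid) + 1))))
    return se, (float(lower), float(upper)), p_value


# Without the guard, this call would report a finite SE and CI taken
# from boot_dist even though the estimate itself is NaN.
se, ci, p_value = bootstrap_stats_sketch(np.nan, np.arange(100.0))
```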

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/diff_diff/bootstrap_utils.py b/diff_diff/bootstrap_utils.py
@@ -234,6 +234,9 @@ def compute_effect_bootstrap_stats(
     p_value : float
         Bootstrap p-value.
     """
+    if not np.isfinite(original_effect):
+        return np.nan, (np.nan, np.nan), np.nan
+
     finite_mask = np.isfinite(boot_dist)
     n_valid = np.sum(finite_mask)
     n_total = len(boot_dist)
diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md
@@ -347,7 +347,7 @@ The multiplier bootstrap uses random weights w_i with E[w]=0 and Var(w)=1:
   - Parameter: `rank_deficient_action` controls behavior: "warn" (default), "error", or "silent"
 - Non-finite inference values:
   - Analytic SE: Returns NaN to signal invalid inference (not biased via zeroing)
-  - Bootstrap: Drops non-finite samples, warns, and adjusts p-value floor accordingly. SE, CI, and p-value are all NaN if SE is non-finite or zero (e.g., n_valid=1 with ddof=1, or identical samples)
+  - Bootstrap: Drops non-finite samples, warns, and adjusts p-value floor accordingly. SE, CI, and p-value are all NaN if the original point estimate is non-finite, SE is non-finite or zero (e.g., n_valid=1 with ddof=1, or identical samples)
   - Threshold: Returns NaN if <50% of bootstrap samples are valid
   - Per-effect t_stat: Uses NaN (not 0.0) when SE is non-finite or zero (consistent with overall_t_stat)
   - **Note**: This is a defensive enhancement over reference implementations (R's `did::att_gt`, Stata's `csdid`) which may error or produce unhandled inf/nan in edge cases without informative warnings
@@ -488,7 +488,7 @@ where weights ŵ_{g,e} = n_{g,e} / Σ_g n_{g,e} (sample share of cohort g at eve
 - NaN inference for undefined statistics:
   - t_stat: Uses NaN (not 0.0) when SE is non-finite or zero
   - Analytical inference: p_value and CI also NaN when t_stat is NaN (NaN propagates through `compute_p_value` and `compute_confidence_interval`)
-  - Bootstrap inference: p_value and CI computed from bootstrap distribution. SE, CI, and p-value are all NaN if SE is non-finite or zero, or if <50% of bootstrap samples are valid
+  - Bootstrap inference: p_value and CI computed from bootstrap distribution. SE, CI, and p-value are all NaN if the original point estimate is non-finite, SE is non-finite or zero, or if <50% of bootstrap samples are valid
   - Applies to overall ATT, per-effect event study, and aggregated event study
   - **Note**: Defensive enhancement matching CallawaySantAnna behavior; R's `fixest::sunab()` may produce Inf/NaN without warning
 - Inference distribution:
diff --git a/tests/test_bootstrap_utils.py b/tests/test_bootstrap_utils.py
@@ -57,6 +57,17 @@ def test_bootstrap_stats_mostly_valid_but_identical(self):
         assert np.isnan(ci[1])
         assert np.isnan(p_value)
 
+    @pytest.mark.parametrize("bad_value", [np.nan, np.inf, -np.inf])
+    def test_nonfinite_original_effect_with_finite_boot_dist(self, bad_value):
+        """Non-finite original_effect must return all-NaN even with finite boot_dist."""
+        boot_dist = np.arange(100.0)
+        se, ci, p_value = compute_effect_bootstrap_stats(
+            original_effect=bad_value, boot_dist=boot_dist
+        )
+        assert np.isnan(se)
+        assert np.isnan(ci[0]) and np.isnan(ci[1])
+        assert np.isnan(p_value)
+
     def test_bootstrap_stats_normal_case(self):
         """Normal case with varied values: all fields finite."""
         boot_dist = np.arange(100.0)