Commit ee5c5c5

igerber and claude committed
PR-B: WooldridgeDiD (ETWFE) tracker promotion + methodology bundle
Closes the WooldridgeDiD (ETWFE) methodology-review-tracker promotion in METHODOLOGY_REVIEW.md (In Progress → Complete), following the primary-source review for Wooldridge (2025) merged in PR-A (#484). Adds two paper-driven implementation surfaces and extends R-parity goldens to the nonlinear paths.

Implementation:
- `aggregate(weights="cohort_share")` on WooldridgeDiDResults implements paper Eqs. 7.4 (simple-overall) and 7.6 (event-time, restricted to k>=0) cohort-share aggregation weights as an opt-in alternative to the default cell-count weighting (matching Stata `jwdid_estat`). Inference fields fail-closed to NaN with UserWarning per paper Section 7.5 conditional-on-shares semantics; raises on `survey_design` (design-consistent totals deferred); raises on `type ∈ {"group","calendar"}` (no paper closed-form); raises on bootstrap fits (no matching bootstrap variant). Closes TODO row 95.
- `cohort_trends=True` on `WooldridgeDiD.__init__` adds linear `dg_i · t` cohort-specific trend interactions (paper Section 8 / Eq. 8.1) for the OLS path. Rejects on logit/poisson per paper Section 8 OLS scope; rejects on survey_design pending full-dummy/TSL validation; enforces per-cohort pre-period identification check (≥ 2 observed pre-periods per treated cohort). Auto-routes to full-dummy mode regardless of vcov_type. Closes the PR-A Requirements Checklist heterogeneous-trends gap.

Tests:
- `tests/test_methodology_wooldridge.py` extended with 6 paper-equation-numbered methodology classes (Theorem 3.1, Proposition 5.1, Section 6 event study, Section 7 aggregation paths, Section 8 heterogeneous trends, Section 10 unbalanced panels) + `TestW2025LibraryDeviations` consolidating 5 surviving deviations. Mirrors the HAD PR #473 precedent.
- Two new R-parity surface classes (`TestWooldridgeParityRPoisson`, `TestWooldridgeParityRLogit`) lock the structural surface against R `etwfe(family=...)` log-link goldens.
- 209 tests total (60 methodology + 149 R-parity + unit regressions).

R Goldens:
- `benchmarks/R/generate_wooldridge_golden.R` extended with Poisson + logit DGPs via R `etwfe`; augmented panel CSV retains the same seed-generated `y_pois` + `y_logit` columns for cross-language reproducibility.
- `benchmarks/R/requirements.R` pins `etwfe >= 0.5.0`.

Tracker promotion:
- METHODOLOGY_REVIEW.md L52 status flip with merge date; detail section L583-605 rewritten to the Verified Components / Test Coverage / Corrections Made / Deviations / Outstanding Concerns template mirroring HAD / ContinuousDiD / DCDH. L27 example re-pointed; priority queue items #7-#10 renumbered to #6-#9.
- REGISTRY.md `## WooldridgeDiD (ETWFE)` extended with `### Deviations from the paper / from R / library extensions` block consolidating 7 surviving deviations + opt-in notes for cohort_share + cohort_trends + survey rejection + bootstrap cohort_share rejection contracts.
- CHANGELOG.md `[Unreleased]` `### Added` documents the new parameters, R-parity extension, and tracker flip.
- `docs/methodology/papers/wooldridge-2025-review.md` Requirements Checklist + Gaps & Uncertainties items 1 + 11 marked `**Status:** Closed in PR-B`.
- `docs/api/wooldridge_etwfe.rst` updated with weighting-scheme notes alongside the existing aggregation table.

Second of two PRs for the WooldridgeDiD methodology-review-tracker promotion. PR-A merged at e416aed (#484).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent e416aed commit ee5c5c5

13 files changed

Lines changed: 2960 additions & 379 deletions
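
The commit message describes a fail-closed contract for `weights="cohort_share"`: some requests are rejected, and accepted ones return NaN inference fields with a `UserWarning`. The sketch below only illustrates that contract shape. The function name `gate_cohort_share`, its arguments, the returned keys, and the exception type for the non-survey rejections are hypothetical and are not taken from the library (the TODO row names `ValueError` only for the `survey_design` case).

```python
import math
import warnings


def gate_cohort_share(agg_type, has_survey_design=False, is_bootstrap=False):
    """Hypothetical gate mirroring the contract in the commit message."""
    if has_survey_design:
        # Design-consistent cohort totals are deferred to a follow-up.
        raise ValueError("cohort_share is not supported with survey_design")
    if is_bootstrap:
        # No matching bootstrap variant exists for cohort-share weights.
        raise ValueError("cohort_share is not supported on bootstrap fits")
    if agg_type in {"group", "calendar"}:
        # The paper gives no closed form for these aggregation types.
        raise ValueError(f"cohort_share has no closed form for type={agg_type!r}")
    # Accepted request: inference is conditional on the shares, so the
    # inference fields fail closed to NaN and the caller is warned.
    warnings.warn("inference fields are NaN under cohort_share weights", UserWarning)
    return {
        "t_stat": math.nan,
        "p_value": math.nan,
        "conf_int": (math.nan, math.nan),
    }
```

Returning NaN rather than a conditional-on-shares standard error keeps a caller from reading the interval as unconditional inference.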

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
Large diffs are not rendered by default.

METHODOLOGY_REVIEW.md

Lines changed: 37 additions & 19 deletions
Original file line number | Diff line number | Diff line change
@@ -24,7 +24,7 @@ A **Complete** entry has a documented review pass against the primary academic s
2424

2525
The catalog grew incrementally over several quarters, so formats vary across the existing Complete entries; the consistent invariant is that someone walked through the implementation against the academic source and captured the result here. New reviews going forward should aim for the fuller structure (Verified Components + Corrections Made + Deviations + dedicated methodology test file) used by the more recent entries.
2626

27-
**In Progress** entries have a REGISTRY.md section and unit-test coverage, but no formal walk-through has been captured here yet. The In Progress band is wide — some entries also have some combination of a paper review (primary or companion), a dedicated methodology test file, and R parity fixtures (e.g., WooldridgeDiD has a companion-paper review for Wooldridge (2023) plus unit tests but no primary-source review for Wooldridge (2025) and no dedicated methodology test file yet); others have only the REGISTRY entry and unit tests (e.g., PowerAnalysis). The "Documentation in place" sub-section enumerates what each entry already has; the "Outstanding for promotion" sub-section enumerates what's still needed to flip it to Complete.
27+
**In Progress** entries have a REGISTRY.md section and unit-test coverage, but no formal walk-through has been captured here yet. The In Progress band is wide — some entries also have some combination of a paper review (primary or companion), a dedicated methodology test file, and R parity fixtures (e.g., TROP has a recent paper review but no methodology test file or cross-language anchor yet); others have only the REGISTRY entry and unit tests (e.g., PowerAnalysis). The "Documentation in place" sub-section enumerates what each entry already has; the "Outstanding for promotion" sub-section enumerates what's still needed to flip it to Complete.
2828

2929
**Not Started** entries have neither a tracker walk-through nor an REGISTRY.md section. This tracker no longer carries any Not Started rows; new estimators are expected to enter as In Progress when their REGISTRY entry lands.
3030

@@ -49,7 +49,7 @@ The catalog grew incrementally over several quarters, so formats vary across the
4949
| StackedDiD | `stacked_did.py` | `stacked-did-weights` (Wing-Freedman-Hollingsworth code) | **Complete** | 2026-02-19 |
5050
| ImputationDiD | `imputation.py` | `didimputation` | **In Progress** ||
5151
| TwoStageDiD | `two_stage.py` | `did2s` | **In Progress** ||
52-
| WooldridgeDiD (ETWFE) | `wooldridge.py` | `etwfe` (R) / `jwdid` (Stata) | **In Progress** | |
52+
| WooldridgeDiD (ETWFE) | `wooldridge.py` | `etwfe` (R) / `jwdid` (Stata) | **Complete** | 2026-05-22 |
5353
| EfficientDiD | `efficient_did.py` | (no canonical R package) | **In Progress** ||
5454

5555
### Continuous & Universal-Treatment Estimators
@@ -587,21 +587,40 @@ and covariate-adjusted specifications.)
587587
|-------|-------|
588588
| Module | `wooldridge.py`, `wooldridge_results.py` |
589589
| Primary Reference | Wooldridge (2025), *Two-way fixed effects, the two-way Mundlak regression, and difference-in-differences estimators*, Empirical Economics 69(5), 2545–2587 |
590+
| Companion Reference | Wooldridge (2023), *Simple approaches to nonlinear difference-in-differences with panel data*, Econometrics Journal 26(3) (nonlinear extensions for logit/Poisson paths) |
590591
| R Reference | `etwfe` (McDermott 2023); Stata `jwdid` (Rios-Avila 2021) |
591-
| Status | **In Progress** |
592-
| Last Review | |
592+
| Status | **Complete** |
593+
| Last Review | 2026-05-22 |
593594

594-
**Documentation in place:**
595-
- REGISTRY.md section: `## WooldridgeDiD (ETWFE)` (saturated cohort×time interactions, OLS/logit/Poisson via IRLS, ASF-based ATT for nonlinear methods with delta-method SEs, four aggregations, survey support)
596-
- **Companion-paper review on file**: `docs/methodology/papers/wooldridge-2023-review.md` covers Wooldridge (2023) *Simple approaches to nonlinear difference-in-differences with panel data*, Econometrics Journal 26(3) — the nonlinear extension that the logit/Poisson paths implement (retrospective, merged PR #443 on 2026-05-13). A dedicated review for the primary ETWFE source (Wooldridge 2025, *Empirical Economics* 69(5)) is **not** yet on file.
597-
- Implementation: `tests/test_wooldridge.py` (covers OLS, logit, and Poisson paths plus the four aggregation types)
595+
**Verified Components:**
596+
- **Theorem 3.1 (Mundlak ≡ TWFE):** equivalence under non-singularity Eq. 3.3 — `tests/test_methodology_wooldridge.py::TestW2025Theorem31MundlakTWFEEquivalence`
597+
- **Proposition 5.1 / 5.2 (Imputation ≡ POLS five-way chain):** `TestW2025Proposition51ImputationPOLSEquivalence`
598+
- **Section 6 / Eqs. 6.1-6.5 event-study:** `TestW2025Section6EventStudy`
599+
- **Section 7 aggregation paths (Eqs. 7.2-7.4 + 7.6):** opt-in `weights="cohort_share"` on `aggregate()` recovers paper Eq. 7.4 simple-overall and Eq. 7.6 event-time hand-calc forms — `TestW2025Section7AggregationPaths`
600+
- **Section 8 / Eq. 8.1 heterogeneous cohort-specific trends:** `cohort_trends=True` adds `dg_i · t` interactions; recovers `tau` under heterogeneous-trends DGP — `TestW2025Section8HeterogeneousTrends`
601+
- **Section 10 unbalanced panels + time-varying covariates (Eq. 10.1-10.6):** `TestW2025Section10UnbalancedPanels`
598602

599-
**Outstanding for promotion:**
600-
- Dedicated paper review for the primary ETWFE source: write `docs/methodology/papers/wooldridge-2025-review.md` covering Wooldridge (2025) *Empirical Economics* 69(5), 2545–2587 (published version of the 2021 SSRN working paper / NBER WP 29154)
601-
- Dedicated `tests/test_methodology_wooldridge.py` with paper-equation-numbered Verified Components walk-through
602-
- R parity fixture against `etwfe` (and ideally Stata `jwdid`) covering OLS, logit, and Poisson paths
603-
- Verified Components for nonlinear-method ASF / delta-method SE invariants
604-
- "Corrections Made" listing
603+
**Test Coverage:**
604+
- `tests/test_methodology_wooldridge.py` — 60 tests across 6 paper-equation-numbered classes + `TestW2025LibraryDeviations` (5 surviving deviations locked) + `TestWooldridgeParityR` (12 vcov_type R-parity tests, OLS path) + `TestWooldridgeParityRPoisson` / `TestWooldridgeParityRLogit` (surface tests, with log-link goldens; numerical R-parity for nonlinear deferred per TODO row)
605+
- `tests/test_wooldridge.py` — 140 unit-level tests covering OLS / logit / Poisson + four aggregations + survey support + vcov_type variants + cluster/bootstrap interactions
606+
- `benchmarks/R/generate_wooldridge_golden.R` — clubSandwich + sandwich + etwfe goldens at `benchmarks/data/wooldridge_golden.json`
607+
608+
**Corrections Made:**
609+
- **PR #484 (PR-A):** Added primary-source review for Wooldridge (2025) at `docs/methodology/papers/wooldridge-2025-review.md` (771 lines). Documented the cohort-share aggregation deviation (Eqs. 7.2-7.4 simple-overall AND Eq. 7.6 event-time) and the Section 8 heterogeneous-trends gap. REGISTRY § Aggregations Note + TODO row 95 extended to cover both paths.
610+
- **PR-B (this PR):** Closed two paper gaps documented in PR-A:
611+
- **Opt-in cohort-share aggregation weighting** via `aggregate(weights="cohort_share")` on `WooldridgeDiDResults` (paper Eq. 7.4 simple-overall + Eq. 7.6 event-time). Default stays `weights="cell"` for `jwdid_estat` back-compat.
612+
- **Heterogeneous cohort trends** via `WooldridgeDiD(cohort_trends=True)` (paper Eq. 8.1; OLS path only; auto-routes to full-dummy mode regardless of `vcov_type` to keep math closure verified against existing R-parity goldens).
613+
- Extended R goldens to include `etwfe(family="poisson")` and `etwfe(family="logit")` log-link coefficients (surface tests in Python; numerical response-scale parity deferred to follow-up).
614+
615+
**Deviations from the paper / from R / library extensions:** See REGISTRY.md `## WooldridgeDiD (ETWFE)` → `### Deviations from the paper / from R / library extensions` block for the consolidated list (HC1 finite-sample factor, QMLE sandwich `(n-1)/(n-k)` term, nonlinear-vs-fixest direct QMLE, logit cohort+time additive dummies, anticipation + aggregation, cell-count default with opt-in cohort-share).
616+
617+
**Outstanding Concerns:**
618+
- **Stata `jwdid` golden values** (TODO row 97): Stata-side parity infrastructure deferred until Stata install is available; R `etwfe` side covered in PR-B Stage D.
619+
- **Response-scale APE / log-link bridge for Poisson + logit R parity** (new TODO row added in PR-B): direct cell-level numerical parity between diff-diff's response-scale ATT and R `etwfe` log-link coefficients requires either `emfx()`-based APE extraction on the R side or link-function inversion with baseline-mean adjustment.
620+
- **QMLE sandwich Stata-parity `qmle` weight type** (TODO row 94): diff-diff's `(G/(G-1)) × ((n-1)/(n-k))` is conservative vs Stata's `G/(G-1)` only; awaiting Stata golden values to confirm material difference.
621+
- **Repeated cross-sections** (paper p. 2581 → Deb et al. 2024): not in 2025 paper's main body; future PR.
622+
- **Treatment exit / non-absorbing treatment** (2023 paper Section 7.2 sketch): not in 2025 paper; future PR.
623+
- **`cohort_trends` polynomial extension** (`"quadratic"`, `"cubic"`): PR-B ships binary `True/False` for linear `dg_i · t`; forward-extensibility deferred.
605624

606625
---
607626

@@ -1319,11 +1338,10 @@ Promotion priority for the **In Progress** entries, ordered by what's blocked on
13191338

13201339
**Consolidation-pass-blocked (already has paper review or methodology file or R parity; mostly Verified Components walk-through):**
13211340

1322-
6. **WooldridgeDiD (ETWFE)** — companion-paper review (Wooldridge 2023 nonlinear extension) merged in PR #443; primary-source review for Wooldridge (2025) ETWFE not yet on file, and no dedicated methodology test file.
1323-
7. **TROP** — paper review recently merged (PR #443); needs methodology file and cross-language anchor (when paper-author reference becomes available).
1324-
8. **StaggeredTripleDifference** — shares the primary paper (Ortiz-Villavicencio & Sant'Anna 2025) with TripleDifference, but no dedicated paper review on file yet; needs R parity (R fixtures gitignored — tracked in TODO.md, PR #245).
1325-
9. **ConleySpatialHAC** — paper review + committed R `conleyreg` goldens; needs dedicated methodology test file + summary R-parity table in this tracker.
1326-
10. **Survey Data Support** — cross-cutting feature; promotion requires the per-estimator integration paths to be locked down first.
1341+
6. **TROP** — paper review recently merged (PR #443); needs methodology file and cross-language anchor (when paper-author reference becomes available).
1342+
7. **StaggeredTripleDifference** — shares the primary paper (Ortiz-Villavicencio & Sant'Anna 2025) with TripleDifference, but no dedicated paper review on file yet; needs R parity (R fixtures gitignored — tracked in TODO.md, PR #245).
1343+
8. **ConleySpatialHAC** — paper review + committed R `conleyreg` goldens; needs dedicated methodology test file + summary R-parity table in this tracker.
1344+
9. **Survey Data Support** — cross-cutting feature; promotion requires the per-estimator integration paths to be locked down first.
13271345

13281346
---
13291347
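
The tracker entry above describes `cohort_trends=True` as adding linear `dg_i · t` interactions and requiring at least two observed pre-periods per treated cohort. A minimal stdlib sketch of those two ideas follows. It assumes the cohort coding used in the R golden script (cohort `0` means never treated, and treatment starts at `time >= cohort`). The helper names are hypothetical and this is not the library's implementation.

```python
from collections import defaultdict


def cohort_trend_columns(rows):
    """Build one trend column per treated cohort: 1{cohort_i == g} * t.

    rows is a list of (cohort, time) pairs; cohort 0 is never treated
    (an assumption borrowed from the R golden script's coding).
    """
    cohorts = sorted({g for g, _ in rows if g > 0})
    return {g: [float(t) if c == g else 0.0 for c, t in rows] for g in cohorts}


def check_pre_periods(rows, min_pre=2):
    """Raise if any treated cohort has fewer than min_pre observed pre-periods."""
    pre = defaultdict(set)
    treated = set()
    for g, t in rows:
        if g > 0:
            treated.add(g)
            if t < g:  # strictly before the first treated period
                pre[g].add(t)
    short = sorted(g for g in treated if len(pre[g]) < min_pre)
    if short:
        raise ValueError(
            f"cohorts {short} have fewer than {min_pre} observed pre-periods"
        )


# Usage: a never-treated group and a cohort first treated in period 3.
rows = [(0, t) for t in range(1, 5)] + [(3, t) for t in range(1, 5)]
check_pre_periods(rows)  # passes: cohort 3 has pre-periods {1, 2}
trend = cohort_trend_columns(rows)
```

With a single pre-period a cohort-specific slope cannot be separated from that cohort's level, which is the intuition behind the two-period minimum.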

TODO.md

Lines changed: 4 additions & 1 deletion
@@ -92,7 +92,10 @@ Deferred items from PR reviews that were not addressed before merge.
9292
| HonestDiD Delta^RM: uses naive FLCI instead of paper's ARP conditional/hybrid confidence sets (Sections 3.2.1-3.2.2). ARP infrastructure exists but moment inequality transformation needs calibration. CIs are conservative (wider, valid coverage). | `honest_did.py` | #248 | Medium |
9393
| Replicate weight tests use Fay-like BRR perturbations (0.5/1.5), not true half-sample BRR. Add true BRR regressions per estimator family. Existing `test_survey_phase6.py` covers true BRR at the helper level. | `tests/test_replicate_weight_expansion.py` | #253 | Low |
9494
| WooldridgeDiD: QMLE sandwich uses `aweight` cluster-robust adjustment `(G/(G-1))*(n-1)/(n-k)` vs Stata's `G/(G-1)` only. Conservative (inflates SEs). Add `qmle` weight type if Stata golden values confirm material difference. | `wooldridge.py`, `linalg.py` | #216 | Medium |
95-
| WooldridgeDiD: aggregation weights use cell-level n_{g,t} counts on BOTH the simple-overall path (paper W2025 Eqs. 7.2-7.4) AND the event-time path (paper W2025 Eq. 7.6 cohort-share-by-exposure `ω̂_{ge} = N_g / (N_q + ··· + N_{T-e})`). Both `simple` and `event` aggregations reuse the same `_gt_weights` cell-count array. Add optional `weights="cohort_share"` parameter to `aggregate()` covering both paths. | `wooldridge_results.py` | #216 | Medium |
95+
<!-- CLOSED in PR-B (WooldridgeDiD methodology-review-tracker promotion): `WooldridgeDiDResults.aggregate(weights="cohort_share")` now exposes the paper W2025 Eq. 7.4 (simple) + Eq. 7.6 (event) cohort-share weights as an opt-in alternative to the default `weights="cell"` (which matches Stata `jwdid_estat`). See CHANGELOG `[Unreleased]` `### Added` for the surface contract. -->
96+
| WooldridgeDiD: response-scale APE / log-link coefficient bridge for R `etwfe(family="poisson")` + `etwfe(family="logit")` cell-level numerical parity. diff-diff `WooldridgeDiD(method="poisson"\|"logit")` returns ATT on the response scale (counterfactual μ_1 − μ_0 / p_1 − p_0 per paper W2023 ASF / APE framework); R `etwfe` returns the cell-level log-link coefficient. PR-B Stage D ships log-link goldens at `benchmarks/data/wooldridge_golden.json` and surface tests (fit completes + goldens well-formed); cell-level numerical parity requires either `emfx()`-based APE extraction on the R side or link-function inversion with baseline-mean adjustment. | `benchmarks/R/generate_wooldridge_golden.R`, `tests/test_methodology_wooldridge.py::TestWooldridgeParityRPoisson/TestWooldridgeParityRLogit` | PR-B follow-up | Medium |
97+
| WooldridgeDiD: design-consistent cohort totals for `aggregate(weights="cohort_share")` on survey-weighted fits. Current impl populates `_n_g_per_cohort` from `unit.nunique()` (raw counts); composing these unweighted cohort shares with the design-weighted ATTs targets a mixed estimand inconsistent with paper W2025 Section 7's design-population cohort-share form. PR-B Stage E fail-closes the surface (raises `ValueError` when `survey_design is not None`); the follow-up implements survey-weighted unit totals per cohort and re-enables the surface. | `wooldridge.py` `_n_g_per_cohort` population, `wooldridge_results.py::aggregate` survey gate | PR-B follow-up | Medium |
98+
| WooldridgeDiD: unconditional inference for `aggregate(weights="cohort_share")` accounting for sampling uncertainty in the cohort shares ω̂_g / ω̂_{ge} (paper W2025 Section 7.5). Current impl fail-closes the t-stat / p-value / conf-int fields to NaN under cohort-share aggregation because the analytical SE is conditional-on-shares. Proper APE/GMM-style aggregate inference (Wooldridge 2023 Section 4 framework) re-enables full inference. | `wooldridge_results.py::aggregate` cohort_share inference branch | PR-B follow-up | Medium |
9699
| WooldridgeDiD: optional *efficiency hint* (NOT a canonical-link violation per W2023 Prop 3.1) when method/outcome pairing is sub-optimal — e.g., `method="ols"` on binary data is consistent under QMLE, but `method="logit"` is typically more efficient. The original framing in this row as a "canonical link requirement" tied to Prop 3.1 was incorrect: Wooldridge (2023) Table 1 lists Gaussian/OLS for "any response" and logistic-Bernoulli for "binary OR fractional". A useful hint exists (efficiency), but should not be framed as a methodology violation. See PR #453 R1 review for the corrected reading. | `wooldridge.py` | #216 | Low |
97100
| WooldridgeDiD: Stata `jwdid` golden value tests — add R/Stata reference script and `TestReferenceValues` class. | `tests/test_wooldridge.py` | #216 | Medium |
98101
<!-- The PreTrendsPower R parity row (PR-C, 2026-05-19) and the four PR-A-tagged PreTrendsPower rows (CS/SA Σ_22 fidelity, helper `violation_weights`, custom-weight persistence, linear γ-unit MDV; resolved in PR-B 2026-05-18) are all closed — see CHANGELOG.md [Unreleased] Added/Changed/Fixed entries for the new behavior. -->
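
The closed TODO row 95 quotes the event-time cohort-share weight as `ω̂_{ge} = N_g / (N_q + ··· + N_{T-e})`. The sketch below works through that formula on made-up cohort sizes. The function names and the numbers are illustrative assumptions, and the eligibility rule (cohorts with `g <= T - e`) is read directly off the quoted denominator rather than from the library code.

```python
def cohort_share_event_weights(n_by_cohort, last_period, exposure):
    """Weights N_g / sum of N_h over cohorts h with h <= T - e."""
    eligible = {
        g: n for g, n in n_by_cohort.items() if g <= last_period - exposure
    }
    total = sum(eligible.values())
    if total == 0:
        raise ValueError("no cohort is observed at this exposure length")
    return {g: n / total for g, n in eligible.items()}


def aggregate_event(att_by_cell, n_by_cohort, last_period, exposure):
    """Cohort-share weighted average of ATT(g, g + e) across eligible cohorts."""
    weights = cohort_share_event_weights(n_by_cohort, last_period, exposure)
    return sum(w * att_by_cell[(g, g + exposure)] for g, w in weights.items())


# Illustrative inputs: units per cohort, with T = 5 as the last period.
n_by_cohort = {3: 10, 4: 30, 5: 60}
att_by_cell = {(3, 4): 1.0, (4, 5): 2.0}

w1 = cohort_share_event_weights(n_by_cohort, last_period=5, exposure=1)
att_e1 = aggregate_event(att_by_cell, n_by_cohort, last_period=5, exposure=1)
```

At exposure 1 only cohorts 3 and 4 are eligible, so the weights are 10/40 and 30/40 and the aggregate is 0.25 · 1.0 + 0.75 · 2.0.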

benchmarks/R/generate_wooldridge_golden.R

Lines changed: 94 additions & 0 deletions
@@ -25,10 +25,12 @@ suppressPackageStartupMessages({
2525
library(clubSandwich)
2626
library(sandwich)
2727
library(jsonlite)
28+
library(etwfe)
2829
})
2930

3031
stopifnot(packageVersion("clubSandwich") >= "0.7.0")
3132
stopifnot(packageVersion("sandwich") >= "3.0.0")
33+
stopifnot(packageVersion("etwfe") >= "0.5.0")
3234

3335
panel_path <- file.path("benchmarks", "data", "wooldridge_test_panel.csv")
3436
out_path <- file.path("benchmarks", "data", "wooldridge_golden.json")
@@ -263,6 +265,94 @@ golden <- list(
263265
)
264266
)
265267

268+
# =============================================================================
269+
# Stage D (PR-B): Poisson + logit R parity via R `etwfe` package.
270+
#
271+
# Generates Poisson + logit outcomes from the existing panel structure with
272+
# a fixed seed, fits `etwfe(family="poisson")` and `etwfe(family="logit")`,
273+
# extracts per-cohort×time ATT coefficients + HC1 SEs, and saves the augmented
274+
# panel back to the same CSV so Python can load the same Y vectors.
275+
#
276+
# Tolerance (Python tests): point ATOL 1e-4, SE ATOL 5e-3. Loose because
277+
# QMLE optimizer paths differ (diff-diff uses direct IRLS via solve_logit /
278+
# solve_poisson; etwfe uses fixest's GLM backend). The HC1 sandwich differs
279+
# by an `(n-1)/(n-k_dm)` vs `(n-1)/(n-k_total)` factor (REGISTRY-documented).
280+
# =============================================================================
281+
282+
set.seed(20260522)
283+
# Treatment indicator: (cohort > 0) & (time >= cohort)
284+
D <- as.integer((df$cohort > 0) & (df$time >= df$cohort))
285+
286+
# Poisson outcome: lambda = exp(0.5 + 0.3 * D)
287+
df$y_pois <- rpois(nrow(df), lambda = exp(0.5 + 0.3 * D))
288+
# Logit outcome: p = plogis(0.0 + 0.8 * D)
289+
df$y_logit <- rbinom(nrow(df), size = 1L, prob = plogis(0.0 + 0.8 * D))
290+
291+
# Save augmented panel back so Python loads the same outcomes.
292+
write.csv(df, panel_path, row.names = FALSE)
293+
cat(sprintf("Wrote augmented panel with y_pois + y_logit to %s\n", panel_path))
294+
295+
# Fit etwfe(family="poisson")
296+
fit_pois <- etwfe(
297+
fml = y_pois ~ 1,
298+
tvar = "time",
299+
gvar = "cohort",
300+
data = df,
301+
family = "poisson",
302+
vcov = "HC1"
303+
)
304+
305+
# Extract per-cohort-time ATT coefs by name pattern ".Dtreat:cohort::{g}:time::{t}"
306+
extract_etwfe_coefs <- function(fit, gt_pairs) {
307+
coef_names_fit <- names(coef(fit))
308+
vcov_fit <- vcov(fit)
309+
se_diag <- sqrt(diag(vcov_fit))
310+
out <- list(att = numeric(length(gt_pairs)), se = numeric(length(gt_pairs)),
311+
gt_keys = list())
312+
for (i in seq_along(gt_pairs)) {
313+
g <- gt_pairs[[i]][1]
314+
t <- gt_pairs[[i]][2]
315+
nm <- sprintf(".Dtreat:cohort::%d:time::%d", g, t)
316+
pos <- match(nm, coef_names_fit)
317+
if (is.na(pos)) {
318+
# Cell may not be identified (etwfe drops collinear cells)
319+
out$att[i] <- NA_real_
320+
out$se[i] <- NA_real_
321+
} else {
322+
out$att[i] <- coef(fit)[pos]
323+
out$se[i] <- se_diag[pos]
324+
}
325+
out$gt_keys[[i]] <- list(g = g, t = t)
326+
}
327+
out
328+
}
329+
330+
pois_extracted <- extract_etwfe_coefs(fit_pois, gt_pairs)
331+
332+
# Fit etwfe(family="logit")
333+
fit_logit <- etwfe(
334+
fml = y_logit ~ 1,
335+
tvar = "time",
336+
gvar = "cohort",
337+
data = df,
338+
family = "logit",
339+
vcov = "HC1"
340+
)
341+
logit_extracted <- extract_etwfe_coefs(fit_logit, gt_pairs)
342+
343+
golden$poisson <- list(
344+
per_coef_att = unname(pois_extracted$att),
345+
per_coef_se = unname(pois_extracted$se),
346+
gt_keys = pois_extracted$gt_keys,
347+
etwfe_version = as.character(packageVersion("etwfe"))
348+
)
349+
golden$logit <- list(
350+
per_coef_att = unname(logit_extracted$att),
351+
per_coef_se = unname(logit_extracted$se),
352+
gt_keys = logit_extracted$gt_keys,
353+
etwfe_version = as.character(packageVersion("etwfe"))
354+
)
355+
266356
write_json(golden, out_path, auto_unbox = TRUE, pretty = TRUE, digits = 18)
267357
cat(sprintf("Wrote %s\n", out_path))
268358
cat(sprintf(" n_obs=%d, n_int=%d, n_units=%d\n",
@@ -272,3 +362,7 @@ cat(sprintf(" hc2_bm overall_se=%.10f, overall_dof=%.4f\n",
272362
overall_se_hc2_bm, overall_att_contrast_dof))
273363
cat(sprintf(" classical overall_se=%.10f\n", overall_se_classical))
274364
cat(sprintf(" hc2 overall_se=%.10f\n", overall_se_hc2))
365+
cat(sprintf(" poisson ATTs: %s\n",
366+
paste(round(pois_extracted$att, 4), collapse = ", ")))
367+
cat(sprintf(" logit ATTs: %s\n",
368+
paste(round(logit_extracted$att, 4), collapse = ", ")))
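
The TODO row added in this PR notes that R `etwfe` returns link-scale coefficients while diff-diff reports the ATT on the response scale. The short sketch below uses the population values from the data-generating process in the R script above (`lambda = exp(0.5 + 0.3 * D)` and `p = plogis(0.0 + 0.8 * D)`) to show that the two scales give different numbers. These are DGP quantities, not fitted estimates, and the variable names are illustrative.

```python
import math


def expit(x):
    """Inverse logit, the Python counterpart of R's plogis."""
    return 1.0 / (1.0 + math.exp(-x))


# Poisson DGP: lambda = exp(0.5 + 0.3 * D)
pois_link_effect = 0.3  # effect on the log scale
pois_response_att = math.exp(0.5 + 0.3) - math.exp(0.5)  # mu_1 - mu_0

# Logit DGP: p = plogis(0.0 + 0.8 * D)
logit_link_effect = 0.8  # effect on the log-odds scale
logit_response_att = expit(0.0 + 0.8) - expit(0.0)  # p_1 - p_0

print(f"poisson: link={pois_link_effect}, response={pois_response_att:.4f}")
print(f"logit:   link={logit_link_effect}, response={logit_response_att:.4f}")
```

Because the response-scale difference also depends on the baseline (0.5 and 0.0 here), comparing a link-scale golden with a response-scale ATT needs the baseline-mean adjustment or the APE extraction that the TODO row mentions.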
