Commit b2dbf60

igerber and claude committed
HAD Phase 4: trends_lin (Eq 17/18) + R-package end-to-end parity
Adds `trends_lin: bool = False` keyword-only kwarg to `HeterogeneousAdoptionDiD.fit(aggregate="event_study")`, `joint_pretrends_test`, and `joint_homogeneity_test`. Mirrors R `DIDHAD::did_had(..., trends_lin=TRUE)` (paper Eq 17 / Eq 18 / page 32 joint-Stute homogeneity-with-trends).

Per-group linear-trend slope estimated as `Y[g, F-1] - Y[g, F-2]`; applied uniformly as `(t - base) × slope` adjustment that absorbs both the effect-side detrending and the placebo-side anchor swap. Requires F ≥ 3 (panel must contain F-2). The "consumed" placebo at our event-time `e=-2` is auto-dropped (R reduces max placebo lag by 1 with the same effect). Mutually exclusive with survey weighting; raises NotImplementedError. Bit-exact backcompat for trends_lin=False (default) across all existing surfaces.

Adds end-to-end R-package parity test vs `DIDHAD` v2.0.0 (Credible-Answers/did_had, SHA `edc09197`). New generator `benchmarks/R/generate_did_had_golden.R` produces `benchmarks/data/did_had_golden.json` covering 3 paper-derived synthetic DGPs (Uniform, Beta(2,2), Beta(0.5,1)) × 5 method combinations (overall, event-study, placebo, yatchew, trends_lin). `tests/test_did_had_parity.py` asserts:
- point estimate / SE / CI bounds at atol=1e-8
- closed-form Yatchew T-stat at atol=1e-10 after a documented `× G/(G-1)` finite-sample convention shift on all 24 (DGP × combo) cells.

Two intentional convention deviations from R, documented in REGISTRY.md and the parity test docstring:
(a) we report the bias-corrected point estimate (modern CCF 2018 convention); R's `Estimate` reports the conventional estimate with the bias-corrected CI separately. Our `att` matches R's CI midpoint.
(b) Yatchew uses paper Appendix E's literal (1/G) variance denominator; R uses base-R `var()`'s (1/(N-1)) sample-variance convention. Ratio is exactly N/(N-1); both converge to the same asymptotic null.

Yatchew on placebos with R's mean-independence null (`order=0`) is not exposed in our `yatchew_hr_test` and is skipped in the parity test; tracked as TODO follow-up.

Stats:
- 16 new direct trends_lin unit tests in test_had_pretests.py
- 24 new R-parity tests in test_did_had_parity.py
- 489 baseline + 16 + 24 = 529 tests pass, 0 regressions
- Generator script: ~280 LoC; fixture: 117 KB

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
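
The detrending rule in the commit message can be sketched in NumPy. This is an illustrative sketch under stated assumptions, not the code in `diff_diff/had.py`: the helper name `detrend_linear`, the `(G, T)` array layout with period `t = j + 1` in column `j`, and the anchor choice `base = F - 1` are all assumptions made for the example.

```python
import numpy as np


def detrend_linear(Y: np.ndarray, F: int) -> np.ndarray:
    """Hypothetical helper: linear-trend-adjusted outcome evolutions.

    Y has shape (G, T); column j holds period t = j + 1 (assumed layout).
    Returns Y[g, t] - Y[g, base] - (t - base) * slope[g] with base = F - 1.
    """
    if F < 3:
        raise ValueError("trends_lin requires F >= 3 (panel must contain F-2)")
    base = F - 1                                   # anchor period (assumption)
    slope = Y[:, base - 1] - Y[:, base - 2]        # Y[g, F-1] - Y[g, F-2]
    t = np.arange(1, Y.shape[1] + 1)               # 1-indexed periods
    evolution = Y - Y[:, [base - 1]]               # Y[g, t] - Y[g, F-1]
    return evolution - (t - base)[None, :] * slope[:, None]
```

With this layout the column for period `F-2` is identically zero after adjustment, which is one way to see why that placebo is "consumed" and dropped.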
1 parent dbe4080 commit b2dbf60

10 files changed

Lines changed: 1402 additions & 8 deletions

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [Unreleased]
+
+### Added
+- **HAD `trends_lin=True` linear-trend detrending mode** on `HeterogeneousAdoptionDiD.fit(aggregate="event_study")`, `joint_pretrends_test`, and `joint_homogeneity_test`. Mirrors R `DIDHAD::did_had(..., trends_lin=TRUE)` (paper Eq. 17 / Eq. 18 / page 32 joint-Stute homogeneity-with-trends). Per-group linear-trend slope estimated as `Y[g, F-1] - Y[g, F-2]` and applied as `(t - base) × slope` adjustment to per-event-time outcome evolutions. Requires F ≥ 3 (panel must contain F-2). The "consumed" placebo at our event-time `e=-2` is auto-dropped (R reduces max placebo lag by 1 with the same effect). Mutually exclusive with survey weighting (`survey_design` / `survey` / `weights`): raises `NotImplementedError` per `feedback_per_method_survey_element_contract` (weighted slope estimator not derived from paper; tracked in TODO.md as a follow-up). Bit-exact backcompat for `trends_lin=False` (default). Patch-level (additive keyword-only kwarg).
+- **HAD R-package end-to-end parity test** vs `DIDHAD` v2.0.0 (`Credible-Answers/did_had`). New parity fixture `benchmarks/data/did_had_golden.json` generated by `benchmarks/R/generate_did_had_golden.R` covers 3 paper-derived synthetic DGPs (Uniform, Beta(2,2), Beta(0.5,1)) × 5 method combinations (overall, event-study, placebo, yatchew, trends_lin). Python parity test `tests/test_did_had_parity.py` asserts point estimate / SE / CI bounds at `atol=1e-8` and Yatchew T-stat at `atol=1e-10` after a documented `× G/(G-1)` finite-sample convention shift. Two intentional convention deviations from R, documented in `docs/methodology/REGISTRY.md`: (a) we report the bias-corrected point estimate (modern CCF 2018 convention; R's `Estimate` column reports the conventional estimate with the bias-corrected CI separately, so our `att` matches R's CI midpoint); (b) Yatchew uses paper Appendix E's literal (1/G) variance-denominator convention while R uses base-R `var()`'s (1/(N-1)) sample-variance convention (parity is bit-exact after the `× G/(G-1)` shift). Yatchew on placebos with R's mean-independence null (`order=0`) is not yet exposed in our `yatchew_hr_test` (we currently only support the linearity null) and is skipped in the parity test; tracked as TODO follow-up.
+
 ## [3.3.1] - 2026-04-25

 ### Changed
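
Deviation (b) in the changelog entry comes down to the denominator of a variance estimate. A minimal NumPy sketch of the two conventions follows; the helper name is hypothetical, and how the factor propagates into the Yatchew T-statistic is deliberately not shown here.

```python
import numpy as np


def variance_conventions(x: np.ndarray) -> tuple[float, float, float]:
    """Hypothetical helper comparing the two variance denominators.

    Returns (var with 1/G, var with 1/(G-1), their ratio).
    """
    var_population = float(np.var(x, ddof=0))   # 1/G denominator
    var_sample = float(np.var(x, ddof=1))       # 1/(G-1), as base-R var() does
    return var_population, var_sample, var_sample / var_population


# Usage: for G = 200 units the ratio is 200/199 regardless of the data.
rng = np.random.default_rng(42)
_, _, ratio = variance_conventions(rng.normal(size=200))
```

Because the ratio depends only on `G`, a parity test can apply it as a deterministic rescaling before comparing against the R output.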

TODO.md

Lines changed: 4 additions & 1 deletion
@@ -101,7 +101,10 @@ Deferred items from PR reviews that were not addressed before merge.
 | `HeterogeneousAdoptionDiD` mass-point: `vcov_type in {"hc2", "hc2_bm"}` raises `NotImplementedError` pending a 2SLS-specific leverage derivation. The OLS leverage `x_i' (X'X)^{-1} x_i` is wrong for 2SLS; the correct finite-sample correction uses `x_i' (Z'X)^{-1} (...) (X'Z)^{-1} x_i`. Needs derivation plus an R / Stata (`ivreg2 small robust`) parity anchor. | `diff_diff/had.py::_fit_mass_point_2sls` | Phase 2a | Medium |
 | `HeterogeneousAdoptionDiD` survey-design API consolidation, **next minor bump**: drop the deprecated `survey=` and `weights=` kwargs on all 8 HAD surfaces (`HeterogeneousAdoptionDiD.fit`, `did_had_pretest_workflow`, `qug_test`, `stute_test`, `yatchew_hr_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`); only `survey_design=` remains. Also fold the legacy back-end `weights=` paths (e.g. `_aggregate_unit_weights` ad-hoc routing) into the unified `_resolve_survey_for_fit`-driven path. The `_make_trivial_resolved` underscore alias on `survey.py` stays (one-line, harmless). DeprecationWarning ships in this PR; the removal PR is ~50 LoC of cleanup. | `diff_diff/had.py`, `diff_diff/had_pretests.py` | next minor bump | Medium |
 | `HeterogeneousAdoptionDiD` continuous paths: thread `cluster=` through `bias_corrected_local_linear` (Phase 1c's wrapper already supports cluster; Phase 2a ignores it with a `UserWarning` on the continuous path to keep scope tight). | `diff_diff/had.py`, `diff_diff/local_linear.py` | Phase 2a | Low |
-| `HeterogeneousAdoptionDiD` Eq 18 linear-trend detrending (Pierce-Schott style): the joint-Stute infrastructure shipped in the Phase 3 follow-up supports pre-trends (mean-indep) and post-homogeneity (linearity) nulls. The Pierce-Schott application (paper Section 5.2) uses a LINEAR-TREND detrending of pre-period outcomes before the joint CvM, `Y_{g,t} - Y_{g,t_anchor} - (t - t_anchor)*(Y_{g,t_anchor} - Y_{g,t_anchor-1})`, reaching p=0.51 on US-China tariff data. Extends `joint_pretrends_test` with a detrending mode or a separate Eq 18-specific helper. Deferred to Phase 4 replication harness (where the published p=0.51 serves as the parity anchor). | `diff_diff/had_pretests.py::joint_pretrends_test` | Phase 4 | Medium |
+| `HeterogeneousAdoptionDiD` Eq 17 / Eq 18 linear-trend detrending: SHIPPED in PR #389 (Phase 4 R-parity, 2026-04). Exposed as `trends_lin: bool = False` keyword-only kwarg on `HeterogeneousAdoptionDiD.fit(aggregate="event_study")`, `joint_pretrends_test`, `joint_homogeneity_test`. Mirrors R `DIDHAD::did_had(..., trends_lin=TRUE)`. Pierce-Schott published-number parity (paper p=0.51 / p=0.40) deferred indefinitely (LBD-restricted analysis panel); replaced by end-to-end R-package parity at `tests/test_did_had_parity.py`. | `diff_diff/had_pretests.py::joint_pretrends_test`, `diff_diff/had.py` | Phase 4 (shipped) | Done |
+| `HeterogeneousAdoptionDiD` `trends_lin × survey_design` follow-up: per-group linear-trend slope under survey weighting (weighted slope estimator? per-PSU slope?) is not derived from the paper. PR #389 raises `NotImplementedError` on the combination across all 3 trends_lin surfaces. If user demand emerges, derive the weighted variant and lift the gate. | `diff_diff/had.py::HeterogeneousAdoptionDiD.fit`, `diff_diff/had_pretests.py::joint_pretrends_test`, `diff_diff/had_pretests.py::joint_homogeneity_test` | follow-up | Low |
+| `HeterogeneousAdoptionDiD` `yatchew_hr_test(null="mean_independence")` mode: R `YatchewTest::yatchew_test(order=0)` fits `Y ~ 1` (intercept-only baseline) and tests mean-independence of Y from D; R's `DIDHAD::did_had(yatchew=TRUE)` uses this on placebo rows ("non-parametric pre-trends test"). Our `yatchew_hr_test` always fits `Y ~ D` (linearity null); no `null=` parameter is exposed. Adding the mean-independence mode would (a) give practitioners a more conventional pre-trends test surface, and (b) close the PR #389 R-parity feature gap on the placebo-Yatchew rows (currently skipped in `tests/test_did_had_parity.py::TestYatchewParity` because the two tests are not the same statistic). | `diff_diff/had_pretests.py::yatchew_hr_test` | follow-up | Medium |
+| `HeterogeneousAdoptionDiD` Stute family Stata-bridge parity: PR #389 R-parity covers the full HAD fit + Yatchew surfaces but skips Stute family (`stute_test`, `stute_joint_pretest`, `joint_pretrends_test`, `joint_homogeneity_test`) because no R `Stutetest` package exists publicly (chaisemartinPackages publishes only the Stata `stute_test` module; the paper cites a 2024c R Stutetest module that is not on GitHub or CRAN). Stata-bridge parity would add `benchmarks/stata/generate_stute_golden.do` + a Stata installation requirement. Low priority unless user demand emerges. | `benchmarks/stata/`, `tests/test_stute_test_parity.py` | follow-up | Low |
 | `HeterogeneousAdoptionDiD` Phase 3 Stute performance: Appendix D vectorized matrix form replaces the per-iteration OLS refit with a single precomputed `M = I - X(X'X)^{-1}X'` applied to `eps * eta`. Functionally identical, ~2x faster. Shipped literal-refit form in Phase 3 to match paper text and keep reviewer surface small. | `diff_diff/had_pretests.py::stute_test` | Phase 3 | Low |
 | `HeterogeneousAdoptionDiD` Phase 3 R-parity: Phase 3 ships coverage-rate validation on synthetic DGPs (not tight point parity against `chaisemartin::stute_test` / `yatchew_test`). Tight numerical parity requires aligning bootstrap seed semantics and `B` across numpy/R and is deferred. | `tests/test_had_pretests.py` | Phase 3 | Low |
 | `HeterogeneousAdoptionDiD` Phase 3 nprobust bandwidth for Stute: some Stute variants on continuous regressors use nprobust-style optimal bandwidth selection. Phase 3 uses OLS residuals from a 2-parameter linear fit (no bandwidth selection). nprobust integration is a future enhancement; not in paper scope. | `diff_diff/had_pretests.py::stute_test` | Phase 3 | Low |
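
The "Phase 3 Stute performance" row claims that a precomputed `M = I - X(X'X)^{-1}X'` applied to `eps * eta` is functionally identical to refitting OLS on every bootstrap draw. The sketch below illustrates that equivalence only. The function names are hypothetical, and treating `eta` as a `(B, n)` array of multiplier draws is an assumption; it is not the `stute_test` implementation.

```python
import numpy as np


def residuals_by_refit(X: np.ndarray, eps: np.ndarray, eta: np.ndarray) -> np.ndarray:
    """Literal form: one OLS refit of (eps * eta_b) on X per bootstrap draw b."""
    out = np.empty(eta.shape, dtype=float)
    for b in range(eta.shape[0]):
        v = eps * eta[b]
        beta, *_ = np.linalg.lstsq(X, v, rcond=None)
        out[b] = v - X @ beta
    return out


def residuals_by_projection(X: np.ndarray, eps: np.ndarray, eta: np.ndarray) -> np.ndarray:
    """Vectorized form: build M = I - X (X'X)^{-1} X' once, apply to all draws."""
    n = X.shape[0]
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    # M is symmetric, so applying it to each row v is (eps * eta) @ M.
    return (eps * eta) @ M
```

Both functions return a `(B, n)` array of residuals; the second trades an `n × n` matrix in memory for the removal of the per-draw least-squares solve.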
benchmarks/R/generate_did_had_golden.R

Lines changed: 274 additions & 0 deletions
@@ -0,0 +1,274 @@
+# Generate cross-language end-to-end parity fixture for HAD Phase 4
+# (PR #389 R-parity vs `Credible-Answers/did_had`).
+#
+# Purpose: validate Python `HeterogeneousAdoptionDiD.fit()` (overall,
+# event-study, placebo, yatchew, trends_lin) against R `DIDHAD::did_had()`
+# numerically on shared input. The R package is the methodology source
+# of truth (the de Chaisemartin team wrote it); matching it within
+# `atol=1e-8` on point/SE/CI and `atol=1e-10` on closed-form Yatchew
+# T-stats is a strictly stronger correctness signal than reproducing the
+# paper's published Pierce-Schott numbers (which depend on an
+# LBD-restricted analysis panel).
+#
+# Usage:
+#   Rscript benchmarks/R/generate_did_had_golden.R
+#
+# Output:
+#   benchmarks/data/did_had_golden.json
+#
+# Phase 4 of HeterogeneousAdoptionDiD (de Chaisemartin et al. 2025).
+# Python test loader: tests/test_did_had_parity.py.
+#
+# Pin: DIDHAD 2.0.0 (CRAN current as of 2026-04; the guard below checks >= 2.0.0). YatchewTest >= 1.1.0.
+
+library(jsonlite)
+library(DIDHAD)
+library(YatchewTest)
+
+stopifnot(packageVersion("DIDHAD") >= "2.0.0")
+stopifnot(packageVersion("YatchewTest") >= "1.1.0")
+
+# -------------------------------------------------------------------------
+# Panel builder: 5-period panel with F=4 (treatment onset at t=4).
+# Pre-periods: 1, 2, 3 (D=0). Post-periods: 4, 5 (D=fixed positive dose).
+# Y[g, t] = unit_fe[g] + trend[g] * (t - 1) + (dose[g] + dose[g]^2) * (t >= F) + noise
+# -------------------------------------------------------------------------
+
+build_panel <- function(G, F_treat, T_periods, dose_draws, seed,
+                        unit_trend_sd = 0.05, noise_sd = 0.5) {
+  set.seed(seed)
+  n <- G * T_periods
+  unit_fe <- rnorm(G, mean = 0, sd = 1.0)
+  unit_trend <- rnorm(G, mean = 0.1, sd = unit_trend_sd)
+  noise <- rnorm(n, mean = 0, sd = noise_sd)
+
+  rows <- vector("list", n)
+  k <- 1
+  for (g in seq_len(G)) {
+    for (t in seq_len(T_periods)) {
+      treated <- as.numeric(t >= F_treat)
+      y <- unit_fe[g] + unit_trend[g] * (t - 1) +
+        (dose_draws[g] + dose_draws[g]^2) * treated +
+        noise[k]
+      d_obs <- if (treated == 1) dose_draws[g] else 0.0
+      # Use short column names (g, t, d, y) matching DIDHAD's tutorial
+      # convention. The package has a data-masking issue when column
+      # names alias the formal parameter names (e.g., column "time" with
+      # `time = "time"` resolves to the column values inside dplyr's
+      # `.data[[get("time")]]` lookup), so avoid that overlap upstream.
+      rows[[k]] <- data.frame(
+        g = g,
+        t = t,
+        y = y,
+        d = d_obs,
+        stringsAsFactors = FALSE
+      )
+      k <- k + 1
+    }
+  }
+  do.call(rbind, rows)
+}
+
+# DGP 1: D ~ Uniform(0, 1).
+dgp_uniform <- function(G = 200, F_treat = 4, T_periods = 5, seed = 20260426) {
+  set.seed(seed * 2L + 1L)
+  d <- runif(G, min = 0.0, max = 1.0)
+  list(
+    name = "uniform_G200_F4_T5",
+    panel = build_panel(G, F_treat, T_periods, d, seed = seed),
+    G = G, F_treat = F_treat, T_periods = T_periods,
+    dose_distribution = "Uniform(0, 1)",
+    seed = seed
+  )
+}
+
+# DGP 2: D ~ Beta(2, 2). Symmetric, bell-shaped on [0, 1].
+dgp_beta22 <- function(G = 200, F_treat = 4, T_periods = 5, seed = 20260426) {
+  set.seed(seed * 2L + 2L)
+  d <- rbeta(G, shape1 = 2, shape2 = 2)
+  list(
+    name = "beta22_G200_F4_T5",
+    panel = build_panel(G, F_treat, T_periods, d, seed = seed),
+    G = G, F_treat = F_treat, T_periods = T_periods,
+    dose_distribution = "Beta(2, 2)",
+    seed = seed
+  )
+}
+
+# DGP 3: D ~ Beta(0.5, 1). Mass concentrated near 0 (density unbounded at
+# the lower boundary); approximates the empirical Pierce-Schott NTR-gap
+# distribution, where many industries have small tariff gaps.
+dgp_boundary <- function(G = 200, F_treat = 4, T_periods = 5, seed = 20260426) {
+  set.seed(seed * 2L + 3L)
+  d <- rbeta(G, shape1 = 0.5, shape2 = 1.0)
+  list(
+    name = "boundary_G200_F4_T5",
+    panel = build_panel(G, F_treat, T_periods, d, seed = seed),
+    G = G, F_treat = F_treat, T_periods = T_periods,
+    dose_distribution = "Beta(0.5, 1)",
+    seed = seed
+  )
+}
+
+# -------------------------------------------------------------------------
+# Run did_had with given options and extract the standardized result
+# matrix. The R package returns a `did_had` S3 object whose `results`
+# element has `resmat` (effects + placebos) and optionally `yatchew_test`.
+# -------------------------------------------------------------------------
+
+run_did_had <- function(panel, effects = 1, placebo = 0,
+                        trends_lin = FALSE, yatchew = FALSE) {
+  # graph_off=TRUE suppresses the auto-print of the event-study plot.
+  fit <- did_had(
+    df = panel,
+    outcome = "y",
+    group = "g",
+    time = "t",
+    treatment = "d",
+    effects = effects,
+    placebo = placebo,
+    trends_lin = trends_lin,
+    yatchew = yatchew,
+    graph_off = TRUE
+  )
+  res <- fit$results
+  resmat <- res$resmat
+  out <- list(
+    n_effects_actual = res$res.effects,
+    n_placebo_actual = res$res.placebo,
+    rownames = rownames(resmat),
+    estimate = unname(resmat[, "Estimate"]),
+    se = unname(resmat[, "SE"]),
+    ci_lo = unname(resmat[, "LB.CI"]),
+    ci_hi = unname(resmat[, "UB.CI"]),
+    n_per_horizon = unname(as.integer(resmat[, "N"])),
+    bw_per_horizon = unname(resmat[, "BW"]),
+    n_within_bw = unname(as.integer(resmat[, "N.BW"])),
+    qug_t = unname(resmat[, "T"]),
+    qug_p = unname(resmat[, "p.val"]),
+    event_id = unname(as.integer(resmat[, "ID"]))
+  )
+  if (yatchew) {
+    yt <- res$yatchew_test
+    out$yatchew_t <- unname(yt[, "T_hr"])
+    out$yatchew_p <- unname(yt[, "p-value"])
+    out$yatchew_n <- unname(as.integer(yt[, "N"]))
+    # Capture sigma2 components for diagnostic comparison; the column
+    # names contain unicode (sigma², σ²). Use positional indexing.
+    out$yatchew_sigma2_lin <- unname(yt[, 1])
+    out$yatchew_sigma2_diff <- unname(yt[, 2])
+  }
+  out
+}
+
+# -------------------------------------------------------------------------
+# Build the DGP × method-combo fixture grid.
+# -------------------------------------------------------------------------
+
+dgp_builders <- list(
+  uniform = dgp_uniform,
+  beta22 = dgp_beta22,
+  boundary = dgp_boundary
+)
+
+# Per-DGP method matrix. Each combo runs did_had with the named flags
+# and stores the resulting standardized resmat dict alongside the input
+# panel arrays. Python parity test loops over combos and asserts.
+#
+# Why effects=2/placebo=2: F=4 with T=5 leaves 2 post-period horizons
+# (t=4, 5) and 2 pre-period placebos (t=2, 1) without trends_lin. R
+# auto-truncates if requested > feasible. Under trends_lin, the
+# F-2 -> F-1 evolution is consumed by the slope estimator and R reduces
+# max placebo by 1 (so only placebo at t=1 survives).
+combos <- list(
+  list(name = "overall_e1", effects = 1, placebo = 0,
+       trends_lin = FALSE, yatchew = FALSE),
+  list(name = "event_e2_p2", effects = 2, placebo = 2,
+       trends_lin = FALSE, yatchew = FALSE),
+  list(name = "event_e2_p2_yatchew", effects = 2, placebo = 2,
+       trends_lin = FALSE, yatchew = TRUE),
+  list(name = "event_e2_p2_trendslin", effects = 2, placebo = 2,
+       trends_lin = TRUE, yatchew = FALSE),
+  list(name = "event_e2_p2_yatchew_trendslin", effects = 2, placebo = 2,
+       trends_lin = TRUE, yatchew = TRUE)
+)
+
+fixtures <- list()
+for (dgp_name in names(dgp_builders)) {
+  dgp <- dgp_builders[[dgp_name]]()
+  panel <- dgp$panel
+  combo_results <- list()
+  for (combo in combos) {
+    res <- run_did_had(
+      panel = panel,
+      effects = combo$effects,
+      placebo = combo$placebo,
+      trends_lin = combo$trends_lin,
+      yatchew = combo$yatchew
+    )
+    combo_results[[combo$name]] <- list(
+      effects = combo$effects,
+      placebo = combo$placebo,
+      trends_lin = combo$trends_lin,
+      yatchew = combo$yatchew,
+      result = res
+    )
+  }
+  fixtures[[dgp$name]] <- list(
+    name = dgp$name,
+    G = dgp$G,
+    F = dgp$F_treat,
+    T = dgp$T_periods,
+    dose_distribution = dgp$dose_distribution,
+    seed = dgp$seed,
+    panel = list(
+      g = panel$g,
+      t = panel$t,
+      y = panel$y,
+      d = panel$d
+    ),
+    combos = combo_results
+  )
+}
+
+# -------------------------------------------------------------------------
+# Serialize
+# -------------------------------------------------------------------------
+
+out <- list(
+  metadata = list(
+    description = paste(
+      "DIDHAD::did_had end-to-end parity fixture for HAD Phase 4",
+      "(PR #389 R-parity).",
+      sep = " "
+    ),
+    didhad_version = as.character(packageVersion("DIDHAD")),
+    yatchewtest_version = as.character(packageVersion("YatchewTest")),
+    nprobust_version = as.character(packageVersion("nprobust")),
+    r_version = as.character(getRversion()),
+    n_dgps = length(fixtures),
+    n_combos_per_dgp = length(combos),
+    point_atol = 1e-8,
+    se_atol = 1e-8,
+    ci_atol = 1e-8,
+    yatchew_atol = 1e-10,
+    qug_atol = 1e-12,
+    notes = paste(
+      "Three synthetic DGPs (Uniform, Beta(2,2), Beta(0.5,1) approximation",
+      "of the empirical Pierce-Schott NTR-gap distribution). Each DGP runs",
+      "5 method combos covering overall, event-study, placebo, yatchew,",
+      "and trends_lin variants. Tolerances per the Phase 4 plan.",
+      sep = " "
+    )
+  ),
+  fixtures = fixtures
+)
+
+out_dir <- "benchmarks/data"
+if (!dir.exists(out_dir)) dir.create(out_dir, recursive = TRUE)
+out_path <- file.path(out_dir, "did_had_golden.json")
+write_json(out, path = out_path, digits = 17, auto_unbox = TRUE, null = "null")
+message(sprintf(
+  "Wrote %d DGP fixtures (each with %d combos) to %s",
+  length(fixtures), length(combos), out_path
+))
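
On the consuming side, the generator's use of `auto_unbox = TRUE` means that length-1 vectors (for example the single-row `overall_e1` result) are serialized as JSON scalars rather than one-element arrays. The sketch below shows one way a Python reader might normalize that. It is not the loader in `tests/test_did_had_parity.py`: the helper names are hypothetical, the inline sample uses made-up toy numbers, and it assumes the numeric fields contain no missing values.

```python
import json

import numpy as np

NUMERIC_KEYS = ("estimate", "se", "ci_lo", "ci_hi")


def iter_cells(golden: dict):
    """Yield (dgp_name, combo_name, arrays) for every DGP x combo cell."""
    for dgp_name, fixture in golden["fixtures"].items():
        for combo_name, combo in fixture["combos"].items():
            result = combo["result"]
            # atleast_1d undoes jsonlite's scalar unboxing of length-1 vectors.
            arrays = {
                key: np.atleast_1d(np.asarray(result[key], dtype=float))
                for key in NUMERIC_KEYS
            }
            yield dgp_name, combo_name, arrays


def ci_midpoint(arrays: dict) -> np.ndarray:
    """Midpoint of the CI, the quantity the bias-corrected `att` is compared to."""
    return 0.5 * (arrays["ci_lo"] + arrays["ci_hi"])


# Toy payload with made-up numbers, shaped like the generator's output.
SAMPLE = json.loads("""
{"fixtures": {"toy_dgp": {"combos": {
  "overall_e1": {"result": {"estimate": 1.0, "se": 0.1, "ci_lo": 0.9, "ci_hi": 1.3}},
  "event_e2_p2": {"result": {"estimate": [1.0, 1.1], "se": [0.1, 0.1],
                             "ci_lo": [0.9, 1.0], "ci_hi": [1.3, 1.4]}}
}}}}
""")
CELLS = {(d, c): a for d, c, a in iter_cells(SAMPLE)}
```

Comparing against `ci_midpoint` rather than the `estimate` field reflects deviation (a) in the commit message: R's `Estimate` column is the conventional estimate, while the CI is bias-corrected.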

benchmarks/R/requirements.R

Lines changed: 3 additions & 0 deletions
@@ -13,6 +13,9 @@ required_packages <- c(
   "triplediff", # Ortiz-Villavicencio & Sant'Anna (2025) triple difference
   "survey", # Lumley (2004) complex survey analysis
   "estimatr", # Blair et al. (2019) weighted robust / IV SE (HAD mass-point parity)
+  "DIDHAD", # de Chaisemartin et al. (2025) HAD estimator (HAD Phase 4 R-parity)
+  "YatchewTest", # Yatchew (1997) linearity test (HAD yatchew R-parity)
+  "nprobust", # Calonico-Cattaneo-Farrell local-linear (DIDHAD dependency)

   # Utilities
   "jsonlite", # JSON output for Python interop

benchmarks/data/did_had_golden.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.
