Commit 47b7446

Merge pull request #45 from igerber/claude/brainstorm-next-project-YjI2n
2 parents 026a031 + a735710 commit 47b7446

16 files changed

Lines changed: 3382 additions & 68 deletions

CHANGELOG.md

Lines changed: 21 additions & 0 deletions
@@ -5,6 +5,25 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [1.2.0] - 2026-01-07
+
+### Added
+- **Pre-Trends Power Analysis** (Roth 2022) for assessing informativeness of pre-trends tests
+- `PreTrendsPower` class for computing power and minimum detectable violation (MDV)
+- `PreTrendsPowerResults` dataclass with power, MDV, and test statistics
+- `PreTrendsPowerCurve` for power curves across violation magnitudes
+- `compute_pretrends_power()` and `compute_mdv()` convenience functions
+- Multiple violation types: `linear`, `constant`, `last_period`, `custom`
+- Integration with Honest DiD via `sensitivity_to_honest_did()` method
+- `plot_pretrends_power()` visualization for power curves
+- Tutorial notebook: `docs/tutorials/07_pretrends_power.ipynb`
+- Full API documentation: `docs/api/pretrends.rst`
+
+**Reference**: Roth, J. (2022). "Pretest with Caution: Event-Study Estimates after Testing for Parallel Trends." *American Economic Review: Insights*, 4(3), 305-322.
+
+### Fixed
+- **Reference period handling in pre-trends analysis**: Fixed bug where reference period was incorrectly assigned `avg_se` instead of being excluded from power calculations. Now properly excludes the omitted reference period from the joint Wald test.
+
 ## [1.1.1] - 2026-01-06

 ### Fixed
@@ -215,6 +234,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `to_dict()` and `to_dataframe()` export methods
 - `is_significant` and `significance_stars` properties

+[1.2.0]: https://github.com/igerber/diff-diff/compare/v1.1.1...v1.2.0
+[1.1.1]: https://github.com/igerber/diff-diff/compare/v1.1.0...v1.1.1
 [1.1.0]: https://github.com/igerber/diff-diff/compare/v1.0.2...v1.1.0
 [1.0.2]: https://github.com/igerber/diff-diff/compare/v1.0.1...v1.0.2
 [1.0.1]: https://github.com/igerber/diff-diff/compare/v1.0.0...v1.0.1
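
To make the reference-period fix in the 1.2.0 changelog entry concrete, here is a minimal sketch of a joint Wald statistic that skips the omitted reference period. It is illustrative only: `joint_wald_statistic`, the dictionary layout, and the diagonal-covariance simplification are assumptions made for this example, not the code in `diff_diff/pretrends.py`.

```python
import math


def joint_wald_statistic(coefs, ses, reference_period):
    """Return (Wald statistic, degrees of freedom) for pre-period coefficients.

    Hypothetical helper. Assumes independent estimates (diagonal covariance),
    which is a simplification; a full version would use the covariance matrix.
    """
    # Drop the omitted reference period: it is normalized to zero, has no
    # sampling variation, and should not receive a placeholder standard error.
    tested = [p for p in sorted(coefs) if p != reference_period]
    stat = sum((coefs[p] / ses[p]) ** 2 for p in tested)
    return stat, len(tested)


# Made-up event-study output; period -1 is the omitted reference period.
pre_coefs = {-3: 0.10, -2: -0.05, -1: 0.0}
pre_ses = {-3: 0.05, -2: 0.05, -1: math.nan}

stat, df = joint_wald_statistic(pre_coefs, pre_ses, reference_period=-1)
print(f"Wald statistic = {stat:.2f} on {df} degrees of freedom")
```

Excluding the reference period changes the degrees of freedom as well as the statistic, which is why assigning it an average standard error would distort the power calculation.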

CLAUDE.md

Lines changed: 13 additions & 1 deletion
@@ -78,7 +78,8 @@ mypy diff_diff
 - `plot_honest_event_study` - Event study with honest confidence intervals
 - `plot_bacon` - Bacon decomposition scatter/bar plots (weights vs estimates by comparison type)
 - `plot_power_curve` - Power curve visualization (power vs effect size or sample size)
-- Works with MultiPeriodDiD, CallawaySantAnna, SunAbraham, HonestDiD, BaconDecomposition, PowerAnalysis, or DataFrames
+- `plot_pretrends_power` - Pre-trends test power curve (power vs violation magnitude)
+- Works with MultiPeriodDiD, CallawaySantAnna, SunAbraham, HonestDiD, BaconDecomposition, PowerAnalysis, PreTrendsPower, or DataFrames

 - **`diff_diff/utils.py`** - Statistical utilities:
 - Robust/cluster standard errors (`compute_robust_se`)
@@ -110,6 +111,15 @@ mypy diff_diff
 - `simulate_power()` - Simulation-based power for any DiD estimator
 - `compute_mde()`, `compute_power()`, `compute_sample_size()` - Convenience functions

+- **`diff_diff/pretrends.py`** - Pre-trends power analysis (Roth 2022):
+- `PreTrendsPower` - Main class for assessing informativeness of pre-trends tests
+- `PreTrendsPowerResults` - Results with power and minimum detectable violation (MDV)
+- `PreTrendsPowerCurve` - Power curve across violation magnitudes with plot method
+- `compute_pretrends_power()` - Convenience function for quick power computation
+- `compute_mdv()` - Convenience function for minimum detectable violation
+- Violation types: 'linear', 'constant', 'last_period', 'custom'
+- Integrates with HonestDiD for comprehensive sensitivity analysis
+
 - **`diff_diff/prep.py`** - Data preparation utilities:
 - `generate_did_data` - Create synthetic data with known treatment effect
 - `make_treatment_indicator`, `make_post_indicator` - Create binary indicators
@@ -137,6 +147,7 @@ mypy diff_diff
 - `04_parallel_trends.ipynb` - Parallel trends testing and diagnostics
 - `05_honest_did.ipynb` - Honest DiD sensitivity analysis for parallel trends violations
 - `06_power_analysis.ipynb` - Power analysis for study design, MDE, simulation-based power
+- `07_pretrends_power.ipynb` - Pre-trends power analysis (Roth 2022), MDV, power curves

 ### Test Structure

@@ -152,6 +163,7 @@ Tests mirror the source modules:
 - `tests/test_visualization.py` - Tests for plotting functions
 - `tests/test_honest_did.py` - Tests for Honest DiD sensitivity analysis
 - `tests/test_power.py` - Tests for power analysis
+- `tests/test_pretrends.py` - Tests for pre-trends power analysis

 ### Dependencies
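
The four violation types listed for `diff_diff/pretrends.py` can be pictured as weight vectors over the pre-periods, scaled by a magnitude M. The sketch below is one possible reading, not the library's implementation: `violation_pattern` and `scaled_violation` are hypothetical names, and the sign and scaling conventions (especially for `linear`) are assumptions.

```python
def violation_pattern(kind, n_pre, custom=None):
    """Return unit-magnitude violation weights, earliest pre-period first.

    Hypothetical helper; the conventions here are assumed for illustration.
    """
    if n_pre < 1:
        raise ValueError("n_pre must be at least 1")
    if kind == "linear":
        # Differential linear trend: the violation at relative time t is t,
        # so it grows with distance from treatment (assumed convention).
        return [float(t) for t in range(-n_pre, 0)]
    if kind == "constant":
        # Same violation in every pre-period.
        return [1.0] * n_pre
    if kind == "last_period":
        # Sharp break: only the final pre-period is affected.
        return [0.0] * (n_pre - 1) + [1.0]
    if kind == "custom":
        if custom is None or len(custom) != n_pre:
            raise ValueError("custom weights must have one entry per pre-period")
        return [float(w) for w in custom]
    raise ValueError(f"unknown violation type: {kind!r}")


def scaled_violation(kind, n_pre, magnitude, custom=None):
    """Scale the unit pattern by a violation magnitude M."""
    return [magnitude * w for w in violation_pattern(kind, n_pre, custom)]


for kind in ("linear", "constant", "last_period"):
    print(kind, scaled_violation(kind, n_pre=3, magnitude=0.5))
```

Separating the pattern from the magnitude lets a power curve vary a single scalar M while the shape of the violation stays fixed.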

README.md

Lines changed: 160 additions & 0 deletions
@@ -77,6 +77,7 @@ Signif. codes: '***' 0.001, '**' 0.01, '*' 0.05, '.' 0.1
 - **Goodman-Bacon decomposition**: Diagnose TWFE bias by decomposing into 2x2 comparisons
 - **Placebo tests**: Comprehensive diagnostics including fake timing, fake group, permutation, and leave-one-out tests
 - **Honest DiD sensitivity analysis**: Rambachan-Roth (2023) bounds and breakdown analysis for parallel trends violations
+- **Pre-trends power analysis**: Roth (2022) minimum detectable violation (MDV) and power curves for pre-trends tests
 - **Power analysis**: MDE, sample size, and power calculations for study design; simulation-based power for any estimator
 - **Data prep utilities**: Helper functions for common data preparation tasks

@@ -1221,6 +1222,90 @@ plot_sensitivity(sensitivity, title="Sensitivity to Parallel Trends Violations")
 plot_honest_event_study(event_results, honest_results)
 ```

+### Pre-Trends Power Analysis (Roth 2022)
+
+A passing pre-trends test doesn't mean parallel trends holds; it may just mean the test has low power. **Pre-Trends Power Analysis** (Roth 2022) answers: "What violations could my pre-trends test have detected?"
+
+```python
+from diff_diff import PreTrendsPower, MultiPeriodDiD
+
+# First, fit an event study
+did = MultiPeriodDiD()
+event_results = did.fit(
+    data,
+    outcome='outcome',
+    treatment='treated',
+    time='period',
+    post_periods=[5, 6, 7, 8, 9]
+)
+
+# Analyze pre-trends test power
+pt = PreTrendsPower(alpha=0.05, power=0.80)
+power_results = pt.fit(event_results)
+
+print(power_results.summary())
+print(f"Minimum Detectable Violation (MDV): {power_results.mdv:.4f}")
+print(f"Power to detect violations of size MDV: {power_results.power:.1%}")
+```
+
+**Key concepts:**
+
+- **Minimum Detectable Violation (MDV)**: Smallest violation magnitude that would be detected with your target power (e.g., 80%). Passing the pre-trends test does NOT rule out violations up to this size.
+- **Power**: Probability of detecting a violation of given size if it exists.
+- **Violation types**: Linear trend, constant violation, last-period only, or custom patterns.
+
+**Power curve visualization:**
+
+```python
+from diff_diff import plot_pretrends_power
+
+# Generate power curve across violation magnitudes
+curve = pt.power_curve(event_results)
+
+# Plot the power curve
+plot_pretrends_power(curve, title="Pre-Trends Test Power Curve")
+
+# Or from the curve object directly
+curve.plot()
+```
+
+**Different violation patterns:**
+
+```python
+# Linear trend violations (default) - most common assumption
+pt_linear = PreTrendsPower(violation_type='linear')
+
+# Constant violation in all pre-periods
+pt_constant = PreTrendsPower(violation_type='constant')
+
+# Violation only in the last pre-period (sharp break)
+pt_last = PreTrendsPower(violation_type='last_period')
+
+# Custom violation pattern
+custom_weights = np.array([0.1, 0.3, 0.6]) # Increasing violations
+pt_custom = PreTrendsPower(violation_type='custom', violation_weights=custom_weights)
+```
+
+**Combining with HonestDiD:**
+
+Pre-trends power analysis and HonestDiD are complementary:
+1. **Pre-trends power** tells you what the test could have detected
+2. **HonestDiD** tells you how robust your results are to violations
+
+```python
+from diff_diff import HonestDiD, PreTrendsPower
+
+# If MDV is large relative to your estimated effect, be cautious
+pt = PreTrendsPower()
+power_results = pt.fit(event_results)
+sensitivity = pt.sensitivity_to_honest_did(event_results)
+print(sensitivity['interpretation'])
+
+# Use HonestDiD for robust inference
+honest = HonestDiD(method='relative_magnitude', M=1.0)
+honest_results = honest.fit(event_results)
+```
+
 ### Placebo Tests

 Placebo tests help validate the parallel trends assumption by checking whether effects appear where they shouldn't (before treatment or in untreated groups).
@@ -1645,6 +1730,81 @@ HonestDiD(
 | `plot(ax)` | Plot sensitivity analysis |
 | `to_dataframe()` | Convert to pandas DataFrame |

+### PreTrendsPower
+
+```python
+PreTrendsPower(
+    alpha=0.05, # Significance level for pre-trends test
+    power=0.80, # Target power for MDV calculation
+    violation_type='linear', # 'linear', 'constant', 'last_period', 'custom'
+    violation_weights=None # Custom weights (required if violation_type='custom')
+)
+```
+
+**fit() Parameters:**
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `results` | MultiPeriodDiDResults | Results from event study |
+| `M` | float | Specific violation magnitude to evaluate |
+
+**Methods:**
+
+| Method | Description |
+|--------|-------------|
+| `fit(results, M)` | Compute power analysis for given event study |
+| `power_at(results, M)` | Compute power for specific violation magnitude |
+| `power_curve(results, M_grid, n_points)` | Compute power across range of M values |
+| `sensitivity_to_honest_did(results)` | Compare with HonestDiD analysis |
+
+### PreTrendsPowerResults
+
+**Attributes:**
+
+| Attribute | Description |
+|-----------|-------------|
+| `power` | Power to detect the specified violation |
+| `mdv` | Minimum detectable violation at target power |
+| `violation_magnitude` | Violation magnitude (M) tested |
+| `violation_type` | Type of violation pattern |
+| `alpha` | Significance level |
+| `target_power` | Target power level |
+| `n_pre_periods` | Number of pre-treatment periods |
+| `test_statistic` | Expected test statistic under violation |
+| `critical_value` | Critical value for pre-trends test |
+| `noncentrality` | Non-centrality parameter |
+| `is_informative` | Heuristic check if test is informative |
+| `power_adequate` | Whether power meets target |
+
+**Methods:**
+
+| Method | Description |
+|--------|-------------|
+| `summary()` | Get formatted summary string |
+| `print_summary()` | Print summary to stdout |
+| `to_dict()` | Convert to dictionary |
+| `to_dataframe()` | Convert to pandas DataFrame |
+
+### PreTrendsPowerCurve
+
+**Attributes:**
+
+| Attribute | Description |
+|-----------|-------------|
+| `M_values` | Array of violation magnitudes |
+| `powers` | Array of power values |
+| `mdv` | Minimum detectable violation |
+| `alpha` | Significance level |
+| `target_power` | Target power level |
+| `violation_type` | Type of violation pattern |
+
+**Methods:**
+
+| Method | Description |
+|--------|-------------|
+| `plot(ax, show_mdv, show_target)` | Plot power curve |
+| `to_dataframe()` | Convert to DataFrame with M and power columns |
+
 ### Data Preparation Functions

 #### generate_did_data
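
The power and MDV concepts in the README changes above can be worked through by hand in the simplest case: a single pre-period coefficient tested with a two-sided z-test. The sketch below is a simplified illustration under that assumption, not diff-diff's implementation (a joint Wald test over several pre-periods would need a noncentral chi-squared distribution). `pretest_power` and `minimum_detectable_violation` are hypothetical names, and the standard error is made up.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pretest_power(violation, se, alpha=0.05):
    """Probability that a two-sided z-test rejects, given a true violation."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = violation / se  # mean of the z-statistic under the violation
    return STD_NORMAL.cdf(-z_crit - shift) + 1 - STD_NORMAL.cdf(z_crit - shift)


def minimum_detectable_violation(se, alpha=0.05, target_power=0.80, tol=1e-10):
    """Smallest violation detected with probability target_power (bisection)."""
    if not alpha < target_power < 1:
        raise ValueError("target_power must lie strictly between alpha and 1")
    lo, hi = 0.0, se
    # Expand the bracket until the upper end reaches the target power.
    while pretest_power(hi, se, alpha) < target_power:
        lo, hi = hi, 2 * hi
    # Power is increasing in the violation size, so bisection converges.
    while hi - lo > tol * se:
        mid = (lo + hi) / 2
        if pretest_power(mid, se, alpha) < target_power:
            lo = mid
        else:
            hi = mid
    return hi


se = 0.5  # made-up standard error of the pre-period coefficient
mdv = minimum_detectable_violation(se)
curve = [(m, pretest_power(m, se)) for m in (0.0, 0.5, 1.0, 1.5, 2.0)]

print(f"MDV at 80% power: {mdv:.3f}")
for m, p in curve:
    print(f"M = {m:.1f}  power = {p:.3f}")
```

At zero violation the power equals the test's size (`alpha`), and the MDV scales linearly with the standard error, so noisier pre-period estimates leave larger violations undetected.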

ROADMAP.md

Lines changed: 4 additions & 14 deletions
@@ -6,19 +6,19 @@ For past changes and release history, see [CHANGELOG.md](CHANGELOG.md).

 ---

-## Current Status (v1.1.0)
+## Current Status (v1.2.0)

 diff-diff is a **production-ready** DiD library with feature parity with R's `did` + `HonestDiD` ecosystem for core DiD analysis:

 - **Core estimators**: Basic DiD, TWFE, MultiPeriod, Callaway-Sant'Anna, Sun-Abraham, Synthetic DiD
 - **Valid inference**: Robust SEs, cluster SEs, wild bootstrap, multiplier bootstrap
 - **Assumption diagnostics**: Parallel trends tests, placebo tests, Goodman-Bacon decomposition
-- **Sensitivity analysis**: Honest DiD (Rambachan-Roth)
+- **Sensitivity analysis**: Honest DiD (Rambachan-Roth), Pre-trends power analysis (Roth 2022)
 - **Study design**: Power analysis tools

 ---

-## Near-Term Enhancements (v1.2)
+## Near-Term Enhancements (v1.3)

 High-value additions building on our existing foundation.

@@ -53,16 +53,6 @@ Extends DiD to settings requiring a third differencing dimension. Common DDD imp

 **Reference**: [Ortiz-Villavicencio & Sant'Anna (2025)](https://arxiv.org/abs/2505.09942). *Working Paper*. R package: `triplediff`.

-### Pre-Trends Power Analysis
-
-Assess whether pre-trends tests have adequate power to detect meaningful parallel trends violations. Complements our Honest DiD implementation.
-
-- Minimum detectable violation size for pre-trends tests
-- Visualization of power against various violation magnitudes
-- Integration with existing parallel trends diagnostics
-
-**Reference**: [Roth (2022)](https://www.aeaweb.org/articles?id=10.1257/aeri.20210236). *AER: Insights*. R package: `pretrends`.
-
 ### Enhanced Visualization

 - Synthetic control weight visualization (bar chart of unit weights)
@@ -71,7 +61,7 @@ Assess whether pre-trends tests have adequate power to detect meaningful paralle

 ---

-## Medium-Term Enhancements (v1.3+)
+## Medium-Term Enhancements (v1.4+)

 Extending diff-diff to handle more complex settings.

diff_diff/__init__.py

Lines changed: 16 additions & 1 deletion
@@ -45,6 +45,13 @@
     compute_sample_size,
     simulate_power,
 )
+from diff_diff.pretrends import (
+    PreTrendsPower,
+    PreTrendsPowerCurve,
+    PreTrendsPowerResults,
+    compute_mdv,
+    compute_pretrends_power,
+)
 from diff_diff.prep import (
     aggregate_to_cohorts,
     balance_panel,
@@ -87,10 +94,11 @@
     plot_group_effects,
     plot_honest_event_study,
     plot_power_curve,
+    plot_pretrends_power,
     plot_sensitivity,
 )

-__version__ = "1.1.1"
+__version__ = "1.2.0"
 __all__ = [
     # Estimators
     "DifferenceInDifferences",
@@ -164,4 +172,11 @@
     "compute_sample_size",
     "simulate_power",
     "plot_power_curve",
+    # Pre-trends power analysis
+    "PreTrendsPower",
+    "PreTrendsPowerResults",
+    "PreTrendsPowerCurve",
+    "compute_pretrends_power",
+    "compute_mdv",
+    "plot_pretrends_power",
 ]
