Commit 789b128

igerber and claude committed
Update Tutorial 02: Add pre-trends section and fix DGP usage
- Add Section 8 demonstrating base_period parameter for pre-treatment effects
- Show varying vs universal base period comparison
- Include interpretation guidance for parallel trends diagnostics
- Fix DGP usage: remove redundant cohort column, use first_treat directly
- Update all CS/SA/Bacon fit() calls from first_treat='cohort' to first_treat='first_treat'
- Renumber sections 8-13 to 9-14
- Update TOC and summary with new pre-trends content

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent fd7f10b commit 789b128
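
The two `base_period` options named in the commit message can be illustrated without the library. The sketch below is a hand-rolled 2x2 difference-in-differences on made-up group means. It is not the `CallawaySantAnna` implementation: the helper names (`did_2x2`, `att_by_period`) and the numbers are hypothetical, and the sketch simply skips any period whose base period is missing or equal to itself.

```python
import pandas as pd


def did_2x2(means, t, base):
    """Treated change from `base` to `t` minus control change over the same span."""
    treated_change = means.loc[t, "treated"] - means.loc[base, "treated"]
    control_change = means.loc[t, "control"] - means.loc[base, "control"]
    return treated_change - control_change


def att_by_period(means, g, base_period):
    """Toy ATT(g, t) for one cohort first treated in period g (illustration only)."""
    out = {}
    for t in means.index:
        if base_period == "universal" or t >= g:
            base = g - 1   # period just before treatment
        else:
            base = t - 1   # "varying": consecutive comparison in pre-periods
        if base not in means.index or base == t:
            continue       # no usable base period for this t
        out[t] = did_2x2(means, t, base)
    return out


# Made-up mean outcomes by period; the treated cohort starts treatment at g = 4.
means = pd.DataFrame(
    {"treated": [1.0, 2.5, 3.0, 6.0, 7.5], "control": [1.0, 2.0, 3.0, 4.0, 5.0]},
    index=pd.Index([1, 2, 3, 4, 5], name="period"),
)
g = 4
varying = att_by_period(means, g, "varying")
universal = att_by_period(means, g, "universal")

for t in means.index:
    print(t, varying.get(t), universal.get(t))
```

In this toy data the post-treatment values (periods 4 and 5) agree under both settings, while the pre-treatment values differ in which periods are reported and what each one is compared against.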

1 file changed

Lines changed: 198 additions & 22 deletions

docs/tutorials/02_staggered_did.ipynb

@@ -20,8 +20,13 @@
2020
"5. Aggregating effects (simple, group, event-study)\n",
2121
"6. Bootstrap inference for valid standard errors\n",
2222
"7. Visualization\n",
23-
"8. **Sun-Abraham interaction-weighted estimator**\n",
24-
"9. **Comparing CS and SA as a robustness check**"
23+
"8. **Pre-treatment effects and parallel trends testing**\n",
24+
"9. Different control group options\n",
25+
"10. Handling anticipation effects\n",
26+
"11. Adding covariates\n",
27+
"12. Comparing with MultiPeriodDiD\n",
28+
"13. Sun-Abraham interaction-weighted estimator\n",
29+
"14. Comparing CS and SA as a robustness check"
2530
]
2631
},
2732
{
@@ -78,8 +83,7 @@
7883
" seed=42\n",
7984
")\n",
8085
"\n",
81-
"# Add a 'cohort' column that matches the old format (first_treat is already there)\n",
82-
"df['cohort'] = df['first_treat']\n",
86+
"# The DGP returns 'first_treat' column: 0 = never-treated, >0 = first treatment period\n",
8387
"\n",
8488
"print(f\"Dataset: {len(df)} observations, {df['unit'].nunique()} units, {df['period'].nunique()} periods\")\n",
8589
"df.head(10)"
@@ -92,9 +96,9 @@
9296
"outputs": [],
9397
"source": [
9498
"# Examine treatment timing\n",
95-
"cohort_summary = df.groupby('unit').agg({'cohort': 'first', 'treated': 'sum'}).reset_index()\n",
99+
"cohort_summary = df.groupby('unit').agg({'first_treat': 'first', 'treated': 'sum'}).reset_index()\n",
96100
"print(\"Treatment cohorts:\")\n",
97-
"print(cohort_summary.groupby('cohort').size())\n",
101+
"print(cohort_summary.groupby('first_treat').size())\n",
98102
"\n",
99103
"print(\"\\nTreatment adoption over time:\")\n",
100104
"print(df.groupby('period')['treated'].mean().round(3))"
@@ -165,7 +169,7 @@
165169
" outcome='outcome',\n",
166170
" unit='unit',\n",
167171
" time='period',\n",
168-
" first_treat='cohort' # Same as 'cohort' column - 0 means never-treated\n",
172+
" first_treat='first_treat' # 0 means never-treated\n",
169173
")\n",
170174
"\n",
171175
"# View the decomposition summary\n",
@@ -227,7 +231,7 @@
227231
" outcome=\"outcome\",\n",
228232
" unit=\"unit\",\n",
229233
" time=\"period\",\n",
230-
" first_treat=\"cohort\", # Column with first treatment period (0 = never treated)\n",
234+
" first_treat=\"first_treat\", # Column with first treatment period (0 = never treated)\n",
231235
" aggregate=\"all\" # Compute all aggregations (simple, event_study, group)\n",
232236
")\n",
233237
"\n",
@@ -352,7 +356,33 @@
352356
"execution_count": null,
353357
"metadata": {},
354358
"outputs": [],
355-
"source": "# Callaway-Sant'Anna with bootstrap inference\ncs_boot = CallawaySantAnna(\n control_group=\"never_treated\",\n n_bootstrap=499, # Number of bootstrap iterations\n bootstrap_weights='rademacher', # or 'mammen', 'webb'\n seed=42 # For reproducibility\n)\n\nresults_boot = cs_boot.fit(\n df,\n outcome=\"outcome\",\n unit=\"unit\",\n time=\"period\",\n first_treat=\"cohort\", # Column with first treatment period\n aggregate=\"event_study\" # Compute event study aggregation\n)\n\n# Access bootstrap results\nprint(\"Bootstrap Inference Results:\")\nprint(\"=\" * 60)\nprint(f\"\\nOverall ATT: {results_boot.overall_att:.4f}\")\nprint(f\"Bootstrap SE: {results_boot.bootstrap_results.overall_att_se:.4f}\")\nprint(f\"Bootstrap 95% CI: [{results_boot.bootstrap_results.overall_att_ci[0]:.4f}, \"\n f\"{results_boot.bootstrap_results.overall_att_ci[1]:.4f}]\")\nprint(f\"Bootstrap p-value: {results_boot.bootstrap_results.overall_att_p_value:.4f}\")"
359+
"source": [
360+
"# Callaway-Sant'Anna with bootstrap inference\n",
361+
"cs_boot = CallawaySantAnna(\n",
362+
" control_group=\"never_treated\",\n",
363+
" n_bootstrap=499, # Number of bootstrap iterations\n",
364+
" bootstrap_weights='rademacher', # or 'mammen', 'webb'\n",
365+
" seed=42 # For reproducibility\n",
366+
")\n",
367+
"\n",
368+
"results_boot = cs_boot.fit(\n",
369+
" df,\n",
370+
" outcome=\"outcome\",\n",
371+
" unit=\"unit\",\n",
372+
" time=\"period\",\n",
373+
" first_treat=\"first_treat\", # Column with first treatment period\n",
374+
" aggregate=\"event_study\" # Compute event study aggregation\n",
375+
")\n",
376+
"\n",
377+
"# Access bootstrap results\n",
378+
"print(\"Bootstrap Inference Results:\")\n",
379+
"print(\"=\" * 60)\n",
380+
"print(f\"\\nOverall ATT: {results_boot.overall_att:.4f}\")\n",
381+
"print(f\"Bootstrap SE: {results_boot.bootstrap_results.overall_att_se:.4f}\")\n",
382+
"print(f\"Bootstrap 95% CI: [{results_boot.bootstrap_results.overall_att_ci[0]:.4f}, \"\n",
383+
" f\"{results_boot.bootstrap_results.overall_att_ci[1]:.4f}]\")\n",
384+
"print(f\"Bootstrap p-value: {results_boot.bootstrap_results.overall_att_p_value:.4f}\")"
385+
]
356386
},
357387
{
358388
"cell_type": "code",
@@ -431,7 +461,148 @@
431461
"cell_type": "markdown",
432462
"metadata": {},
433463
"source": [
434-
"## 8. Different Control Group Options\n",
464+
"## 8. Pre-Treatment Effects and Parallel Trends Testing\n",
465+
"\n",
466+
"The Callaway-Sant'Anna estimator can compute **pre-treatment effects** ATT(g,t) for periods before treatment. These should be near zero if parallel trends holds.\n",
467+
"\n",
468+
"The `base_period` parameter controls how the reference period is selected:\n",
469+
"- `\"varying\"` (default): For pre-treatment periods, compares t to t-1 (consecutive comparisons)\n",
470+
"- `\"universal\"`: Always compares to g-1 (the period just before treatment)\n",
471+
"\n",
472+
"Both produce identical post-treatment effects; they differ only for pre-treatment diagnostics."
473+
]
474+
},
475+
{
476+
"cell_type": "code",
477+
"execution_count": null,
478+
"metadata": {},
479+
"outputs": [],
480+
"source": [
481+
"# CallawaySantAnna with explicit base_period for pre-treatment effects\n",
482+
"cs_pretrends = CallawaySantAnna(\n",
483+
" control_group=\"never_treated\",\n",
484+
" base_period=\"varying\" # Default: consecutive comparisons for pre-periods\n",
485+
")\n",
486+
"\n",
487+
"results_pretrends = cs_pretrends.fit(\n",
488+
" df,\n",
489+
" outcome=\"outcome\",\n",
490+
" unit=\"unit\",\n",
491+
" time=\"period\",\n",
492+
" first_treat=\"first_treat\",\n",
493+
" aggregate=\"event_study\"\n",
494+
")\n",
495+
"\n",
496+
"# The base_period is recorded in results\n",
497+
"print(f\"Base period method: {results_pretrends.base_period}\")"
498+
]
499+
},
500+
{
501+
"cell_type": "code",
502+
"execution_count": null,
503+
"metadata": {},
504+
"outputs": [],
505+
"source": [
506+
"# Examine pre-treatment effects (event time < 0)\n",
507+
"print(\"Pre-Treatment Effects (Parallel Trends Diagnostic):\")\n",
508+
"print(\"=\" * 65)\n",
509+
"print(f\"{'Event Time':>12} {'ATT':>10} {'SE':>10} {'95% CI':>25} {'Test'}\")\n",
510+
"print(\"-\" * 65)\n",
511+
"\n",
512+
"pre_period_effects = []\n",
513+
"for event_time in sorted(results_pretrends.event_study_effects.keys()):\n",
514+
" if event_time < 0:\n",
515+
" effects = results_pretrends.event_study_effects[event_time]\n",
516+
" ci = effects['conf_int']\n",
517+
" includes_zero = ci[0] <= 0 <= ci[1]\n",
518+
" marker = \"Pass\" if includes_zero else \"Fail\"\n",
519+
" pre_period_effects.append(effects['effect'])\n",
520+
" print(f\"{event_time:>12} {effects['effect']:>10.4f} {effects['se']:>10.4f} \"\n",
521+
" f\"[{ci[0]:>8.4f}, {ci[1]:>8.4f}] {marker}\")\n",
522+
"\n",
523+
"if pre_period_effects:\n",
524+
" print(f\"\\n-> All pre-treatment effects should be close to zero\")\n",
525+
" print(f\" Mean pre-treatment effect: {np.mean(pre_period_effects):.4f}\")\n",
526+
"else:\n",
527+
" print(\"No pre-treatment effects computed (insufficient pre-periods)\")"
528+
]
529+
},
530+
{
531+
"cell_type": "markdown",
532+
"metadata": {},
533+
"source": [
534+
"### Comparing Base Period Methods\n",
535+
"\n",
536+
"Let's compare the two base period methods to understand their difference:"
537+
]
538+
},
539+
{
540+
"cell_type": "code",
541+
"execution_count": null,
542+
"metadata": {},
543+
"outputs": [],
544+
"source": [
545+
"# Compare varying vs universal base period\n",
546+
"cs_universal = CallawaySantAnna(\n",
547+
" control_group=\"never_treated\",\n",
548+
" base_period=\"universal\" # Always use g-1 as base\n",
549+
")\n",
550+
"\n",
551+
"results_universal = cs_universal.fit(\n",
552+
" df,\n",
553+
" outcome=\"outcome\",\n",
554+
" unit=\"unit\",\n",
555+
" time=\"period\",\n",
556+
" first_treat=\"first_treat\",\n",
557+
" aggregate=\"event_study\"\n",
558+
")\n",
559+
"\n",
560+
"print(\"Pre-Treatment Effects: Varying vs Universal Base Period\")\n",
561+
"print(\"=\" * 70)\n",
562+
"print(f\"{'Event Time':>12} {'Varying':>12} {'Universal':>12} {'Difference':>12}\")\n",
563+
"print(\"-\" * 70)\n",
564+
"\n",
565+
"for event_time in sorted(results_pretrends.event_study_effects.keys()):\n",
566+
" if event_time < 0:\n",
567+
" varying_eff = results_pretrends.event_study_effects[event_time]['effect']\n",
568+
" universal_eff = results_universal.event_study_effects.get(event_time, {}).get('effect', np.nan)\n",
569+
" diff = varying_eff - universal_eff if not np.isnan(universal_eff) else np.nan\n",
570+
" print(f\"{event_time:>12} {varying_eff:>12.4f} {universal_eff:>12.4f} {diff:>12.4f}\")\n",
571+
"\n",
572+
"print(\"\\nNote: 'Varying' uses consecutive period comparisons (t vs t-1)\")\n",
573+
"print(\" 'Universal' compares all periods to g-1\")"
574+
]
575+
},
576+
{
577+
"cell_type": "markdown",
578+
"metadata": {},
579+
"source": [
580+
"### Interpreting Pre-Treatment Effects\n",
581+
"\n",
582+
"**What we're testing:**\n",
583+
"- Pre-treatment ATT(g,t) should be approximately zero if parallel trends holds\n",
584+
"- Significant non-zero pre-treatment effects suggest potential parallel trends violations\n",
585+
"\n",
586+
"**Key insights:**\n",
587+
"- Visual inspection in the event study plot shows pre-period coefficients\n",
588+
"- Formal tests: 95% CIs that include zero are consistent with parallel trends\n",
589+
"- **Important caveat**: A \"passing\" test doesn't prove parallel trends—the test may lack power\n",
590+
"\n",
591+
"**When concerned about pre-trends:**\n",
592+
"- Add covariates for precision (Section 11)\n",
593+
"- Use `control_group=\"not_yet_treated\"` for more data (Section 9)\n",
594+
"- Apply Honest DiD sensitivity analysis to bound effects under violations (Tutorial 05)\n",
595+
"- Assess pre-trends test power using Tutorial 07\n",
596+
"\n",
597+
"For comprehensive parallel trends testing: **Tutorial 04**\n",
598+
"For pre-trends power analysis (Roth 2022): **Tutorial 07**"
599+
]
600+
},
601+
{
602+
"cell_type": "markdown",
603+
"metadata": {},
604+
"source": [
605+
"## 9. Different Control Group Options\n",
435606
"\n",
436607
"The CS estimator supports different control group specifications:\n",
437608
"- `\"never_treated\"`: Only use units that are never treated\n",
@@ -454,7 +625,7 @@
454625
" outcome=\"outcome\",\n",
455626
" unit=\"unit\",\n",
456627
" time=\"period\",\n",
457-
" first_treat=\"cohort\"\n",
628+
" first_treat=\"first_treat\"\n",
458629
")\n",
459630
"\n",
460631
"# Compare using overall_att/overall_se attributes\n",
@@ -469,7 +640,7 @@
469640
"cell_type": "markdown",
470641
"metadata": {},
471642
"source": [
472-
"## 9. Handling Anticipation Effects\n",
643+
"## 10. Handling Anticipation Effects\n",
473644
"\n",
474645
"If units start changing behavior before official treatment (anticipation), you can specify the anticipation period."
475646
]
@@ -491,7 +662,7 @@
491662
" outcome=\"outcome\",\n",
492663
" unit=\"unit\",\n",
493664
" time=\"period\",\n",
494-
" first_treat=\"cohort\"\n",
665+
" first_treat=\"first_treat\"\n",
495666
")\n",
496667
"\n",
497668
"print(f\"With anticipation=1: ATT = {results_antic.overall_att:.4f}\")"
@@ -501,7 +672,7 @@
501672
"cell_type": "markdown",
502673
"metadata": {},
503674
"source": [
504-
"## 10. Adding Covariates\n",
675+
"## 11. Adding Covariates\n",
505676
"\n",
506677
"You can include covariates to improve precision through outcome regression or propensity score methods."
507678
]
@@ -526,7 +697,7 @@
526697
" outcome=\"outcome\",\n",
527698
" unit=\"unit\",\n",
528699
" time=\"period\",\n",
529-
" first_treat=\"cohort\",\n",
700+
" first_treat=\"first_treat\",\n",
530701
" covariates=[\"size\", \"age\"]\n",
531702
")\n",
532703
"\n",
@@ -537,7 +708,7 @@
537708
"cell_type": "markdown",
538709
"metadata": {},
539710
"source": [
540-
"## 11. Comparing with MultiPeriodDiD\n",
711+
"## 12. Comparing with MultiPeriodDiD\n",
541712
"\n",
542713
"For comparison, here's how you would use `MultiPeriodDiD` which estimates period-specific effects. \n",
543714
"\n",
@@ -597,7 +768,7 @@
597768
"cell_type": "markdown",
598769
"metadata": {},
599770
"source": [
600-
"## 12. Sun-Abraham Interaction-Weighted Estimator\n",
771+
"## 13. Sun-Abraham Interaction-Weighted Estimator\n",
601772
"\n",
602773
"The Sun-Abraham (2021) estimator provides an alternative approach to staggered DiD. While Callaway-Sant'Anna aggregates 2x2 DiD comparisons, Sun-Abraham uses an **interaction-weighted regression** approach:\n",
603774
"\n",
@@ -630,7 +801,7 @@
630801
" outcome=\"outcome\",\n",
631802
" unit=\"unit\",\n",
632803
" time=\"period\",\n",
633-
" first_treat=\"cohort\" # Column with first treatment period (0 = never treated)\n",
804+
" first_treat=\"first_treat\" # Column with first treatment period (0 = never treated)\n",
634805
")\n",
635806
"\n",
636807
"# View summary\n",
@@ -664,7 +835,7 @@
664835
"cell_type": "markdown",
665836
"metadata": {},
666837
"source": [
667-
"## 13. Comparing CS and SA as a Robustness Check\n",
838+
"## 14. Comparing CS and SA as a Robustness Check\n",
668839
"\n",
669840
"Running both estimators provides a useful robustness check. When they agree, results are more credible."
670841
]
@@ -716,7 +887,7 @@
716887
" - Only using valid comparison groups\n",
717888
" - Properly aggregating effects\n",
718889
"4. **Sun-Abraham** provides an alternative approach using:\n",
719-
" - Interaction-weighted regression with cohort × relative-time indicators\n",
890+
" - Interaction-weighted regression with cohort x relative-time indicators\n",
720891
" - Different weighting scheme than CS\n",
721892
" - More efficient under homogeneous effects\n",
722893
"5. **Run both CS and SA** as a robustness check—when they agree, results are more credible\n",
@@ -728,7 +899,12 @@
728899
" - Use `n_bootstrap` parameter to enable multiplier bootstrap\n",
729900
" - Choose weight type: `'rademacher'`, `'mammen'`, or `'webb'`\n",
730901
" - Bootstrap results include SEs, CIs, and p-values for all aggregations\n",
731-
"8. **Control group choices** affect efficiency and assumptions:\n",
902+
"8. **Pre-treatment effects** provide parallel trends diagnostics:\n",
903+
" - Use `base_period=\"varying\"` for consecutive period comparisons\n",
904+
" - Pre-treatment ATT(g,t) should be near zero\n",
905+
"   - 95% CIs that include zero are consistent with parallel trends\n",
906+
" - See Tutorial 07 for pre-trends power analysis (Roth 2022)\n",
907+
"9. **Control group choices** affect efficiency and assumptions:\n",
732908
" - `\"never_treated\"`: Stronger parallel trends assumption\n",
733909
" - `\"not_yet_treated\"`: Weaker assumption, uses more data\n",
734910
"\n",
@@ -746,4 +922,4 @@
746922
},
747923
"nbformat": 4,
748924
"nbformat_minor": 4
749-
}
925+
}
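
The per-period Pass/Fail check added in Section 8 reduces to asking whether each pre-treatment confidence interval contains zero. A minimal standalone sketch follows, assuming a dictionary shaped like the `event_study_effects` mapping used in the diff (keys `effect`, `se`, `conf_int`). The function name `flag_pre_periods` and all numbers are hypothetical, chosen only for illustration.

```python
def flag_pre_periods(event_study_effects):
    """Return (event_time, ci_includes_zero) for each pre-treatment event time."""
    rows = []
    for event_time in sorted(event_study_effects):
        if event_time >= 0:
            continue  # only pre-treatment periods are diagnostic
        lower, upper = event_study_effects[event_time]["conf_int"]
        rows.append((event_time, lower <= 0 <= upper))
    return rows


# Illustrative values only; not output from any estimator.
example = {
    -3: {"effect": 0.02, "se": 0.05, "conf_int": (-0.08, 0.12)},
    -2: {"effect": 0.30, "se": 0.10, "conf_int": (0.10, 0.50)},
    0: {"effect": 1.90, "se": 0.20, "conf_int": (1.50, 2.30)},
}

flags = flag_pre_periods(example)
for event_time, includes_zero in flags:
    print(event_time, "Pass" if includes_zero else "Fail")
```

As the tutorial text itself cautions, a "Pass" here is only consistent with parallel trends and does not establish it, since the check may lack power.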
