
Commit 420be21

igerber and claude committed
R3 polish: drop 'matched to your data shape' overclaim on no-first_treat path
Codex R2 narrowed P1: the no-`first_treat` branch unconditionally emitted `DifferenceInDifferences().fit(...)` and labeled it "matched to your data shape". That overclaim doesn't hold for continuous-dose or heterogeneous-adoption designs without first_treat: DiD validates and rejects non-binary treatment / time at fit-time (`diff_diff/estimators.py:307-312`).

Fix in agent_workflow.py:
- With `first_treat`: keep CallawaySantAnna example, relabel as "your data has first_treat -> staggered structure; CS is canonical".
- Without `first_treat`: keep DiD example as the simple 2x2 case but reframe the label to explicitly condition on that shape, name the alternative candidates (ContinuousDiD, HeterogeneousAdoptionDiD), and reference DiD's fit-time validation so the agent knows when to switch. No claim of universal match.

New regression test `test_no_first_treat_step3_does_not_overclaim_match`:
- asserts "matched to your data shape" is NOT in the no-first_treat script (negative)
- asserts "2x2" and "substitute" appear (positive: substitution hint)
- asserts ContinuousDiD and HeterogeneousAdoptionDiD remain enumerated in Step 2's routing patterns

Total tests: 17 → 18 in test_agent_workflow.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
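To picture why a 2x2 example cannot be labeled a universal match, here is a minimal sketch of a fit-time binary check. `check_binary` is a hypothetical helper written for illustration; it is not the code at `diff_diff/estimators.py`, and it assumes the validation amounts to requiring every value to be 0 or 1.

```python
from typing import Iterable


def check_binary(values: Iterable[float], name: str) -> None:
    """Raise ValueError unless every value is 0 or 1 (hypothetical helper)."""
    extra = sorted(set(values) - {0, 1})
    if extra:
        raise ValueError(f"{name} must be binary (0/1); found other values: {extra}")


# A binary treatment column passes silently.
check_binary([0, 1, 1, 0], "treatment")

# A continuous-dose column is rejected, which is the case the old label ignored.
try:
    check_binary([0.0, 0.5, 2.0], "treatment")
except ValueError as exc:
    rejected = str(exc)
```

Under this assumption, a continuous-dose or multi-period column fails before any estimation happens, so an agent that copy-pastes the 2x2 example would hit an error rather than a silently wrong estimate.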
1 parent 93dc359 commit 420be21

2 files changed

Lines changed: 55 additions & 6 deletions


diff_diff/agent_workflow.py

Lines changed: 26 additions & 6 deletions
@@ -145,23 +145,43 @@ def agent_workflow(
     )
     guide_call = 'diff_diff.get_llm_guide("autonomous")'

-    # Step 3 example: branch on first_treat presence. Staggered estimators
-    # (CallawaySantAnna et al.) take `first_treat=` and do NOT accept
-    # `treatment=` on fit(); simple 2x2 `DifferenceInDifferences.fit()`
-    # takes `treatment=` and does not accept `first_treat=`. Emitting the
-    # wrong shape produces a TypeError when the agent copy-pastes.
+    # Step 3 example: branch on first_treat presence.
+    # - With first_treat: a staggered structure is strongly implied;
+    #   show CallawaySantAnna with the matching `first_treat=` signature
+    #   (NB: staggered estimators reject `treatment=` on fit()).
+    # - Without first_treat: the orchestrator does not inspect df, so it
+    #   CANNOT infer whether the panel is 2x2 binary vs continuous-dose
+    #   vs heterogeneous-adoption. Show a DifferenceInDifferences call
+    #   as the "simple 2x2" example and label it explicitly conditional
+    #   on that shape. The agent is expected to substitute the right
+    #   estimator class from Step 2's routing patterns when the data
+    #   is continuous / heterogeneous / non-binary-time
+    #   (DifferenceInDifferences.fit() rejects non-binary treatment/
+    #   time per diff_diff/estimators.py).
     if first_treat is not None:
         fit_example_class = "CallawaySantAnna"
         fit_example_kwargs = _join_kwargs(
             outcome=outcome, unit=unit, time=time, first_treat=first_treat
         )
         fit_example_call = f"diff_diff.CallawaySantAnna().fit(df, {fit_example_kwargs})"
+        step3_label = (
+            f"Step 3 - Fit. Your data has `first_treat` -> staggered structure; "
+            f"{fit_example_class} is the canonical starting point:"
+        )
     else:
         fit_example_class = "DifferenceInDifferences"
         fit_example_kwargs = _join_kwargs(
             outcome=outcome, unit=unit, time=time, treatment=treatment
         )
         fit_example_call = f"diff_diff.DifferenceInDifferences().fit(df, {fit_example_kwargs})"
+        step3_label = (
+            "Step 3 - Fit. Pick a candidate from Step 2's patterns based on your "
+            "treatment/time shape. The example below shows the simple 2x2 case "
+            "(binary treatment + binary time); substitute ContinuousDiD / "
+            "HeterogeneousAdoptionDiD / etc. when your design is not 2x2 "
+            "(DifferenceInDifferences.fit() validates and rejects non-binary "
+            "treatment or time):"
+        )

     validation_calls = [
         "diff_diff.practitioner_next_steps(result)",
@@ -200,7 +220,7 @@ def agent_workflow(

 {diagnostics_block}

-Step 3 - Fit (matched to your data shape; {fit_example_class} shown):
+{step3_label}
 result = {fit_example_call}

 Step 4 - Validate:
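The two branches in this hunk emit differently shaped `fit()` calls. The sketch below reproduces that shape outside the package. `join_kwargs` is a hypothetical stand-in for the private `_join_kwargs` helper, whose real rendering is not shown in this diff; the `key='value'` format and the column name `first_year` are assumptions made for illustration.

```python
from typing import Optional


def join_kwargs(**kwargs: str) -> str:
    """Hypothetical stand-in for `_join_kwargs`: render key='value' pairs."""
    return ", ".join(f"{key}={value!r}" for key, value in kwargs.items())


def fit_example_call(
    outcome: str,
    unit: str,
    time: str,
    treatment: Optional[str] = None,
    first_treat: Optional[str] = None,
) -> str:
    """Build the Step 3 example call, branching on first_treat as the diff does."""
    if first_treat is not None:
        # Staggered branch: pass first_treat=, never treatment=.
        kwargs = join_kwargs(outcome=outcome, unit=unit, time=time, first_treat=first_treat)
        return f"diff_diff.CallawaySantAnna().fit(df, {kwargs})"
    # Simple 2x2 branch: pass treatment=, never first_treat=.
    kwargs = join_kwargs(outcome=outcome, unit=unit, time=time, treatment=treatment)
    return f"diff_diff.DifferenceInDifferences().fit(df, {kwargs})"


staggered = fit_example_call("logwage", "firm_id", "year", first_treat="first_year")
simple = fit_example_call("logwage", "firm_id", "year", treatment="treated")
```

Only strings are built here; nothing inspects a DataFrame, which mirrors why the orchestrator cannot tell a 2x2 panel from a continuous-dose one.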

tests/test_agent_workflow.py

Lines changed: 29 additions & 0 deletions
@@ -142,6 +142,35 @@ def test_first_treat_switches_step3_estimator(df):
     assert "treatment=" not in step3_lines[0]


+def test_no_first_treat_step3_does_not_overclaim_match(df):
+    """Without `first_treat`, the orchestrator cannot infer panel shape.
+
+    The Step 3 label must NOT claim the emitted DiD example is "matched to
+    your data shape" — for continuous-dose or heterogeneous-adoption
+    designs without first_treat, DifferenceInDifferences would reject at
+    fit time. The label must instead frame the example as conditional on
+    the simple 2x2 case and tell the agent to substitute the matching
+    candidate from Step 2 for other shapes.
+    """
+    out = diff_diff.agent_workflow(
+        df,
+        unit="firm_id",
+        time="year",
+        treatment="treated",
+        outcome="logwage",
+        verbose=False,
+    )
+    script = out["script"]
+    # Negative: should not claim universal match.
+    assert "matched to your data shape" not in script
+    # Positive: must qualify with the 2x2 conditionality + substitution hint.
+    assert "2x2" in script
+    assert "substitute" in script.lower()
+    # Other workflow patterns must remain enumerated so the agent can substitute.
+    for name in ("ContinuousDiD", "HeterogeneousAdoptionDiD"):
+        assert name in script, f"{name} routing pattern missing from Step 2 hints"
+
+
 def test_does_not_inspect_df():
     # Pure orchestrator: a structurally-empty DataFrame must still produce
     # the templated script (no df inspection happens).
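The negative and positive checks in the new test can be tried without the package installed. The sketch below copies the two label strings from the `agent_workflow.py` hunk into a hypothetical standalone function, `step3_label`; that function name is an assumption for illustration, since the real code assigns the label inline.

```python
from typing import Optional


def step3_label(first_treat: Optional[str]) -> str:
    """Hypothetical standalone copy of the Step 3 label branch from the diff."""
    if first_treat is not None:
        return (
            "Step 3 - Fit. Your data has `first_treat` -> staggered structure; "
            "CallawaySantAnna is the canonical starting point:"
        )
    return (
        "Step 3 - Fit. Pick a candidate from Step 2's patterns based on your "
        "treatment/time shape. The example below shows the simple 2x2 case "
        "(binary treatment + binary time); substitute ContinuousDiD / "
        "HeterogeneousAdoptionDiD / etc. when your design is not 2x2 "
        "(DifferenceInDifferences.fit() validates and rejects non-binary "
        "treatment or time):"
    )


label = step3_label(None)

# Negative: the old overclaim is gone.
assert "matched to your data shape" not in label
# Positive: the label is conditional on 2x2 and hints at substitution.
assert "2x2" in label
assert "substitute" in label.lower()
```

In the real test the estimator names are looked up in the whole script, where Step 2's routing patterns also list them; in this sketch they appear only because the label itself names them.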
