Multi-Agent Debate and Agent Forest pipeline variants by tamara-kostova · Pull Request #2 · tamara-kostova/MultiAgentMedClassifier

tamara-kostova · 2026-04-21T12:51:36Z

System B - Multi-Agent Debate (--pipeline_mode debate): three MedGemma advocate agents argue on behalf of the CNN, BiomedCLIP, and SAM3 outputs, and a fourth MedGemma instance acts as judge. Supports 1–3 rounds; in later rounds, advocates see the prior verdict and counter-argue. Replaces the verification + report tail.
System C - Agent Forest (--pipeline_mode forest): N role-specialized MedGemma instances (radiologist, conservative, emergency, differential) independently diagnose the scan; a majority vote with confidence-weighted tiebreaking produces the consensus routing decision. Replaces the single triage node; all downstream nodes are unchanged.
Two new research sweep families: agent_forest (N ∈ {1, 3, 4}) and debate_rounds (R ∈ {1, 2, 3}).
Two new analysis functions: forest_voting_analysis (dissent rate vs. accuracy) and debate_round_analysis (verdict stability vs. ECE).
evaluate.py is extended to capture dissent_rate, vote_fraction, debate_rounds_completed, and debate_round_changed per sample.
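To make the Agent Forest vote described above concrete, here is a minimal sketch of a majority vote with confidence-weighted tiebreaking. It is not the code from this PR. The `AgentVote` type, the `forest_consensus` function, and the example labels are hypothetical, and the formulas for `vote_fraction` (share of agents backing the winner) and `dissent_rate` (one minus that share) are assumptions about what those fields mean.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentVote:
    role: str          # e.g. "radiologist"; role names taken from the PR description
    label: str         # hypothetical diagnosis / routing label
    confidence: float  # assumed to lie in [0, 1]


def forest_consensus(votes: list[AgentVote]) -> dict:
    """Majority vote; ties are broken by summed confidence (assumed rule)."""
    if not votes:
        raise ValueError("at least one vote is required")

    # Count how many agents chose each label.
    counts = Counter(v.label for v in votes)
    top = max(counts.values())
    tied = [label for label, n in counts.items() if n == top]

    # Sum confidence per label, used only to separate tied labels.
    conf_sum: dict[str, float] = defaultdict(float)
    for v in votes:
        conf_sum[v.label] += v.confidence

    # Sorting first makes the result deterministic if confidences also tie.
    winner = max(sorted(tied), key=lambda label: conf_sum[label])

    # Assumed definitions of the per-sample metrics.
    vote_fraction = counts[winner] / len(votes)
    return {
        "label": winner,
        "vote_fraction": vote_fraction,
        "dissent_rate": 1.0 - vote_fraction,
    }


if __name__ == "__main__":
    votes = [
        AgentVote("radiologist", "pneumonia", 0.90),
        AgentVote("conservative", "normal", 0.60),
        AgentVote("emergency", "pneumonia", 0.70),
        AgentVote("differential", "normal", 0.55),
    ]
    print(forest_consensus(votes))
```

In the 2-2 split above, the tie goes to "pneumonia" because its summed confidence (1.60) exceeds that of "normal" (1.15). With N = 1 the function returns the single agent's label and a `dissent_rate` of 0.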

tamara-kostova added 6 commits April 21, 2026 14:37

initial deep research setup

40c8211

per-class breakdown analysis

9cb58a1

medgemma agreement, calibration per task

c4fd35f

Merge branch 'main' into deep-research

883da98

Merge branch 'main' into deep-research

e00be46

agent forest, agent debate

5c98f76

tamara-kostova changed the title ~~Deep Research~~ Multi-Agent Debate and Agent Forest pipeline variants Jun 2, 2026

tamara-kostova added 2 commits June 3, 2026 15:25

renaming nits

d57f518

Merge branch 'main' into deep-research

bbfbe54
