Production polish: installable, multi-provider, observable#1

Merged
Wenjix merged 10 commits into main from production-polish
Jun 13, 2026
Conversation

@Wenjix

@Wenjix Wenjix commented Jun 12, 2026

Owner

Summary

Production-ready polish across 12 workstreams to make Quorum usable by any Claude + Cursor user.

Distribution & Installation

  • Extracted skill scripts (fetch_findings.sh, validate_partition.py, post_synthesis.py) and rubric from the .skill zip into the repo source tree
  • Made package installable: removed "private": true, added "files" field, prepublishOnly script, build:skill to auto-rebuild the zip from source
  • Added .github/workflows/ci.yml: typecheck, test, and build:skill on push/PR
  • New quorum setup command: validates Node 22+, gh CLI, jq, python3, API keys, and skill installation
  • New quorum triage-pr command: runs synthesis + exploration end-to-end

Robustness & Flexibility

  • Replaced gh CLI dependency with direct GitHub REST API calls using GITHUB_TOKEN, with gh CLI fallback
  • New --provider anthropic flag and AnthropicAdapter for direct Anthropic Messages API usage
  • Configurable models via QUORUM_MODEL_HIGH/MED/LOW env vars
  • Retry with exponential backoff (3x) on transient errors in both adapters
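A minimal sketch of what retry with exponential backoff can look like. The PR names `withRetry()` in `src/retry.ts` but does not show its signature, so the options object, the 500 ms base delay, and the reading of "3x" as "up to three retries" are assumptions made for illustration.

```typescript
// Sketch only: everything except the name withRetry is hypothetical.
export interface RetryOptions {
  retries?: number; // retries after the first attempt (assumed reading of "3x")
  baseDelayMs?: number; // assumed base delay, doubled on each retry
  isTransient?: (err: unknown) => boolean; // decides whether an error is retryable
  sleep?: (ms: number) => Promise<void>; // injectable so tests do not wait
}

export async function withRetry<T>(
  fn: (attempt: number) => Promise<T>,
  opts: RetryOptions = {},
): Promise<T> {
  const retries = opts.retries ?? 3;
  const baseDelayMs = opts.baseDelayMs ?? 500;
  const isTransient = opts.isTransient ?? (() => true);
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 0; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      // Give up on non-transient errors or once the retry budget is spent.
      if (attempt >= retries || !isTransient(err)) throw err;
      // Delay doubles each time: base, 2x base, 4x base.
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Keeping the loop in one helper means each adapter only has to supply the request function and, if needed, its own notion of a transient error.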

Observability

  • Structured run logging: every run writes run.log.jsonl with timestamped dag/task lifecycle events
  • New quorum eval command: computes task success rate, cluster counts, avg duration from accumulated logs
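To make the logging and eval bullets concrete, here is a small sketch of a JSONL event line and a success-rate calculation over accumulated lines. The event shape (`ts`, `scope`, `event`, `taskId`) is an assumption; only the file name `run.log.jsonl`, the task statuses, and the `parseError` field appear in this PR.

```typescript
// Hypothetical event shape; the real run.log.jsonl schema is not shown in the PR.
interface RunLogEvent {
  ts: string; // ISO timestamp
  scope: "dag" | "task";
  event: string; // e.g. "start" or "end" (assumed names)
  taskId?: string;
  status?: "FINISHED" | "ERROR" | "SKIPPED";
  parseError?: string;
}

// One JSON object per line, newline-terminated, so a log can be appended to safely.
function toJsonlLine(e: RunLogEvent): string {
  return JSON.stringify(e) + "\n";
}

// Fraction of completed task events that finished cleanly (no parseError).
function taskSuccessRate(jsonl: string): number {
  const events = jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as RunLogEvent);
  const completed = events.filter((e) => e.scope === "task" && e.status !== undefined);
  if (completed.length === 0) return 0;
  const clean = completed.filter((e) => e.status === "FINISHED" && !e.parseError);
  return clean.length / completed.length;
}
```

Counting a `FINISHED` task with a `parseError` as degraded rather than successful mirrors the distinction the later eval fix in this PR describes.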

Standalone Clustering

  • New quorum synthesize command: runs the full Phase 1 pipeline (fetch -> all-singletons clustering -> validate/score -> post) without needing an AI agent session
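The "all-singletons clustering" step can be sketched as a plain mapping from findings to one-member clusters. The output field names follow the `clusters.scored.json` example posted later in this PR; the input `Finding` shape and the function name are assumptions.

```typescript
// Assumed input shape for a single bot finding.
interface Finding {
  id: string; // e.g. "bugbot-1"
  reviewer: string; // e.g. "bugbot"
  title: string;
}

// Subset of the cluster fields visible in clusters.scored.json.
interface SingletonCluster {
  cluster_id: string;
  member_ids: string[];
  canonical_title: string;
  match_type: "singleton";
  match_confidence: 1;
  quorum: number;
  reviewers: string[];
}

// Every finding becomes its own cluster, so no AI session is needed to group them.
function clusterAsSingletons(findings: Finding[]): SingletonCluster[] {
  return findings.map(
    (f): SingletonCluster => ({
      cluster_id: `${f.id}-solo`,
      member_ids: [f.id],
      canonical_title: f.title,
      match_type: "singleton",
      match_confidence: 1,
      quorum: 1,
      reviewers: [f.reviewer],
    }),
  );
}
```

The trade-off is that duplicates reported by several reviewers are never merged, so every cluster ends up with a quorum of 1.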

Note

Medium Risk
Touches PR comment posting, external API keys (Cursor/Anthropic/GitHub), and concurrent cloud agent execution; changes are mostly additive with fallbacks, but misconfiguration could spam or fail synthesis/exploration on real PRs.

Overview
Production-ready distribution and automation for Quorum: the package is npm-publishable (files, prepublishOnly, build:skill), skill assets and scripts live in-repo, and CI runs typecheck, tests, and skill rebuild on main PRs.

New CLI workflows: quorum synthesize runs Phase 1 (fetch bot findings → validate/score → optional PR synthesis comment) without AI clustering; quorum triage-pr chains synthesis + exploration; quorum setup checks toolchain and keys; quorum eval aggregates stats from run.log.jsonl.

Exploration backends: --provider anthropic / AnthropicAdapter (Messages API) alongside Cursor Cloud, with per-provider default models and QUORUM_MODEL_* overrides. Both adapters use retry with backoff on transient errors; Cursor plan-limit errors get clearer messaging.

GitHub: exploration comment upsert and synthesis recovery prefer GITHUB_TOKEN REST with gh fallback.
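As a rough illustration of "prefer REST, fall back to gh", the sketch below picks a transport at call time. Both transports are injected as functions, so nothing here calls GitHub; the function and option names are hypothetical, and whether the real code also falls back when a REST call fails (rather than only when the token is missing) is an assumption.

```typescript
// Hypothetical transport hooks; a real viaRest would call the GitHub REST API
// with the token, and a real viaGh would shell out to the gh CLI.
interface CommentTransports {
  token: string | undefined; // value of GITHUB_TOKEN, if set
  viaRest: (token: string, body: string) => Promise<void>;
  viaGh: (body: string) => Promise<void>;
}

// Returns which transport ended up delivering the comment.
async function postComment(body: string, t: CommentTransports): Promise<"rest" | "gh"> {
  if (t.token) {
    try {
      await t.viaRest(t.token, body);
      return "rest";
    } catch {
      // Assumed behavior: a failed REST call falls through to the gh CLI.
    }
  }
  await t.viaGh(body);
  return "gh";
}
```

Injecting the transports keeps the fallback decision testable without network access or an installed `gh` binary.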

Observability: DAG runs append structured JSONL lifecycle logs; runner wires logging at task/dag boundaries.

Reviewed by Cursor Bugbot for commit e83fc6d.

…vable

- Extract skill scripts from zip into repo source tree
- Make package installable (drop private, add files field, prepublishOnly)
- Add build:skill to auto-rebuild quorum.skill from source
- Replace gh CLI with direct GitHub REST API (GITHUB_TOKEN), with gh fallback
- Add AnthropicAdapter for multi-provider support (--provider anthropic)
- Add retry with exponential backoff to all adapters
- Add configurable model names via QUORUM_MODEL_HIGH/MED/LOW env vars
- Add  for programmatic Phase 1 without AI clustering
- Add  for end-to-end synthesis + exploration
- Add  for prerequisite validation
- Add  for reviewer precision analysis from run logs
- Add structured run logging (run.log.jsonl per run)
- Add .github/workflows/ci.yml for CI on push/PR

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
cursor[bot]

This comment was marked as resolved.

- eval: scan .quorum/runs/ recursively for run.log.jsonl instead of
  assuming flat .quorum/log/ directory, so normal exploration run logs
  are discovered without manual copying
- setup: treat GITHUB_TOKEN as optional (ok: true) since upsert and
  synthesis gracefully fall back to gh CLI
- models: apply DAG overrides before env vars so QUORUM_MODEL_* env
  vars always take precedence over saved dag.json models
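The precedence described in the models fix (provider defaults, then saved DAG models, then `QUORUM_MODEL_*` env vars) can be sketched as a layered merge. The function name and parameter shapes are assumptions; the env var names and the `Complexity` levels come from this PR.

```typescript
type Complexity = "HIGH" | "MED" | "LOW";
type ModelMap = Record<Complexity, string>;

// Later layers win: defaults < dag.json overrides < QUORUM_MODEL_* env vars.
function resolveModels(
  defaults: ModelMap,
  dagModels: Partial<ModelMap>,
  env: Record<string, string | undefined>,
): ModelMap {
  const resolved: ModelMap = { ...defaults, ...dagModels };
  for (const level of ["HIGH", "MED", "LOW"] as const) {
    const fromEnv = env[`QUORUM_MODEL_${level}`];
    // Empty or unset env vars leave the earlier layers untouched.
    if (fromEnv) resolved[level] = fromEnv;
  }
  return resolved;
}
```

Applying the env layer last is what lets a one-off `QUORUM_MODEL_MED=...` run override a model that was saved into `dag.json` earlier.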

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
cursor[bot]

This comment was marked as resolved.

- runner: add missing logTaskEvent call for SKIPPED tasks so
  run.log.jsonl and quorum eval capture skipped tasks
- models: add provider-aware default model maps (ANTHROPIC_DEFAULT_MODELS
  vs CURSOR_DEFAULT_MODELS) so --provider anthropic uses valid
  Anthropic model IDs instead of Cursor-specific identifiers
- eval: distinguish clean finishes from degraded ones with parseError;
  add parseError to log output so eval reports accurate success rates

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
cursor[bot]

This comment was marked as resolved.

- Extract shared retry infrastructure into src/retry.ts; both adapters
  now delegate to withRetry() instead of duplicating retry loops
- synthesizeCommand respects --no-post flag; triage-pr --plan-only
  suppresses synthesis posting to keep plan-only truly read-only
- plan-only runs pass provider to initialRunState() so state.json and
  canvas artifacts show correct model IDs for --provider anthropic
- loadRunLogs scans only the explicit logDir when provided, falling
  back to .quorum/runs/ recursively only when no directory is given;
  fix misleading eval "No run logs found" message
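The directory rule in the last bullet (explicit `logDir` means a flat scan, no `logDir` means a recursive scan of `.quorum/runs/`) might look roughly like this. The real `loadRunLogs` is not shown, so `findRunLogs`, its `root` parameter, and the choice to accept any `*.jsonl` file in a flat directory are assumptions.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Returns log file paths; parsing the JSONL content is left out of this sketch.
function findRunLogs(logDir?: string, root: string = ".quorum/runs"): string[] {
  if (logDir !== undefined) {
    // Explicit directory: look only at its direct children, never recurse.
    const dir: string = logDir;
    if (!fs.existsSync(dir)) return [];
    return fs
      .readdirSync(dir)
      .filter((name) => name.endsWith(".jsonl"))
      .map((name) => path.join(dir, name))
      .sort();
  }
  // No directory given: walk the runs tree and collect every run.log.jsonl.
  if (!fs.existsSync(root)) return [];
  const found: string[] = [];
  const walk = (current: string): void => {
    for (const entry of fs.readdirSync(current, { withFileTypes: true })) {
      const full = path.join(current, entry.name);
      if (entry.isDirectory()) walk(full);
      else if (entry.name === "run.log.jsonl") found.push(full);
    }
  };
  walk(root);
  return found.sort();
}
```

Returning an empty list for a missing directory lets the caller report which directory it looked in instead of failing with a file-system error.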

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
cursor[bot]

This comment was marked as resolved.

devin-ai-integration[bot]

This comment was marked as resolved.

@Wenjix

Wenjix commented Jun 13, 2026

Owner Author

🤝 Quorum — review synthesis

12 findings from bugbot, devin → 12 distinct issues. Sorted by reviewer quorum, then severity.

◽ 1/2 reviewer

  • [minor] <!-- devin-review-comment {"id": "BUG_pr-review-job-acbbd5d892a8431593540d535f51 — src/adapters/anthropic.ts:L85 · other · devin-1
  • [minor] Plan-only uses wrong models — src/cli.ts:L159 · other · bugbot-1
  • [minor] Synthesize ignores no-post flag — src/cli.ts:L410 · other · bugbot-2
  • [minor] Plan-only triage hits GitHub — src/cli.ts:L410 · other · bugbot-3
  • [minor] Setup fails without GitHub token — src/cli.ts:L538 · other · bugbot-4
  • [minor] <!-- devin-review-comment {"id": "BUG_pr-review-job-acbbd5d892a8431593540d535f51 — src/cli.ts:L583-L584 · other · devin-2
  • [minor] Eval scans wrong log directory — src/cli.ts:L584 · other · bugbot-5
  • [minor] Eval counts parse failures success — src/cli.ts:L591 · other · bugbot-6
  • [minor] Anthropic gets Cursor model IDs — src/dag.ts:L21 · other · bugbot-7
  • [minor] Model env vars overridden by DAG — src/dag.ts:L54 · other · bugbot-8
  • [minor] Eval reads wrong log directory — src/logging.ts:L145 · other · bugbot-9
  • [minor] Skipped tasks omit run logs — src/runner.ts:L42 · other · bugbot-10
clusters.scored.json (machine-readable, for agents)
{
  "generated_at": "2026-06-13T16:53:25+00:00",
  "totals": {
    "findings": 12,
    "clusters": 12,
    "reviewer_denominator": 2,
    "gate_split": []
  },
  "clusters": [
    {
      "cluster_id": "devin-1-solo",
      "member_ids": [
        "devin-1"
      ],
      "canonical_title": "<!-- devin-review-comment {\"id\": \"BUG_pr-review-job-acbbd5d892a8431593540d535f51",
      "canonical_description": "<!-- devin-review-comment {\"id\": \"BUG_pr-review-job-acbbd5d892a8431593540d535f515e97_0001\", \"file_path\": \"src/adapters/anthropic.ts\", \"start_line\": 85, \"end_line\": 85, \"side\": \"RIGHT\"} -->",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/adapters/anthropic.ts",
        "start_line": 85,
        "end_line": 85
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "devin"
      ],
      "members": [
        {
          "id": "devin-1",
          "reviewer": "devin",
          "file": "src/adapters/anthropic.ts",
          "lines": [
            85,
            85
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3408278181",
          "comment_id": 3408278181,
          "node_id": "PRRC_kwDOS4LqW87LJjKl",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "bugbot-1-solo",
      "member_ids": [
        "bugbot-1"
      ],
      "canonical_title": "Plan-only uses wrong models",
      "canonical_description": "Plan-only uses wrong models",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/cli.ts",
        "start_line": 159,
        "end_line": 159
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-1",
          "reviewer": "bugbot",
          "file": "src/cli.ts",
          "lines": [
            159,
            159
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3408252470",
          "comment_id": 3408252470,
          "node_id": "PRRC_kwDOS4LqW87LJc42",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "bugbot-2-solo",
      "member_ids": [
        "bugbot-2"
      ],
      "canonical_title": "Synthesize ignores no-post flag",
      "canonical_description": "Synthesize ignores no-post flag",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/cli.ts",
        "start_line": 410,
        "end_line": 410
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-2",
          "reviewer": "bugbot",
          "file": "src/cli.ts",
          "lines": [
            410,
            410
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3408252466",
          "comment_id": 3408252466,
          "node_id": "PRRC_kwDOS4LqW87LJc4y",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "bugbot-3-solo",
      "member_ids": [
        "bugbot-3"
      ],
      "canonical_title": "Plan-only triage hits GitHub",
      "canonical_description": "Plan-only triage hits GitHub",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/cli.ts",
        "start_line": 410,
        "end_line": 410
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-3",
          "reviewer": "bugbot",
          "file": "src/cli.ts",
          "lines": [
            410,
            410
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3408252472",
          "comment_id": 3408252472,
          "node_id": "PRRC_kwDOS4LqW87LJc44",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "bugbot-4-solo",
      "member_ids": [
        "bugbot-4"
      ],
      "canonical_title": "Setup fails without GitHub token",
      "canonical_description": "Setup fails without GitHub token",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/cli.ts",
        "start_line": 538,
        "end_line": 538
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-4",
          "reviewer": "bugbot",
          "file": "src/cli.ts",
          "lines": [
            538,
            538
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3406732544",
          "comment_id": 3406732544,
          "node_id": "PRRC_kwDOS4LqW87LDp0A",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "devin-2-solo",
      "member_ids": [
        "devin-2"
      ],
      "canonical_title": "<!-- devin-review-comment {\"id\": \"BUG_pr-review-job-acbbd5d892a8431593540d535f51",
      "canonical_description": "<!-- devin-review-comment {\"id\": \"BUG_pr-review-job-acbbd5d892a8431593540d535f515e97_0002\", \"file_path\": \"src/cli.ts\", \"start_line\": 583, \"end_line\": 584, \"side\": \"RIGHT\"} -->",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/cli.ts",
        "start_line": 583,
        "end_line": 584
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "devin"
      ],
      "members": [
        {
          "id": "devin-2",
          "reviewer": "devin",
          "file": "src/cli.ts",
          "lines": [
            583,
            584
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3408278205",
          "comment_id": 3408278205,
          "node_id": "PRRC_kwDOS4LqW87LJjK9",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "bugbot-5-solo",
      "member_ids": [
        "bugbot-5"
      ],
      "canonical_title": "Eval scans wrong log directory",
      "canonical_description": "Eval scans wrong log directory",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/cli.ts",
        "start_line": 584,
        "end_line": 584
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-5",
          "reviewer": "bugbot",
          "file": "src/cli.ts",
          "lines": [
            584,
            584
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3408273252",
          "comment_id": 3408273252,
          "node_id": "PRRC_kwDOS4LqW87LJh9k",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "bugbot-6-solo",
      "member_ids": [
        "bugbot-6"
      ],
      "canonical_title": "Eval counts parse failures success",
      "canonical_description": "Eval counts parse failures success",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/cli.ts",
        "start_line": 591,
        "end_line": 591
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-6",
          "reviewer": "bugbot",
          "file": "src/cli.ts",
          "lines": [
            591,
            591
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3407408623",
          "comment_id": 3407408623,
          "node_id": "PRRC_kwDOS4LqW87LGO3v",
          "outdated": true
        }
      ]
    },
    {
      "cluster_id": "bugbot-7-solo",
      "member_ids": [
        "bugbot-7"
      ],
      "canonical_title": "Anthropic gets Cursor model IDs",
      "canonical_description": "Anthropic gets Cursor model IDs",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/dag.ts",
        "start_line": 21,
        "end_line": 21
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-7",
          "reviewer": "bugbot",
          "file": "src/dag.ts",
          "lines": [
            21,
            21
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3407408621",
          "comment_id": 3407408621,
          "node_id": "PRRC_kwDOS4LqW87LGO3t",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "bugbot-8-solo",
      "member_ids": [
        "bugbot-8"
      ],
      "canonical_title": "Model env vars overridden by DAG",
      "canonical_description": "Model env vars overridden by DAG",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/dag.ts",
        "start_line": 54,
        "end_line": 54
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-8",
          "reviewer": "bugbot",
          "file": "src/dag.ts",
          "lines": [
            54,
            54
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3406732549",
          "comment_id": 3406732549,
          "node_id": "PRRC_kwDOS4LqW87LDp0F",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "bugbot-9-solo",
      "member_ids": [
        "bugbot-9"
      ],
      "canonical_title": "Eval reads wrong log directory",
      "canonical_description": "Eval reads wrong log directory",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/logging.ts",
        "start_line": 145,
        "end_line": 145
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-9",
          "reviewer": "bugbot",
          "file": "src/logging.ts",
          "lines": [
            145,
            145
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3406732539",
          "comment_id": 3406732539,
          "node_id": "PRRC_kwDOS4LqW87LDpz7",
          "outdated": false
        }
      ]
    },
    {
      "cluster_id": "bugbot-10-solo",
      "member_ids": [
        "bugbot-10"
      ],
      "canonical_title": "Skipped tasks omit run logs",
      "canonical_description": "Skipped tasks omit run logs",
      "category": "other",
      "severity": "minor",
      "primary_location": {
        "file": "src/runner.ts",
        "start_line": 42,
        "end_line": 42
      },
      "match_type": "singleton",
      "match_confidence": 1,
      "cross_file": false,
      "quorum": 1,
      "reviewers": [
        "bugbot"
      ],
      "members": [
        {
          "id": "bugbot-10",
          "reviewer": "bugbot",
          "file": "src/runner.ts",
          "lines": [
            42,
            42
          ],
          "url": "https://github.com/Wenjix/quorum/pull/1#discussion_r3407408620",
          "comment_id": 3407408620,
          "node_id": "PRRC_kwDOS4LqW87LGO3s",
          "outdated": false
        }
      ]
    }
  ]
}
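The ordering used in the synthesis comment ("sorted by reviewer quorum, then severity") can be sketched over the fields of this JSON document. Only `"minor"` appears as a severity in this PR, so the other severity names and their ranking below are assumptions, as is the function name.

```typescript
// "critical" and "major" are hypothetical levels; only "minor" appears in the data above.
type Severity = "critical" | "major" | "minor";

interface ScoredCluster {
  cluster_id: string;
  quorum: number; // how many reviewers reported the issue
  severity: Severity;
}

const SEVERITY_RANK: Record<Severity, number> = { critical: 0, major: 1, minor: 2 };

// Highest quorum first; ties broken by most severe first. Input is not mutated.
function sortForSynthesis(clusters: ScoredCluster[]): ScoredCluster[] {
  return [...clusters].sort(
    (a, b) => b.quorum - a.quorum || SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity],
  );
}
```

With the all-singletons output shown here every cluster has quorum 1 and severity minor, so the sort leaves the input order unchanged.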

Quorum · PR #1 · generated 2026-06-13T16:53:25+00:00

@Wenjix

Wenjix commented Jun 13, 2026

Owner Author

Quorum Exploration

Explored 6 cluster(s) from #1.

devin-1-solo: <!-- devin-review-comment {"id": "BUG_pr-review-job-acbbd5d892a8431593540d535f51

Quorum 1/2 | severity minor | category other | src/adapters/anthropic.ts:L85

Warnings:

  • pattern-sweep:devin-1-solo ended with status ERROR: [validation_error] Upgrade to Ultra for more Cloud Agents: You've reached the limit for your current plan. Upgrade to Ultra to run more Cloud Agents simultaneously.

Root Cause

Summary: A refactor left AnthropicAdapter with dead parse-based status logic, so parse outcome cannot influence TaskExecutionResult.status.
Mechanism: In AnthropicAdapter, extractMarkedJson(responseText) is called, but line 85 sets status to "finished" in both ternary branches; therefore parse success/failure has zero effect. Downstream, runner logic treats result.status as the authoritative task success signal and separately records parse failures as parseError, so the adapter-side parse check is effectively inert and contract-confusing.
Missing invariant: There should be a single explicit contract for status mapping: either adapters must convert unparseable model output into status: "error", or adapters must never parse and always leave parse validation to runner; mixing both without a strict rule creates dead code and ambiguous semantics.
Confidence: high
Evidence:

  • file: src/adapters/anthropic.ts | lines: L84-L85 | note: parsed is computed, but ternary returns "finished" on both branches, so condition is dead.
  • file: src/runner.ts | lines: L144-L155 | note: Runner maps non-finished statuses to task ERROR, and separately parses resultText to set parseError, showing status and parse are distinct downstream concerns.
  • file: src/types.ts | lines: L117-L124 | note: TaskExecutionResult.status allows finished|error|cancelled, implying adapters are expected to provide meaningful status distinctions.

Follow-ups:

  • Should invalid/missing QUORUM JSON be treated as hard task failure (error) or as degraded success (FINISHED with parseError) across all adapters?
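One way to read the "single explicit contract" suggested above is that the adapter only reports transport-level status and the runner alone decides what a parse failure means. The sketch below takes the "degraded success" option. `applyResult` is named in a later commit of this PR, but its signature, the `TaskOutcome` type, and the `<<<JSON ... JSON>>>` markers are hypothetical stand-ins for the real marked-JSON format.

```typescript
// Status values as listed in the evidence for src/types.ts.
interface TaskExecutionResult {
  status: "finished" | "error" | "cancelled";
  resultText: string;
}

// Hypothetical runner-side outcome.
interface TaskOutcome {
  status: "FINISHED" | "ERROR";
  parseError?: string;
}

// Hypothetical markers; the real extractMarkedJson format is not shown in the PR.
const MARKED_JSON = /<<<JSON([\s\S]*?)JSON>>>/;

// The runner owns parse validation: a finished task with unparseable output
// stays FINISHED but carries a parseError.
function applyResult(result: TaskExecutionResult): TaskOutcome {
  if (result.status !== "finished") return { status: "ERROR" };
  const match = MARKED_JSON.exec(result.resultText);
  if (!match) return { status: "FINISHED", parseError: "no marked JSON block found" };
  try {
    JSON.parse(match[1] ?? "");
    return { status: "FINISHED" };
  } catch (err) {
    return { status: "FINISHED", parseError: String(err) };
  }
}
```

Under this reading the adapter never inspects the response body, so a ternary on the parse result has nothing left to decide.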

Pattern Sweep

No structured pattern-sweep result.

bugbot-1-solo: Plan-only uses wrong models

Quorum 1/2 | severity minor | category other | src/cli.ts:L159

Plan-only uses wrong models

Warnings:

  • root-cause:bugbot-1-solo ended with status ERROR: [validation_error] Upgrade to Ultra for more Cloud Agents: You've reached the limit for your current plan. Upgrade to Ultra to run more Cloud Agents simultaneously.
  • pattern-sweep:bugbot-1-solo ended with status SKIPPED: Skipped because upstream task(s) failed: root-cause:bugbot-1-solo

Root Cause

No structured root-cause result.

Pattern Sweep

No structured pattern-sweep result.

bugbot-2-solo: Synthesize ignores no-post flag

Quorum 1/2 | severity minor | category other | src/cli.ts:L410

Synthesize ignores no-post flag

Warnings:

  • root-cause:bugbot-2-solo ended with status ERROR: [validation_error] Upgrade to Ultra for more Cloud Agents: You've reached the limit for your current plan. Upgrade to Ultra to run more Cloud Agents simultaneously.
  • pattern-sweep:bugbot-2-solo ended with status SKIPPED: Skipped because upstream task(s) failed: root-cause:bugbot-2-solo

Root Cause

No structured root-cause result.

Pattern Sweep

No structured pattern-sweep result.

bugbot-3-solo: Plan-only triage hits GitHub

Quorum 1/2 | severity minor | category other | src/cli.ts:L410

Plan-only triage hits GitHub

Warnings:

  • root-cause:bugbot-3-solo ended with status ERROR: [validation_error] Upgrade to Ultra for more Cloud Agents: You've reached the limit for your current plan. Upgrade to Ultra to run more Cloud Agents simultaneously.
  • pattern-sweep:bugbot-3-solo ended with status SKIPPED: Skipped because upstream task(s) failed: root-cause:bugbot-3-solo

Root Cause

No structured root-cause result.

Pattern Sweep

No structured pattern-sweep result.

bugbot-4-solo: Setup fails without GitHub token

Quorum 1/2 | severity minor | category other | src/cli.ts:L538

Setup fails without GitHub token

Warnings:

  • root-cause:bugbot-4-solo ended with status ERROR: [validation_error] Upgrade to Ultra for more Cloud Agents: You've reached the limit for your current plan. Upgrade to Ultra to run more Cloud Agents simultaneously.
  • pattern-sweep:bugbot-4-solo ended with status SKIPPED: Skipped because upstream task(s) failed: root-cause:bugbot-4-solo

Root Cause

No structured root-cause result.

Pattern Sweep

No structured pattern-sweep result.

devin-2-solo: <!-- devin-review-comment {"id": "BUG_pr-review-job-acbbd5d892a8431593540d535f51

Quorum 1/2 | severity minor | category other | src/cli.ts:L583-L584

Warnings:

  • root-cause:devin-2-solo ended with status ERROR: [validation_error] Upgrade to Ultra for more Cloud Agents: You've reached the limit for your current plan. Upgrade to Ultra to run more Cloud Agents simultaneously.
  • pattern-sweep:devin-2-solo ended with status SKIPPED: Skipped because upstream task(s) failed: root-cause:devin-2-solo

Root Cause

No structured root-cause result.

Pattern Sweep

No structured pattern-sweep result.

Task Runs

  • root-cause:devin-1-solo: FINISHED agent bc-8ead65cc-24d1-4d3d-887e-0c3ff5d9c0aa run run-d82338a9-5595-447f-a5e6-a8eb1c7cb97e
  • pattern-sweep:devin-1-solo: ERROR
  • root-cause:bugbot-1-solo: ERROR
  • pattern-sweep:bugbot-1-solo: SKIPPED
  • root-cause:bugbot-2-solo: ERROR
  • pattern-sweep:bugbot-2-solo: SKIPPED
  • root-cause:bugbot-3-solo: ERROR
  • pattern-sweep:bugbot-3-solo: SKIPPED
  • root-cause:bugbot-4-solo: ERROR
  • pattern-sweep:bugbot-4-solo: SKIPPED
  • root-cause:devin-2-solo: ERROR
  • pattern-sweep:devin-2-solo: SKIPPED

Quorum exploration generated 2026-06-13T17:51:35.920Z

Quorum exploration comment for #1

Wenjix added 2 commits June 13, 2026 11:54
- dag.ts: replace invalid/deprecated Anthropic default models
  (nonexistent claude-haiku-4-20250514 and soon-retired
  claude-sonnet-4-20250514) with claude-opus-4-8 / claude-sonnet-4-6 /
  claude-haiku-4-5; switch Cursor MED/LOW defaults to composer-2.5
- cli.ts: default --concurrency to 1 for the cursor provider so runs
  don't trip Cursor's simultaneous Cloud Agent plan limit (4 for
  anthropic); an explicit --concurrency still overrides
- cursor-cloud-adapter.ts: translate the opaque "Upgrade to Ultra"
  plan-limit error into an actionable message pointing at --concurrency
- cli.ts: eval scans .quorum/runs/ recursively when --log-dir is omitted
  (the prior default defeated the recursive scan) and reports the
  correct directory when no logs are found
- anthropic.ts: drop dead parse-status ternary and now-unused import;
  parse validation is owned by the runner (applyResult)
- logging.test.ts: loadRunLogs() recurses into .quorum/runs/ when no
  logDir is given (the path the eval default-arg bug broke), returns
  empty when the dir is absent, and scans only the flat directory when
  one is passed explicitly
- cursor-cloud-adapter.test.ts: describeCursorError rewrites the
  simultaneous-agent plan-limit error into actionable guidance, passes
  unrelated errors through unchanged, and wraps non-Error values
- export describeCursorError so it can be unit-tested
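The three behaviors the test bullet lists for `describeCursorError` (rewrite the plan-limit error, pass unrelated errors through, wrap non-Error values) can be sketched as follows. The signature, the matching patterns, and the wording of the rewritten message are assumptions; only the function name and the original error text come from this PR.

```typescript
// Sketch: returns an Error in every case so callers can always read .message.
function describeCursorError(err: unknown): Error {
  // Non-Error values (strings, objects) are wrapped so they can be reported uniformly.
  if (!(err instanceof Error)) return new Error(String(err));

  // The plan-limit error as quoted in the exploration warnings above.
  const isPlanLimit = /Upgrade to Ultra/i.test(err.message) && /Cloud Agents/i.test(err.message);
  if (isPlanLimit) {
    return new Error(
      "Cursor plan limit reached for simultaneous Cloud Agents. " +
        "Retry with a lower --concurrency (for example --concurrency 1). " +
        `Original error: ${err.message}`,
    );
  }

  // Anything else passes through unchanged.
  return err;
}
```

Keeping the original text in the rewritten message preserves the upstream wording for anyone searching logs for it.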
Comment thread src/cli.ts
scoredDoc,
parsed,
planOnly: hasFlag(parsed, "plan-only"),
post: !hasFlag(parsed, "no-post") && !hasFlag(parsed, "dry-run"),

Plan-only triage still posts

Medium Severity

quorum triage-pr --plan-only skips synthesis posting via --no-post, but phase 2 still sets post true unless --no-post or --dry-run is passed. executeExplore then upserts the exploration PR comment even though no cloud agents ran, unlike quorum explore --plan-only.
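One possible shape for a fix is to derive `post` from all three flags in a single place. This is a sketch, not the change that landed; `shouldPost` and the set-of-flags input are hypothetical, while the flag names come from the snippet above.

```typescript
// Posting is allowed only when none of the read-only style flags are present.
function shouldPost(flags: ReadonlySet<string>): boolean {
  return !flags.has("plan-only") && !flags.has("no-post") && !flags.has("dry-run");
}

// Usage: both phases of triage-pr would consult the same helper.
const planOnlyRun = new Set(["plan-only"]);
console.log(shouldPost(planOnlyRun)); // false: plan-only stays read-only
```

Routing both the synthesis phase and the exploration phase through one predicate avoids the two phases disagreeing about what `--plan-only` implies.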

Reviewed by Cursor Bugbot for commit 04f3ac3.

Drop the cursor-specific serial default; both providers now default to 4
simultaneous tasks. If a Cursor plan caps concurrent agents, the launch
still fails with the actionable describeCursorError message, and
--concurrency 1 remains available as an override.

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 5 potential issues.

There are 6 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit f3dc0ad.

Comment thread src/retry.ts Outdated
Comment thread src/adapters/anthropic.ts
max_tokens: 4096,
system: systemPrompt,
messages: [{ role: "user", content: input.prompt }],
}),

Anthropic path lacks repository access

High Severity

With --provider anthropic, exploration tasks still use prompts that ask for evidence in the repository, but AnthropicAdapter only sends the stitched text prompt to the Messages API and never uses repoUrl or prUrl. Unlike Cursor Cloud, there is no repo checkout, so investigations cannot reliably inspect the PR code.

Reviewed by Cursor Bugbot for commit f3dc0ad.

Comment thread src/adapters/anthropic.ts
Comment thread src/cursor-cloud-adapter.ts Outdated
Comment thread src/cli.ts Outdated
devin-ai-integration[bot]

This comment was marked as resolved.

…ror text

Addresses Cursor Bugbot + Devin findings on f3dc0ad:
- paths.ts: resolve scripts/ via fileURLToPath (extracted from cli.ts) so
  Windows drive letters and percent-encoded paths (e.g. spaces) resolve
- retry.ts: word-boundary 5xx match so "5000" no longer triggers retries;
  backoff sleep is abort-aware and withRetry re-checks the signal after waking
- adapters/anthropic.ts: guard body.content with Array.isArray before iterating
- cursor-cloud-adapter.ts: drop stale "default is 1" from the plan-limit hint
- validate_partition.py: split-out singletons inherit the parent cluster's
  severity instead of hardcoding "minor"
- tests: paths + retry regression suites; assert no hardcoded default in the
  Cursor adapter error message
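The `paths.ts` bullet is easiest to see side by side: reading `.pathname` off a `URL` leaves percent-encoding in place, while Node's `fileURLToPath` decodes it and handles platform path rules. The helper names and the `../scripts/` layout relative to the module are assumptions for illustration.

```typescript
import { fileURLToPath } from "node:url";

// Naive approach: .pathname keeps "%20" and, on Windows, a leading slash before the drive letter.
function naiveScriptsDir(moduleUrl: string): string {
  return new URL("../scripts/", moduleUrl).pathname;
}

// fileURLToPath decodes percent-encoding and produces a native file-system path.
function scriptsDir(moduleUrl: string): string {
  return fileURLToPath(new URL("../scripts/", moduleUrl));
}

// In an ES module the caller would typically pass import.meta.url here.
const exampleUrl = "file:///opt/my%20tools/quorum/dist/cli.js";
console.log(naiveScriptsDir(exampleUrl)); // still contains "%20"
console.log(scriptsDir(exampleUrl)); // contains a real space on POSIX systems
```

Taking the module URL as a parameter keeps the helper easy to unit-test with synthetic URLs.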
devin-ai-integration[bot]

This comment was marked as resolved.

@cursor

cursor Bot commented Jun 13, 2026

Bugbot couldn't run - usage limit reached

Bugbot is counted against Cursor usage for this user or team, and this run hit a usage or spend limit.

A user or team admin can review and increase usage limits in the Cursor dashboard.

(requestId: serverGenReqId_8c839896-9c51-4808-82a4-bcf62b431140)

Wenjix added 2 commits June 13, 2026 15:36
- Fold 429 into the word-boundary status-code regex so number tokens like
  "4290" aren't read as a 429 (same class as the earlier 5xx/"5000" fix).
- Match "rate" only within "rate limit" (rate limit / rate_limit / ratelimit)
  so it no longer fires on words like "generate"/"moderate". A bare \brate\b
  would regress rate_limit_error (underscore is a word char), so match the phrase.
- Add a regression test for rate-limit phrasings vs lookalikes.
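The matching rules described in these bullets can be sketched with two regular expressions. The exact patterns in `src/retry.ts` are not shown, so the ones below are an approximation built from the commit text, and `isTransientMessage` is a hypothetical name.

```typescript
// Word boundaries keep "5000" and "4290" from being read as 5xx or 429.
const RETRYABLE_STATUS = /\b(?:429|5\d\d)\b/;

// Match the phrase, not the bare word: "rate limit", "rate_limit", "ratelimit".
// A bare \brate\b would miss "rate_limit_error" because "_" is a word character.
const RATE_LIMIT_PHRASE = /rate[ _]?limit/i;

function isTransientMessage(message: string): boolean {
  return RETRYABLE_STATUS.test(message) || RATE_LIMIT_PHRASE.test(message);
}

// Lookalikes from the commit message should not trigger a retry.
console.log(isTransientMessage("failed to generate 4290 tokens")); // false
console.log(isTransientMessage("rate_limit_error")); // true
```

String matching on error messages is inherently heuristic; where a numeric status code is available on the error object, checking that directly would be more robust.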
tsc does not prune outputs for sources removed by a branch switch, so a stale
dist/test/*.js can run against the wrong sources. A prebuild step that clears
dist/ keeps npm test honest locally and in CI.
@Wenjix Wenjix merged commit 21a6dd9 into main Jun 13, 2026
1 of 3 checks passed

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 2 new potential issues.

Comment thread src/adapters/anthropic.ts
signal: input.signal,
body: JSON.stringify({
model: input.model,
max_tokens: 4096,

🚩 Hardcoded max_tokens: 4096 in Anthropic adapter

The Anthropic adapter hardcodes max_tokens: 4096 at src/adapters/anthropic.ts:49. While the Anthropic API requires this parameter, 4096 tokens may be insufficient for complex exploration tasks that need a detailed human-readable explanation plus a fenced JSON block. Truncated responses are handled gracefully (the runner records a parseError via extractMarkedJson), and the adapter logs a warning at line 76-78. However, unlike the model ID (configurable via env vars), max_tokens has no override mechanism — users hitting truncation limits would need to modify the source.
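For illustration only, an override could follow the same pattern as the model env vars. `QUORUM_MAX_TOKENS` does not exist in this PR; it and `resolveMaxTokens` are hypothetical names used to show the shape of such a mechanism.

```typescript
const DEFAULT_MAX_TOKENS = 4096; // the value currently hardcoded in the adapter

// Reads a hypothetical QUORUM_MAX_TOKENS variable; invalid or non-positive
// values fall back to the default rather than failing the run.
function resolveMaxTokens(env: Record<string, string | undefined>): number {
  const raw = env["QUORUM_MAX_TOKENS"];
  if (raw === undefined) return DEFAULT_MAX_TOKENS;
  const parsed = Number.parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_TOKENS;
}
```

The adapter would then send `max_tokens: resolveMaxTokens(process.env)` in the request body; the upper bound a given model accepts is not validated here.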

Comment thread src/dag.ts
Comment on lines +5 to +9
 export const CURSOR_DEFAULT_MODELS: Record<Complexity, string> = {
   HIGH: "gpt-5.3-codex",
-  MED: "composer-2",
-  LOW: "auto-low",
+  MED: "composer-2.5",
+  LOW: "composer-2.5",
 };

🚩 Default model changes: LOW complexity moved from auto-low to composer-2.5

The Cursor default models changed: MED from composer-2 to composer-2.5, and LOW from auto-low to composer-2.5 (src/dag.ts:5-8). This means LOW and MED now use the same model. Users relying on the old auto-low behavior for cost optimization on simple tasks will see different model selection unless they set QUORUM_MODEL_LOW env var. This is a behavioral change that affects existing users who don't override models.

@cursor

cursor Bot commented Jun 13, 2026

Bugbot couldn't run - usage limit reached

Bugbot is counted against Cursor usage for this user or team, and this run hit a usage or spend limit.

A user or team admin can review and increase usage limits in the Cursor dashboard.

(requestId: serverGenReqId_0162c318-992b-44db-a944-fe6febdfc2c2)

@Wenjix Wenjix deleted the production-polish branch June 16, 2026 20:32