[Platform] CI hygiene: Makefile, pre-commit, feedback bot, Definition of Done by mariagorskikh · Pull Request #12 · projnanda/nandatown

mariagorskikh · 2026-05-26T19:57:18Z

What this PR does

Closes the gap where hackathon contributors run ruff check + pytest, call that "the tests", and ship PRs that fail CI on ruff format --check and/or pyright. Five-file change, all additive — no participant code touched, no behavior change to the existing CI workflow.

Files

Makefile (new)
- make ci-local runs the exact CI command sequence in order: uv sync && uv run ruff check . && uv run ruff format --check . && uv run pyright && uv run pytest -v. One target, hard-fails on the first red command (Make's default behavior; no set -e gymnastics needed because each command is its own recipe line).
- make hooks installs pre-commit hooks via uv run --with pre-commit pre-commit install (no separate global install required).
- make help is the default goal — running bare make prints the menu.
.pre-commit-config.yaml (new)
- astral-sh/ruff-pre-commit@v0.15.14: ruff (with --fix) and ruff-format. Versions pinned to what uv sync resolves on main today.
- RobertCraigie/pyright-python@v1.1.409: invoked with pass_filenames: false so it runs the whole workspace (strict-mode type errors cross file boundaries; per-file is unreliable). Strict config picked up from [tool.pyright] in pyproject.toml.
- default_stages: [pre-commit] — hooks only fire on local commits. Auto-fix happens locally; CI never runs pre-commit (it runs the real ruff/pyright/pytest commands directly via ci.yml), so there is no path by which this config can mutate files in CI.
CONTRIBUTING.md: new "Definition of Done" section pinned to the very top, above all existing content. Lists, in a single code block, the five exact commands every contributor MUST run before pushing, with a one-line "why" for each (especially calling out the ruff format --check and pyright traps that bit 5 of the 10 hackathon agents).
.github/workflows/ci-feedback.yml (new)
- Triggers on workflow_run: completed of the existing CI workflow, gated to conclusion == 'failure' AND event == 'pull_request'.
- Downloads the failing run's log zip via gh api .../actions/runs/{id}/logs, unzips, and runs an embedded Python extractor that produces ≤40-line excerpts per failing check (ruff format diff snippet, pyright error list, pytest failure summary).
- Idempotent: paginates issues/{pr}/comments, finds any comment containing the stable HTML marker, and PATCHes it in place if found; otherwise POSTs a new one. The bot never spams: at most one comment per PR, updated on every subsequent failure.
- Permissions scoped exactly as required: pull-requests: write, actions: read, contents: read. No write to contents, no checks API, no secrets beyond GITHUB_TOKEN.
- Comment body always ends with a link to CONTRIBUTING.md#definition-of-done plus the make ci-local one-liner.
README.md — small "Before you push" callout below the hello-world block linking to CONTRIBUTING.md#definition-of-done.
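The per-check excerpt step described for ci-feedback.yml could look roughly like the sketch below. It is not the PR's embedded extractor: the function names and the pyright line matcher are assumptions made for illustration, and only the 40-line cap comes from the description above.

```python
from typing import Callable

MAX_EXCERPT_LINES = 40  # cap stated in the PR description


def excerpt(
    log_text: str,
    is_relevant: Callable[[str], bool],
    max_lines: int = MAX_EXCERPT_LINES,
) -> str:
    """Keep only relevant log lines and truncate to at most max_lines."""
    kept = [line for line in log_text.splitlines() if is_relevant(line)]
    if len(kept) > max_lines:
        # Reserve the last slot for a note saying how much was dropped.
        omitted = len(kept) - (max_lines - 1)
        kept = kept[: max_lines - 1] + [f"... ({omitted} more lines omitted)"]
    return "\n".join(kept)


def looks_like_pyright_error(line: str) -> bool:
    """Hypothetical matcher: assumes errors appear as '<path>:<line>:<col> - error: ...'."""
    return " - error: " in line
```

Passing the matcher in as a parameter keeps the truncation logic identical for the ruff format, pyright, and pytest excerpts; only the predicate would change per check.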

Design decisions

Single Make target for the full sequence, not five sub-targets. The whole point is that contributors today cherry-pick the steps that match their mental model of "tests". A single target removes that footgun. There is intentionally no make lint / make test shortcut that could reintroduce it.
Pre-commit pyright runs on the whole workspace, not staged files. Strict-mode errors propagate across files; per-file checking gives false greens. Slightly slower commits, but matches what CI sees.
Feedback bot edits a single comment per PR keyed off an HTML marker, not actor == 'github-actions[bot]'. The marker is stable across renames and survives someone hand-editing the body. It is also unambiguous in body | contains(...) jq filters.
The extractor is a single embedded Python heredoc, not a separate script file. Keeps the workflow self-contained — no risk of a participant's PR breaking the bot by modifying a shared script.
CI sequence in Makefile matches ci.yml line-for-line. When ci.yml changes, both files MUST be updated together; this is intentional friction so the local and remote sequences cannot drift silently.
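The marker-keyed comment update from the design decisions above can be sketched as a pure decision function. This is a minimal illustration, not the workflow's actual code: the marker string, the Comment type, and plan_upsert are all hypothetical, since the PR's real marker text is not shown in this description.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical marker text; the PR's actual marker string is not shown here.
MARKER = "<!-- example-ci-feedback-marker -->"


@dataclass
class Comment:
    id: int
    body: str


def plan_upsert(
    comments: list[Comment], new_body: str
) -> tuple[str, Optional[int], str]:
    """Return (HTTP method, comment id or None, body) for an idempotent comment."""
    # Make sure the body carries the marker so the next run can find it again.
    body = new_body if MARKER in new_body else f"{MARKER}\n{new_body}"
    for comment in comments:
        if MARKER in comment.body:
            # An earlier bot comment exists: edit it in place.
            return ("PATCH", comment.id, body)
    # No marked comment yet: create the first (and only) one.
    return ("POST", None, body)
```

Because the lookup depends only on the comment body, it does not matter which account posted the earlier comment, which is the property the marker-over-actor decision relies on.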

How to test the feedback bot

Once this PR is merged, open a follow-up test PR that deliberately violates one (or all) of the three categories — for example:

git checkout -b test/ci-feedback-bot
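# Introduce a format-only violation (no leading indent, so the file still parses):
python -c "open('packages/nest-core/nest_core/__init__.py','a').write('\n\n\n\nbadly_formatted   = 1\n')"
# If the pre-commit hooks are installed, add --no-verify so ruff-format does not block the commit.
git commit -am "test: trigger ci-feedback bot with a format violation"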
git push -u origin test/ci-feedback-bot
# Open PR; CI will fail on `ruff format --check`; ci-feedback workflow should
# post one comment with the format diff excerpt and the reproduce command.
# Then add a pyright error in a second commit and confirm the SAME comment
# is edited in place (no second comment appears).

Expected behavior:

One PR comment appears within ~30s of CI going red.
Comment contains the ruff-format diff snippet, the reproduction command, and a link to CONTRIBUTING.md#definition-of-done.
On a follow-up failing push, the comment is edited in place — no new comment is added.
On a green push, no new comment is added and the stale failure comment is left as-is (out of scope to delete; it remains as a historical record).

Verification

Ran make ci-local on this branch before pushing. All five steps exit 0:

>>> [1/5] uv sync                       # ok
>>> [2/5] uv run ruff check .           # All checks passed!
>>> [3/5] uv run ruff format --check .  # 94 files already formatted
>>> [4/5] uv run pyright                # 0 errors, 0 warnings, 0 informations
>>> [5/5] uv run pytest -v              # 259 passed, 1 warning in 13.97s
ci-local: all 5 checks passed. Safe to push.

Practicing what we preach.
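For readers who want the fail-fast semantics of make ci-local spelled out, here is a rough Python equivalent. It is only a sketch of the behavior, not part of the PR: the five commands and the output format follow the description and log above, while ci_local, run_step, and the injectable run parameter are assumptions added so the logic can be exercised without uv installed.

```python
import subprocess
from typing import Callable, Sequence

# The five commands listed in the PR description, in CI order.
CI_STEPS: list[list[str]] = [
    ["uv", "sync"],
    ["uv", "run", "ruff", "check", "."],
    ["uv", "run", "ruff", "format", "--check", "."],
    ["uv", "run", "pyright"],
    ["uv", "run", "pytest", "-v"],
]


def run_step(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def ci_local(
    steps: Sequence[Sequence[str]] = CI_STEPS,
    run: Callable[[Sequence[str]], int] = run_step,
) -> int:
    """Run each step in order and stop at the first non-zero exit code."""
    total = len(steps)
    for index, cmd in enumerate(steps, start=1):
        print(f">>> [{index}/{total}] {' '.join(cmd)}")
        code = run(cmd)
        if code != 0:
            return code  # later steps are never started
    print(f"ci-local: all {total} checks passed. Safe to push.")
    return 0
```

Stopping at the first failure mirrors Make aborting a recipe when one of its lines fails, so a red ruff format --check is reported before pyright or pytest even start.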

Generated by Claude Code

…n of Done

Closes the gap where contributors run "ruff check" + pytest, call that "the tests", and ship PRs that fail CI on ruff format --check and/or pyright.

- Makefile: `ci-local` runs the exact CI sequence (uv sync, ruff check, ruff format --check, pyright, pytest -v) and hard-fails on the first red command. `hooks` installs pre-commit. `help` is the default goal.
- .pre-commit-config.yaml: ruff-check + ruff-format (auto-fix locally) and pyright in strict mode (versions pinned to what `uv sync` resolves).
- CONTRIBUTING.md: Definition of Done section at the very top with the five required pre-push commands and a one-line rationale for each.
- README.md: "Before you push" callout pointing at `make ci-local` and the Definition of Done.
- .github/workflows/ci-feedback.yml: triggers on the existing CI workflow's failure, downloads logs, extracts per-check excerpts (ruff format diff, pyright errors, pytest summary; ~40 lines each), and posts/edits a single PR comment keyed off a stable HTML marker. Permissions scoped to pull-requests:write, actions:read, contents:read.

Verified locally: `make ci-local` exits 0 on this branch (5/5 green).

Integration of 5 platform tracks built in parallel by specialist agents:

- platform/ci-hygiene (PR #12): Makefile + pre-commit + idempotent CI feedback bot + CONTRIBUTING Definition of Done
- platform/open-problems (PR #13): 10 differentiated open problems across 10 layers, charter, judging doc
- platform/judge-panel (PR #14): rubric, anthropic + openai providers, run_all CLI, real-diff fixture, live gpt-5.5 scoreboard for PRs #2-#11
- platform/research-harness (PR #15): conditions matrix, claude-CLI live runner, collect + analyze, dry-run fixtures + tests
- platform/marketplace-ui (PR #16): /hackathon Next.js section with author tags, judge scores, layer browser; Python data adapter

Schema reconciled end-to-end (rubric -> scores.json -> adapter -> TS types -> UI) on the 6-dim 1-5 scale with totals in [6, 30].

Local CI: 341 passed, 1 skipped (matplotlib gated), 1 deselected (live marker).

Live judge scoreboard top:
#2 harvard-phd trust 26.0/30 (EigenTrust + checkable invariants)
#7 coinbase-crypto payments 26.0/30 (HTLC escrow)
#6 stanford-ml-phd trust 25.0/30
#11 google-staff transport 25.0/30

mariagorskikh · 2026-05-26T22:06:48Z

Superseded by #17 (now merged to main at 1771cdb). Closing — the content of this PR is part of that integration merge.

Generated by Claude Code

sourcery-ai Bot reviewed May 26, 2026

mariagorskikh mentioned this pull request May 26, 2026

[Platform] Integration v2 #17

Merged

mariagorskikh merged commit 14e59ed into main May 26, 2026
4 checks passed

mariagorskikh merged 1 commit into main from platform/ci-hygiene
