[Platform] CI hygiene: Makefile, pre-commit, feedback bot, Definition of Done #12
Merged
Conversation
…n of Done

Closes the gap where contributors run "ruff check" + pytest, call that "the tests", and ship PRs that fail CI on ruff format --check and/or pyright.

- Makefile: `ci-local` runs the exact CI sequence (uv sync, ruff check, ruff format --check, pyright, pytest -v) and hard-fails on the first red command. `hooks` installs pre-commit. `help` is the default goal.
- .pre-commit-config.yaml: ruff-check + ruff-format (auto-fix locally) and pyright in strict mode (versions pinned to what `uv sync` resolves).
- CONTRIBUTING.md: Definition of Done section at the very top with the five required pre-push commands and a one-line rationale for each.
- README.md: "Before you push" callout pointing at `make ci-local` and the Definition of Done.
- .github/workflows/ci-feedback.yml: triggers on the existing CI workflow's failure, downloads logs, extracts per-check excerpts (ruff format diff, pyright errors, pytest summary; ~40 lines each), and posts/edits a single PR comment keyed off the marker `<!-- ci-feedback-bot -->`. Permissions scoped to pull-requests:write, actions:read, contents:read.

Verified locally: `make ci-local` exits 0 on this branch (5/5 green).
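The per-check excerpt step in the last bullet can be sketched roughly as follows. This is only an illustration: the workflow's embedded extractor is not shown in the PR, so the function name `extract_excerpts` and the matching patterns are assumptions based on the usual output lines of `ruff format --check`, `pyright`, and `pytest`.

```python
import re

MAX_LINES = 40  # the PR caps each excerpt at roughly 40 lines

# Hypothetical patterns: the real extractor's matching rules are not shown.
# They are unanchored because CI log lines usually carry a timestamp prefix.
PATTERNS: dict[str, re.Pattern[str]] = {
    "ruff format": re.compile(r"Would reformat: "),
    "pyright": re.compile(r" - error: "),
    "pytest": re.compile(r"FAILED \S+::"),
}


def extract_excerpts(log_text: str, limit: int = MAX_LINES) -> dict[str, list[str]]:
    """Return up to `limit` matching lines per check; checks with no matches are omitted."""
    excerpts: dict[str, list[str]] = {}
    for line in log_text.splitlines():
        for check, pattern in PATTERNS.items():
            # setdefault only runs when the pattern matched, so clean checks stay absent.
            if pattern.search(line) and len(excerpts.setdefault(check, [])) < limit:
                excerpts[check].append(line)
    return excerpts
```

Capping each list keeps the resulting PR comment readable even when one check produces hundreds of errors.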
mariagorskikh added a commit that referenced this pull request on May 26, 2026
Integration of 5 platform tracks built in parallel by specialist agents:

- platform/ci-hygiene (PR #12): Makefile + pre-commit + idempotent CI feedback bot + CONTRIBUTING Definition of Done
- platform/open-problems (PR #13): 10 differentiated open problems across 10 layers, charter, judging doc
- platform/judge-panel (PR #14): rubric, anthropic + openai providers, run_all CLI, real-diff fixture, live gpt-5.5 scoreboard for PRs #2-#11
- platform/research-harness (PR #15): conditions matrix, claude-CLI live runner, collect + analyze, dry-run fixtures + tests
- platform/marketplace-ui (PR #16): /hackathon Next.js section with author tags, judge scores, layer browser; Python data adapter

Schema reconciled end-to-end (rubric -> scores.json -> adapter -> TS types -> UI) on the 6-dim 1-5 scale with totals in [6, 30].

Local CI: 341 passed, 1 skipped (matplotlib gated), 1 deselected (live marker).

Live judge scoreboard top:

- #2 harvard-phd trust 26.0/30 (EigenTrust + checkable invariants)
- #7 coinbase-crypto payments 26.0/30 (HTLC escrow)
- #6 stanford-ml-phd trust 25.0/30
- #11 google-staff transport 25.0/30
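The total range follows directly from the scale: six dimensions scored 1 to 5 give a minimum of 6 and a maximum of 30. A minimal sketch of that check, assuming scores arrive as a flat list of six numbers (the function name `total_score` and the list layout are hypothetical, not the actual `scores.json` schema):

```python
DIMENSIONS = 6  # the rubric has six dimensions
MIN_SCORE, MAX_SCORE = 1, 5  # each dimension is scored on a 1-5 scale


def total_score(scores: list[float]) -> float:
    """Sum six per-dimension scores after checking that each lies in [1, 5]."""
    if len(scores) != DIMENSIONS:
        raise ValueError(f"expected {DIMENSIONS} scores, got {len(scores)}")
    for score in scores:
        if not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(f"score {score} is outside [{MIN_SCORE}, {MAX_SCORE}]")
    # With every score in [1, 5], the sum is necessarily in [6, 30].
    return float(sum(scores))
```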
Superseded by #17 (now merged to main at Generated by Claude Code |
What this PR does
Closes the gap where hackathon contributors run `ruff check` + `pytest`, call that "the tests", and ship PRs that fail CI on `ruff format --check` and/or `pyright`. Five-file change, all additive: no participant code touched, no behavior change to the existing CI workflow.

Files

- `Makefile` (new)
  - `make ci-local` runs the exact CI command sequence in order: `uv sync && uv run ruff check . && uv run ruff format --check . && uv run pyright && uv run pytest -v`. One target, hard-fails on the first red command (Make's default behavior; no `set -e` gymnastics needed because each command is its own recipe line).
  - `make hooks` installs pre-commit hooks via `uv run --with pre-commit pre-commit install` (no separate global install required).
  - `make help` is the default goal: running bare `make` prints the menu.
- `.pre-commit-config.yaml` (new)
  - `astral-sh/ruff-pre-commit@v0.15.14`: `ruff` (with `--fix`) and `ruff-format`. Versions pinned to what `uv sync` resolves on `main` today.
  - `RobertCraigie/pyright-python@v1.1.409`: invoked with `pass_filenames: false` so it runs the whole workspace (strict-mode type errors cross file boundaries; per-file is unreliable). Strict config picked up from `[tool.pyright]` in `pyproject.toml`.
  - `default_stages: [pre-commit]`: hooks only fire on local commits. Auto-fix happens locally; CI never runs `pre-commit` (it runs the real `ruff`/`pyright`/`pytest` commands directly via `ci.yml`), so there is no path by which this config can mutate files in CI.
- `CONTRIBUTING.md`: new "Definition of Done" section pinned to the very top, above all existing content. Lists the five exact commands every contributor MUST run before pushing in a single code block, with a one-line "why" for each (especially calling out the `ruff format --check` and `pyright` traps that bit 5/10 hackathon agents).
- `.github/workflows/ci-feedback.yml` (new)
  - Triggers on `workflow_run: completed` of the existing `CI` workflow, gated to `conclusion == 'failure'` AND `event == 'pull_request'`.
  - Downloads logs via `gh api .../actions/runs/{id}/logs`, unzips, and runs an embedded Python extractor that produces ≤40-line excerpts per failing check (ruff format diff snippet, pyright error list, pytest failure summary).
  - Lists `issues/{pr}/comments`, finds any comment containing the stable marker `<!-- ci-feedback-bot -->`, and PATCHes it in place if found; otherwise POSTs a new one. The bot never spams: at most one comment per PR, updated on every subsequent failure.
  - Permissions: `pull-requests: write`, `actions: read`, `contents: read`. No write to contents, no checks API, no secrets beyond `GITHUB_TOKEN`.
  - The comment points to `CONTRIBUTING.md#definition-of-done` plus the `make ci-local` one-liner.
- `README.md`: small "Before you push" callout below the hello-world block linking to `CONTRIBUTING.md#definition-of-done`.

Design decisions
- … `make lint` / `make test` shortcut that could be re-introduced.
- … `actor == 'github-actions[bot]'`. The marker is stable across renames and survives someone hand-editing the body. It is also unambiguous in `body | contains(...)` jq filters.
- `Makefile` matches `ci.yml` line-for-line. When `ci.yml` changes, both files MUST be updated together; this is intentional friction so the local and remote sequences cannot drift silently.

How to test the feedback bot
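The bot's update-or-create rule can be reasoned about offline before opening a test PR. A minimal sketch, assuming comments arrive as the list of objects returned by the GitHub issue-comments endpoint (each with `id` and `body`); `plan_comment` and its return shape are hypothetical names, not the workflow's actual code:

```python
from typing import Any

MARKER = "<!-- ci-feedback-bot -->"  # stable marker named in the PR


def plan_comment(comments: list[dict[str, Any]], excerpt: str) -> dict[str, Any]:
    """Decide whether to PATCH the existing bot comment or POST a new one."""
    body = f"{MARKER}\n{excerpt}"
    for comment in comments:
        # Match on the marker in the body rather than on the comment's author.
        if MARKER in (comment.get("body") or ""):
            return {"method": "PATCH", "comment_id": comment["id"], "body": body}
    return {"method": "POST", "comment_id": None, "body": body}
```

Because the new body always carries the marker, a second failure finds the first comment and edits it, which is what keeps the PR at one bot comment.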
Once this PR is merged, open a follow-up test PR that deliberately violates one (or all) of the three categories — for example:
Expected behavior:

- … `CI` going red.
- … `CONTRIBUTING.md#definition-of-done`.

Verification
Ran `make ci-local` on this branch before pushing. All five steps exit 0.

Practicing what we preach.
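For readers who want to see the fail-fast sequence without reading the `Makefile`, the behaviour of `make ci-local` can be approximated in Python. This is a sketch under assumptions, not the project's implementation: `ci_local` and the injectable `run` parameter are hypothetical, and only the five commands come from the PR.

```python
import subprocess
from collections.abc import Callable, Sequence

# The five commands listed for `make ci-local`, in order.
CI_STEPS: list[list[str]] = [
    ["uv", "sync"],
    ["uv", "run", "ruff", "check", "."],
    ["uv", "run", "ruff", "format", "--check", "."],
    ["uv", "run", "pyright"],
    ["uv", "run", "pytest", "-v"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def ci_local(run: Callable[[Sequence[str]], int] = _run) -> int:
    """Run each step in order and stop at the first non-zero exit code."""
    for cmd in CI_STEPS:
        code = run(cmd)
        if code != 0:
            print(f"FAILED: {' '.join(cmd)} (exit {code})")
            return code
    print(f"{len(CI_STEPS)}/{len(CI_STEPS)} green")
    return 0
```

Calling `ci_local()` with no arguments shells out to the real tools; passing a stub for `run` makes the ordering and early exit easy to check without `uv` installed.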
Generated by Claude Code