Fix AI classification workflow using legacy prompt/simple inference with no grounded validation #71

Description

@primeinc

Goal

Fix the AI classification workflow so repo classification is grounded, validated, and testable instead of letting actions/ai-inference@v2 free-associate JSON under legacy prompt mode with no tools.

Parent: #42
Related: #46, #48, #50, #53, #54, #62, #69

Triggering Evidence

Direct evidence from the supplied workflow log at 2026-05-10T07:25:19Z:

Run actions/ai-inference@v2
model: openai/gpt-4o
max-tokens: 3000
system-prompt-file: .github-stars/data/system-prompt.txt
prompt-file: .github-stars/data/user-prompt.txt
endpoint: https://models.github.ai/inference
system-prompt: You are a helpful assistant
enable-github-mcp: false
BATCH_LIMIT: 15
Using legacy prompt format
Running simple inference without tools

The model then returned a raw JSON array of classifications, for example:

[
  {"repo":"streetwriters/notesnook","categories":["productivity","desktop-dev"],"tags":["note-taking","note-management","lang:ts"],"framework":null},
  {"repo":"vercel-labs/skills","categories":["ai-ml","dev-tools"],"tags":["ai-agent","skills-tool","lang:ts"],"framework":null},
  {"repo":"JamieMason/syncpack","categories":["productivity","dev-tools"],"tags":["dependency-management","monorepo","lang:rust"],"framework":null},
  {"repo":"cursor/agent-trace","categories":["ai-ml","documentation"],"tags":["ai-code-tracing","standard-format","lang:ts"],"framework":null}
]

Problem

The classification stage is currently accepting ungrounded model output from a legacy/simple inference path.

Observed failure shape:

legacy prompt format
  -> generic fallback system prompt visible in action inputs
  -> no GitHub MCP/tools
  -> no per-repo grounded evidence fetch
  -> raw model JSON returned
  -> taxonomy/schema gates may catch shape, but not attribution truth

This can produce plausible-looking classifications with no proof that the model read the repository, package metadata, README, topics, language stats, or any canonical source.

The visible example includes at least one suspicious classification candidate: JamieMason/syncpack is tagged lang:rust in the returned model output. That may be wrong or stale, but this issue does not need to prove that specific repo's language to prove the workflow defect. The defect is that the classifier can emit language/category/framework claims without direct evidence.

Required Architecture

Do not rely on raw LLM output as classification truth.

Classification must become:

candidate repos
  -> evidence collection
  -> typed classification prompt/input
  -> model candidate classification
  -> typed parse
  -> schema validation
  -> taxonomy validation
  -> evidence validation / confidence scoring
  -> bounded write to repos.yml
  -> workflow summary with proof
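
The ordering above can be sketched as a chain of gates in which a candidate reaches the write step only if every earlier gate passed. This is a minimal sketch, not the repository's implementation: `Gate`, `runGates`, `TAXONOMY`, and the two example gates are hypothetical names, and the taxonomy values are only the categories visible in the sample output.

```typescript
// Hypothetical types; the issue does not prescribe these names.
type GateResult = { ok: true } | { ok: false; reason: string };

interface Candidate {
  repo: string;
  categories: string[];
  tags: string[];
  framework: string | null;
}

interface Gate {
  name: string;
  check: (candidate: Candidate) => GateResult;
}

type PipelineOutcome =
  | { repo: string; accepted: true }
  | { repo: string; accepted: false; failedGate: string; reason: string };

// Run gates in order and stop at the first failure, so nothing reaches the
// write step unless every gate passed.
function runGates(candidate: Candidate, gates: Gate[]): PipelineOutcome {
  for (const gate of gates) {
    const result = gate.check(candidate);
    if (!result.ok) {
      return {
        repo: candidate.repo,
        accepted: false,
        failedGate: gate.name,
        reason: result.reason,
      };
    }
  }
  return { repo: candidate.repo, accepted: true };
}

// Illustrative taxonomy only (assumption); the canonical list lives in the repo.
const TAXONOMY = new Set(["ai-ml", "dev-tools", "productivity", "documentation", "desktop-dev"]);

const gates: Gate[] = [
  {
    name: "schema",
    check: (c) =>
      c.categories.length > 0 ? { ok: true } : { ok: false, reason: "no categories" },
  },
  {
    name: "taxonomy",
    check: (c) => {
      const unknown = c.categories.filter((cat) => !TAXONOMY.has(cat));
      return unknown.length === 0
        ? { ok: true }
        : { ok: false, reason: `unknown category: ${unknown.join(", ")}` };
    },
  },
];
```

Recording `failedGate` alongside the reason is one way to feed the per-stage validation counts that the workflow summary is expected to report.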

Required Changes

1. Replace legacy/simple inference mode

The workflow must stop using a path that reports:

Using legacy prompt format
Running simple inference without tools

If actions/ai-inference@v2 remains, configure it so the prompt contract is explicit and current. If the action cannot support the needed grounding/validation, move classification into TypeScript and call the model through a typed adapter.

2. Add evidence-backed classification input

For each repo in the batch, capture and pass grounded fields such as:

repo
html_url
description
primary language
repository topics
stargazer count
fork count
archived/fork/private flags
README excerpt if available
package manifests if available / feasible
existing categories/tags/framework
last updated / pushed timestamp
source fields used for classification

Do not classify from repo name alone unless the output marks low confidence / needs review.
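
One possible shape for that evidence record is sketched below. `RepoEvidence`, `buildEvidence`, and `maxReadmeChars` are hypothetical names. The `GitHubRepoResponse` fields are a subset of what the GitHub REST repository endpoint returns, and fetching is left out so the sketch stays a pure mapping.

```typescript
// Subset of the GitHub REST "get a repository" response used here.
interface GitHubRepoResponse {
  full_name: string;
  html_url: string;
  description: string | null;
  language: string | null;
  topics?: string[];
  stargazers_count: number;
  forks_count: number;
  archived: boolean;
  fork: boolean;
  private: boolean;
  pushed_at: string | null;
}

// Hypothetical evidence record passed to the model and kept for validation.
interface RepoEvidence {
  repo: string;
  htmlUrl: string;
  description: string | null;
  primaryLanguage: string | null;
  topics: string[];
  stars: number;
  forks: number;
  flags: { archived: boolean; fork: boolean; private: boolean };
  pushedAt: string | null;
  readmeExcerpt: string | null;
  sourceFields: string[]; // which fields actually carried evidence
  nameOnly: boolean; // true when nothing but the repo name is available
}

function buildEvidence(
  api: GitHubRepoResponse,
  readme: string | null,
  maxReadmeChars = 2000, // assumed bound to keep the prompt size predictable
): RepoEvidence {
  const topics = api.topics ?? [];
  const readmeExcerpt = readme ? readme.slice(0, maxReadmeChars) : null;

  // Track which inputs were non-empty so later stages can cite them.
  const sourceFields: string[] = [];
  if (api.description) sourceFields.push("description");
  if (api.language) sourceFields.push("language");
  if (topics.length > 0) sourceFields.push("topics");
  if (readmeExcerpt) sourceFields.push("readme");

  return {
    repo: api.full_name,
    htmlUrl: api.html_url,
    description: api.description,
    primaryLanguage: api.language,
    topics,
    stars: api.stargazers_count,
    forks: api.forks_count,
    flags: { archived: api.archived, fork: api.fork, private: api.private },
    pushedAt: api.pushed_at,
    readmeExcerpt,
    sourceFields,
    nameOnly: sourceFields.length === 0,
  };
}
```

A downstream gate could force any result with `nameOnly: true` to low confidence / needs_review, matching the rule above.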

3. Add typed parser and validator

Model output must pass a TypeScript parser before it can mutate repos.yml.

Required checks:

valid JSON
array length matches the batch, or unmatched repos are listed explicitly
repo names match requested batch
categories are in canonical taxonomy
framework is null or allowed
language tags match collected language evidence or are flagged needs_review
tags are normalized and bounded
unknown/unsupported claims are rejected or marked needs_review
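
A minimal sketch of such a parser and validator follows, assuming the model returns the array shape shown in the triggering evidence. `ValidationContext`, `ValidationReport`, and `validateModelOutput` are hypothetical names, and the per-repo `lang:` tag is assumed to have been derived from collected metadata beforehand.

```typescript
interface Classification {
  repo: string;
  categories: string[];
  tags: string[];
  framework: string | null;
}

// Hypothetical inputs to validation; all names are assumptions.
interface ValidationContext {
  batch: string[];
  taxonomy: Set<string>;
  frameworks: Set<string>;
  languageTagByRepo: Map<string, string | null>; // expected `lang:` tag per repo
  maxTags: number;
}

interface ValidationReport {
  accepted: Classification[];
  needsReview: { item: Classification; reason: string }[];
  rejected: { repo: string; reason: string }[];
  missing: string[];
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

function validateModelOutput(raw: string, ctx: ValidationContext): ValidationReport {
  const report: ValidationReport = { accepted: [], needsReview: [], rejected: [], missing: [] };
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    parsed = undefined;
  }
  if (!Array.isArray(parsed)) {
    // Invalid JSON or a non-array payload rejects the whole batch.
    report.rejected.push({ repo: "*", reason: "invalid JSON" });
    report.missing = ctx.batch.slice();
    return report;
  }
  const requested = new Set(ctx.batch);
  const seen = new Set<string>();
  for (const entry of parsed as unknown[]) {
    if (typeof entry !== "object" || entry === null) {
      report.rejected.push({ repo: "<unknown>", reason: "entry is not an object" });
      continue;
    }
    const { repo, categories, tags, framework } = entry as Record<string, unknown>;
    if (
      typeof repo !== "string" || !isStringArray(categories) || !isStringArray(tags) ||
      !(framework === null || typeof framework === "string")
    ) {
      report.rejected.push({ repo: typeof repo === "string" ? repo : "<unknown>", reason: "wrong shape" });
      continue;
    }
    if (!requested.has(repo) || seen.has(repo)) {
      report.rejected.push({ repo, reason: "repo not requested or duplicated" });
      continue;
    }
    seen.add(repo);
    const badCategory = categories.find((c) => !ctx.taxonomy.has(c));
    if (badCategory !== undefined) {
      report.rejected.push({ repo, reason: `unknown category: ${badCategory}` });
      continue;
    }
    if (framework !== null && !ctx.frameworks.has(framework)) {
      report.rejected.push({ repo, reason: `unknown framework: ${framework}` });
      continue;
    }
    // Normalize (trim, lowercase, dedupe) and bound the tag list.
    const normalized = Array.from(new Set(tags.map((t) => t.trim().toLowerCase()))).slice(0, ctx.maxTags);
    const item: Classification = { repo, categories, tags: normalized, framework };
    const expectedLang = ctx.languageTagByRepo.get(repo) ?? null;
    const claimedLang = normalized.find((t) => t.startsWith("lang:")) ?? null;
    if (claimedLang !== null && claimedLang !== expectedLang) {
      report.needsReview.push({ item, reason: `${claimedLang} not supported by evidence (${expectedLang ?? "none"})` });
      continue;
    }
    report.accepted.push(item);
  }
  report.missing = ctx.batch.filter((r) => !seen.has(r));
  return report;
}
```

Only `report.accepted` would be eligible for the write to repos.yml. Whether a contradicted language tag should be rejected outright instead of marked needs_review is a policy choice the sketch leaves open.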

4. Add confidence / evidence status

Each classification result should carry internal evidence status before write:

Direct evidence: supported by collected repo metadata or source field
Weak inference: plausible from description/topics/name but not proven
Unsupported: not grounded in available input; do not write silently

repos.yml may not need to store all of this permanently, but the workflow summary/artifact must expose enough proof for review.
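
The three statuses could be assigned per tag roughly as follows. This is a sketch under stated assumptions: `scoreTag`, `TagEvidence`, and `LANG_ALIASES` are hypothetical, the alias table is illustrative and incomplete, and the word-matching rule for weak inference is one heuristic among many.

```typescript
type EvidenceStatus = "direct" | "weak" | "unsupported";

// Hypothetical slice of the collected evidence needed to score a tag.
interface TagEvidence {
  repo: string;
  description: string | null;
  primaryLanguage: string | null;
  topics: string[];
}

// Assumed mapping from GitHub language names to `lang:` tag suffixes.
const LANG_ALIASES = new Map<string, string>([
  ["typescript", "ts"],
  ["javascript", "js"],
  ["python", "py"],
]);

function scoreTag(tag: string, ev: TagEvidence): EvidenceStatus {
  const t = tag.toLowerCase();

  // Language tags need to match the collected primary language exactly.
  if (t.startsWith("lang:")) {
    if (ev.primaryLanguage === null) return "unsupported";
    const lang = ev.primaryLanguage.toLowerCase();
    const expected = LANG_ALIASES.get(lang) ?? lang;
    return t === `lang:${expected}` ? "direct" : "unsupported";
  }

  // A tag that is literally a repository topic counts as direct evidence.
  if (ev.topics.some((topic) => topic.toLowerCase() === t)) return "direct";

  // Otherwise, every word of the tag appearing in the name or description
  // is treated as a weak inference only.
  const haystack = `${ev.repo} ${ev.description ?? ""}`.toLowerCase();
  const words = t.split("-").filter((w) => w.length > 0);
  return words.length > 0 && words.every((w) => haystack.includes(w)) ? "weak" : "unsupported";
}

// Only direct evidence is written without surfacing the claim for review.
function mayWriteSilently(status: EvidenceStatus): boolean {
  return status === "direct";
}
```

Under this sketch, a `lang:rust` tag on a repo whose collected primary language is TypeScript scores as unsupported, so it would be surfaced in the summary instead of written silently.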

5. Add workflow summary diagnostics

The classification workflow summary must report:

model
inference mode
tools enabled/disabled
batch size
repos classified
repos rejected
repos marked needs_review
schema validation status
taxonomy validation status
evidence validation status
sample rejected reason
artifact paths
commit SHA / no-change status
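
As an illustration, the fields above could be collected into one object and rendered as a Markdown table. `ClassificationSummary` and `renderSummary` are hypothetical names, and the pass/fail encoding is an assumption.

```typescript
// Hypothetical summary record; field names mirror the list above.
interface ClassificationSummary {
  model: string;
  inferenceMode: string;
  toolsEnabled: boolean;
  batchSize: number;
  classified: number;
  rejected: number;
  needsReview: number;
  schemaValidation: "pass" | "fail";
  taxonomyValidation: "pass" | "fail";
  evidenceValidation: "pass" | "fail";
  sampleRejectedReason: string | null;
  artifactPaths: string[];
  commitSha: string | null; // null means no change was committed
}

function renderSummary(s: ClassificationSummary): string {
  const rows: [string, string][] = [
    ["Model", s.model],
    ["Inference mode", s.inferenceMode],
    ["Tools", s.toolsEnabled ? "enabled" : "disabled"],
    ["Batch size", String(s.batchSize)],
    ["Repos classified", String(s.classified)],
    ["Repos rejected", String(s.rejected)],
    ["Repos marked needs_review", String(s.needsReview)],
    ["Schema validation", s.schemaValidation],
    ["Taxonomy validation", s.taxonomyValidation],
    ["Evidence validation", s.evidenceValidation],
    ["Sample rejected reason", s.sampleRejectedReason ?? "none"],
    ["Artifacts", s.artifactPaths.length > 0 ? s.artifactPaths.join(", ") : "none"],
    ["Commit", s.commitSha ?? "no change"],
  ];
  const lines = ["## Classification summary", "", "| Field | Value |", "| --- | --- |"];
  for (const [field, value] of rows) {
    lines.push(`| ${field} | ${value} |`);
  }
  return lines.join("\n");
}
```

In a GitHub Actions job, the returned string can be appended to the file named by the `GITHUB_STEP_SUMMARY` environment variable so that it appears on the run page.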

6. Add regression fixtures

Add fixtures/tests for model-output failure modes:

wrong repo returned
extra repo returned
requested repo missing from output
invalid JSON
unknown category
unknown framework
language tag contradicted by collected metadata
unsupported tag with no evidence
legacy prompt output accepted without evidence

This must land under the TypeScript control-plane direction from #69, not as one more YAML blob with hopes and dreams zip-tied to it.
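
A table-driven layout is one way to keep these fixtures cheap to extend. The sketch below covers only the batch-matching and invalid-JSON cases and uses no test framework; `checkBatchMatch`, `Fixture`, and `runFixtures` are hypothetical names, and the fixture data is invented for illustration.

```typescript
type Outcome = "ok" | "invalid_json" | "wrong_repo" | "extra_repo" | "missing_repo";

// Minimal batch-matching check used only to drive the fixtures below.
function checkBatchMatch(raw: string, batch: string[]): Outcome {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return "invalid_json";
  }
  if (!Array.isArray(parsed)) return "invalid_json";

  const returned: string[] = parsed.map((e) =>
    e !== null && typeof e === "object" && typeof e.repo === "string" ? e.repo : "",
  );
  const requested = new Set(batch);
  const unexpected = returned.filter((r) => !requested.has(r));
  const missing = batch.filter((r) => returned.indexOf(r) === -1);

  // A substitution shows up as both an unexpected and a missing repo.
  if (unexpected.length > 0 && missing.length > 0) return "wrong_repo";
  if (unexpected.length > 0) return "extra_repo";
  if (missing.length > 0) return "missing_repo";
  return "ok";
}

interface Fixture {
  name: string;
  raw: string;
  batch: string[];
  expected: Outcome;
}

// Invented fixture data; real fixtures would live in files under test.
const fixtures: Fixture[] = [
  { name: "happy path", raw: '[{"repo":"a/one"}]', batch: ["a/one"], expected: "ok" },
  { name: "wrong repo returned", raw: '[{"repo":"a/other"}]', batch: ["a/one"], expected: "wrong_repo" },
  {
    name: "extra repo returned",
    raw: '[{"repo":"a/one"},{"repo":"a/two"}]',
    batch: ["a/one"],
    expected: "extra_repo",
  },
  {
    name: "requested repo missing",
    raw: '[{"repo":"a/one"}]',
    batch: ["a/one", "a/two"],
    expected: "missing_repo",
  },
  { name: "invalid JSON", raw: '[{"repo":', batch: ["a/one"], expected: "invalid_json" },
];

// Returns the names of fixtures whose outcome differs from the expectation.
function runFixtures(list: Fixture[]): string[] {
  return list
    .filter((f) => checkBatchMatch(f.raw, f.batch) !== f.expected)
    .map((f) => f.name);
}
```

The remaining failure modes (unknown category, unknown framework, contradicted language tag, unsupported tag) would follow the same pattern once the fixture carries taxonomy and evidence inputs.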

Acceptance Criteria

  • Classification no longer runs through ungrounded legacy prompt format / simple inference without tools as the accepted production path.
  • Every model classification is parsed by TypeScript before mutation.
  • Classification output is matched back to the requested repo batch.
  • Schema validation and taxonomy validation remain hard gates.
  • Evidence validation exists for language/category/framework/tag claims.
  • Unsupported or contradictory classification claims are rejected or marked needs_review.
  • Workflow summary reports model, inference mode, tools/grounding status, counts, rejected items, and validation results.
  • Tests cover malformed, mismatched, unsupported, and contradicted model outputs.
  • AGENTS.md documents that raw model JSON is candidate classification, not truth.
  • Final acceptance (#54, "02L: Run Final Bulletproof Acceptance Validation") can cite a successful classification run with evidence-rich summary output.

Proof Required

Completion comment must include:

  • Workflow diff and TypeScript parser/validator diff.
  • Test output for classification parser/validator fixtures.
  • Successful workflow run URL.
  • Workflow summary excerpt showing inference mode, validation counts, and rejected/needs-review counts.
  • Example of at least one rejected or needs-review classification from a fixture or test.
  • Confirmation that repos.yml is not mutated by unvalidated raw model output.

Evidence Labels for Implementer

Use these labels in the completion report:

  • Direct evidence: workflow log, source diff, parser/test code, test output, workflow summary, artifact, commit SHA.
  • Weak inference: a classification is plausible based on description/topics but not directly proven by source metadata.
  • Unsupported: model output claims a language/category/framework/tag with no supporting input evidence.
  • Contradicted: model output conflicts with collected metadata or canonical taxonomy.
  • Blocked: the model/action cannot report the current inference mode, repo metadata is unavailable, or a grounding source cannot be fetched.

Non-Goals

Definition of Done

The classifier treats model output as untrusted candidate data, validates it through typed code, binds classifications to collected evidence, and refuses to mutate repos.yml from legacy/simple ungrounded inference output.
