Open
56 commits
419447f
plan: gbrain integration for apra-fleet
yashrajsapra May 12, 2026
6d8e9e7
review: gbrain integration plan — CHANGES NEEDED
yashrajsapra May 12, 2026
a5d21d5
fix(plan): correct gbrain tool names, use underscores, expand Minions…
yashrajsapra May 12, 2026
eab88d0
fix(plan): address reviewer feedback — tool names, template condition…
yashrajsapra May 12, 2026
75e4f57
fix(plan): annotate feedback-gbrain.md with commit SHAs
yashrajsapra May 12, 2026
292c9c4
review: gbrain integration plan re-review — CHANGES NEEDED (1 remaining)
yashrajsapra May 12, 2026
23c17b7
review: gbrain integration plan re-review — CHANGES NEEDED (1 remaining)
yashrajsapra May 12, 2026
7ea0491
fix(plan): promote Task 1.4 to premium tier — fix Phase 1 tier monoto…
yashrajsapra May 12, 2026
2e246e2
fix(plan): annotate Finding 5 with commit SHA f29375c
yashrajsapra May 12, 2026
6c325c6
fix(plan): promote Task 1.4 to premium tier for monotonicity
yashrajsapra May 12, 2026
9ca9a98
feat(types): add gbrain field to Agent interface (T1.1)
yashrajsapra May 12, 2026
c03e501
feat(tools): add gbrain to register/update/list/detail tools (T1.2)
yashrajsapra May 12, 2026
55387fa
fix(plan): address findings 2-5 from reviewer feedback
yashrajsapra May 12, 2026
342ba68
feat(gbrain): add gbrain MCP client service (T1.3)
yashrajsapra May 12, 2026
ce8fb08
test(gbrain): add Phase 1 tests for gbrain client and config (T1.4)
yashrajsapra May 12, 2026
4870ccc
review(gbrain): Phase 1 code review — CHANGES NEEDED
yashrajsapra May 12, 2026
bc85296
fix(tests): add listMembers and memberDetail gbrain display tests (T1.4)
yashrajsapra May 13, 2026
e663a17
feat(gbrain): add shared gbrain helpers assertGbrainEnabled and callG…
yashrajsapra May 13, 2026
f7b7d82
feat(gbrain): add brain_query and brain_write fleet tools (T2.1, T2.2)
yashrajsapra May 13, 2026
2977df5
feat(gbrain): add brain-tools tests and update progress (T2.3)
yashrajsapra May 13, 2026
447097c
review(gbrain): Phase 2 code review — APPROVED
yashrajsapra May 13, 2026
13c49b3
feat(gbrain): add code analysis tools and tests (T3.1, T3.2)
yashrajsapra May 13, 2026
48667e9
review(gbrain): Phase 3 code review — APPROVED
yashrajsapra May 13, 2026
232b3be
feat(gbrain): add Minions job queue tools and tests (T4.1, T4.2)
yashrajsapra May 13, 2026
43a92e5
review(gbrain): Phase 4 code review — APPROVED
yashrajsapra May 13, 2026
bf3bcff
docs(gbrain): add brain-aware review section to reviewer template (T5.1)
yashrajsapra May 13, 2026
f9f3e0a
feat(gbrain): add course correction capture service (T5.2)
yashrajsapra May 13, 2026
e441ae9
feat(gbrain): add course_correction_capture and course_correction_rec…
yashrajsapra May 13, 2026
b271862
docs(gbrain): document course_correction_capture call-sites in PM ski…
yashrajsapra May 13, 2026
f837599
test(gbrain): add course correction tests (T5.5)
yashrajsapra May 13, 2026
b7def46
review(gbrain): Phase 5 code review — APPROVED
yashrajsapra May 13, 2026
61b9cd8
chore(gbrain): DRY audit — ensure all tools use shared helpers (T6.1)
yashrajsapra May 13, 2026
cb3ebd7
feat(gbrain): wire gbrain lifecycle into server startup/shutdown (T6.2)
yashrajsapra May 13, 2026
c8fd4b8
docs(gbrain): add gbrain integration section to README (T6.3)
yashrajsapra May 13, 2026
dc66406
test(gbrain): add final integration tests (T6.4)
yashrajsapra May 13, 2026
40da0ad
test(gbrain): add gbrain vs no-gbrain comparative test (T6.5)
yashrajsapra May 13, 2026
2e6d266
chore: mark Phase 6 verified (T6.1-T6.5 complete)
yashrajsapra May 13, 2026
b333e3c
review(gbrain): Phase 6 final review — APPROVED
yashrajsapra May 13, 2026
9546d50
review(gbrain): independent Phase 6 verification — APPROVED
yashrajsapra May 13, 2026
6855b7b
cleanup:
yashrajsapra May 13, 2026
c859b6e
cleanup:
yashrajsapra May 13, 2026
456c44a
Merge 9546d508c5ee40020e4a7e833b988f80b4282062 into 1970ced49267fa5f3…
yashrajsapra May 13, 2026
d4eadb6
chore: regenerate llms-full.txt
github-actions[bot] May 13, 2026
d94ca8f
cleanup:
yashrajsapra May 13, 2026
f4f2631
review(gbrain): Phase 1 code re-review — APPROVED
yashrajsapra May 13, 2026
c806c88
Merge branch 'feat/gbrain-integration' of https://github.com/Apra-Lab…
yashrajsapra May 13, 2026
72145ed
feat(install): add --with-gbrain flag to install gbrain alongside fleet
yashrajsapra May 13, 2026
687d986
ci(gbrain): add gbrain BM25 recall eval workflow
yashrajsapra May 14, 2026
5e10a20
fix(gbrain): correct MCP server start command from 'mcp' to 'serve'
yashrajsapra May 14, 2026
09e693a
fix(gbrain): remap all fleet tools to correct gbrain MCP tool names
yashrajsapra May 14, 2026
9128602
fix(ci): update gbrain eval to use correct tool names put_page + search
yashrajsapra May 14, 2026
aea7a34
ci(gbrain): improve eval — debug seed output, dual search/query fallback
yashrajsapra May 14, 2026
9f635a4
ci(gbrain): pivot eval to put_page→get_page persistence roundtrip
yashrajsapra May 14, 2026
d1c9383
chore(ci): add fleet-e2e-compat.yml — copy of e2e with v? version reg…
yashrajsapra May 14, 2026
21d6e7c
fix(ci): make version regex accept both v0.1.x and 0.1.x formats
yashrajsapra May 14, 2026
5c678e1
fix(gbrain): lazy-load MCP client SDK to prevent startup crash on Lin…
yashrajsapra May 14, 2026
178 changes: 178 additions & 0 deletions .github/eval/gbrain-eval.mjs
@@ -0,0 +1,178 @@
/**
 * gbrain Knowledge Persistence Eval
 *
 * Writes 5 apra-fleet facts to gbrain (PGLite, zero external deps),
 * reads them back by slug, and checks each page for an expected keyword.
 *
 * This proves:
 *   1. `apra-fleet install --with-gbrain` produces a working gbrain install
 *   2. gbrain persists knowledge durably in PGLite (no API key, no server)
 *   3. Knowledge is faithfully retrievable (5/5 roundtrip)
 *
 * Exit 0 = PASS (5/5 roundtrip), Exit 1 = FAIL.
 * Writes a Markdown scorecard to $GITHUB_STEP_SUMMARY when running in CI.
 */

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
import fs from 'fs';

// ---------------------------------------------------------------------------
// Test dataset — 5 apra-fleet facts
// ---------------------------------------------------------------------------
const FACTS = [
  {
    id: 'port',
    content: 'The apra-fleet MCP server listens on port 3000 by default.',
    keywords: ['port 3000', '3000'],
  },
  {
    id: 'ssh-remote',
    content: 'Fleet members can be local agents or SSH remote machines registered with a hostname and username.',
    keywords: ['SSH remote', 'hostname'],
  },
  {
    id: 'execute-prompt',
    content: 'The execute_prompt tool dispatches a task to a Claude Code agent and waits for its response.',
    keywords: ['execute_prompt', 'Claude Code'],
  },
  {
    id: 'pglite',
    content: 'gbrain uses PGLite for local storage — no external database server is required when running in local mode.',
    keywords: ['PGLite', 'no external database'],
  },
  {
    id: 'reviewer',
    content: 'The fleet reviewer template checks code for security vulnerabilities and test coverage before approving.',
    keywords: ['security vulnerabilities', 'test coverage'],
  },
];

// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
function extractText(result) {
  if (!result || !result.content) return '';
  return result.content
    .filter(c => c.type === 'text')
    .map(c => c.text)
    .join('\n');
}

function extractJson(text) {
  try { return JSON.parse(text); } catch { return null; }
}

function verifyContent(responseText, fact) {
  const parsed = extractJson(responseText);
  // get_page returns JSON with compiled_truth or slug fields
  const candidate = parsed
    ? JSON.stringify(parsed).toLowerCase()
    : responseText.toLowerCase();
  return fact.keywords.some(kw => candidate.includes(kw.toLowerCase()));
}

// ---------------------------------------------------------------------------
// Main
// ---------------------------------------------------------------------------
async function main() {
  const gbrain = process.env.GBRAIN_CMD || 'gbrain';

  const transport = new StdioClientTransport({
    command: gbrain,
    args: ['serve'],
    env: {
      ...process.env,
      PATH: `${process.env.HOME}/.bun/bin:${process.env.PATH || ''}`,
    },
  });

  const client = new Client({ name: 'gbrain-eval', version: '1.0.0' }, { capabilities: {} });

  console.log('Connecting to gbrain MCP server...');
  await client.connect(transport);

  // Print server identity
  try {
    const identity = await client.callTool({ name: 'get_brain_identity', arguments: {} });
    console.log(`Connected: ${extractText(identity).slice(0, 120)}\n`);
  } catch {
    console.log('Connected.\n');
  }

  // -- Seed ------------------------------------------------------------------
  console.log('=== Writing facts (put_page) ===');
  const writeResults = [];
  for (const fact of FACTS) {
    const result = await client.callTool({
      name: 'put_page',
      arguments: {
        slug: `eval/${fact.id}`,
        content: `---\ntags: [eval, apra-fleet]\n---\n${fact.content}`,
      },
    });
    const text = extractText(result);
    const parsed = extractJson(text);
    const status = parsed?.status ?? text.slice(0, 40);
    const ok = text.includes('created') || text.includes('updated');
    writeResults.push({ id: fact.id, ok, status });
    console.log(`  [${ok ? 'OK  ' : 'FAIL'}] ${fact.id}: ${status}`);
  }

  // -- Read back -------------------------------------------------------------
  console.log('\n=== Reading facts back (get_page) ===');
  const rows = [];

  for (const fact of FACTS) {
    const result = await client.callTool({
      name: 'get_page',
      arguments: { slug: `eval/${fact.id}` },
    });
    const text = extractText(result);
    const match = verifyContent(text, fact);
    rows.push({ id: fact.id, match, snippet: text.slice(0, 120).replace(/\n/g, ' ') });
    console.log(`  [${match ? 'MATCH' : 'MISS '}] ${fact.id}`);
    if (!match) console.log(`         response: ${text.slice(0, 120)}`);
  }

  await client.close();

  // -- Score -----------------------------------------------------------------
  const hits = rows.filter(r => r.match).length;
  const total = rows.length;
  const pct = Math.round((hits / total) * 100);
  const pass = hits === total; // 5/5 required for persistence eval

  // -- Report ----------------------------------------------------------------
  const lines = [
    '## gbrain Knowledge Persistence Eval',
    '',
    `**Score: ${hits}/${total} (${pct}%) — ${pass ? '✅ PASS' : '❌ FAIL'}**`,
    '',
    '| Fact | Content slug | Stored + Retrieved |',
    '|------|-------------|-------------------|',
    ...rows.map(r => `| \`${r.id}\` | \`eval/${r.id}\` | ${r.match ? '✅ OK' : '❌ FAIL'} |`),
    '',
    '### What this demonstrates',
    '- `apra-fleet install --with-gbrain` produces a working gbrain install',
    '- gbrain persists knowledge in **PGLite** — zero external deps, no API key',
    '- Knowledge is faithfully retrieved by slug (deterministic roundtrip)',
    `- Fleet agents with \`gbrain: true\` get persistent memory across sessions`,
  ];

  const report = lines.join('\n');
  console.log('\n' + report);

  const summaryFile = process.env.GITHUB_STEP_SUMMARY;
  if (summaryFile) {
    fs.appendFileSync(summaryFile, report + '\n');
    console.log('\nScorecard written to step summary.');
  }

  process.exit(pass ? 0 : 1);
}

main().catch(err => {
  console.error('Eval error:', err.message || err);
  process.exit(1);
});
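
Review note on the read-back check: `verifyContent` passes a fact when any one of its keywords appears anywhere in the `get_page` response, including fields such as the slug. The sketch below is a minimal illustration of that behaviour, not the PR's code. It assumes a response that echoes only the slug with an empty `compiled_truth` (the real `get_page` payload may differ) and compares the any-keyword check with a hypothetical stricter helper, `verifyAllKeywords`, that is not part of this diff. The sample fact text is shortened for illustration.

```javascript
// Same text extraction as in the eval: join the text parts of an MCP tool result.
function extractText(result) {
  if (!result || !result.content) return '';
  return result.content
    .filter(c => c.type === 'text')
    .map(c => c.text)
    .join('\n');
}

// Mirrors the eval's rule in simplified form: pass when ANY keyword is present.
function verifyAnyKeyword(responseText, fact) {
  const candidate = responseText.toLowerCase();
  return fact.keywords.some(kw => candidate.includes(kw.toLowerCase()));
}

// Hypothetical stricter variant (not in the PR): pass only when ALL keywords are present.
function verifyAllKeywords(responseText, fact) {
  const candidate = responseText.toLowerCase();
  return fact.keywords.every(kw => candidate.includes(kw.toLowerCase()));
}

// Sample fact, shortened from the eval's 'pglite' entry.
const fact = {
  id: 'pglite',
  content: 'gbrain uses PGLite for local storage; no external database server is required.',
  keywords: ['PGLite', 'no external database'],
};

// Assumed response shapes, for illustration only.
const slugOnly = {
  content: [{ type: 'text', text: JSON.stringify({ slug: 'eval/pglite', compiled_truth: '' }) }],
};
const full = {
  content: [{ type: 'text', text: JSON.stringify({ slug: 'eval/pglite', compiled_truth: fact.content }) }],
};

// The slug alone contains "pglite", so the any-keyword check passes with no body stored.
console.log('any, slug only:', verifyAnyKeyword(extractText(slugOnly), fact));
console.log('all, slug only:', verifyAllKeywords(extractText(slugOnly), fact));
console.log('all, full page:', verifyAllKeywords(extractText(full), fact));
```

If the response does echo the slug, a stricter rule such as requiring every keyword, or comparing against the stored sentence itself, would make the 5/5 score a stronger signal that the body survived the roundtrip.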