Skip to content

feat: source management REST API — register, sync, delete + tests (tasks 6.1-6.4)#7

Merged
johnmcollier merged 5 commits into
redhat-ai-dev:mainfrom
johnmcollier:feat/epic6-source-management
Jun 19, 2026
Merged

feat: source management REST API — register, sync, delete + tests (tasks 6.1-6.4)#7
johnmcollier merged 5 commits into
redhat-ai-dev:mainfrom
johnmcollier:feat/epic6-source-management

Conversation

@johnmcollier

Copy link
Copy Markdown
Contributor

Summary

Implements the Source Management API in src/server/routes/sources.ts.

  • POST /api/v1/sources — validates slug (kebab-case, 1–64 chars); 409 on duplicate; clones before writing to DB so clone failure never leaves an orphaned source record; 422 CLONE_FAILED on bad URL; 201 with source + sync report on success
  • DELETE /api/v1/sources/:id — removes source + all associated skills (FK CASCADE); 404 on unknown
  • POST /api/v1/sources/:id/sync — 409 SYNC_IN_PROGRESS on concurrent; returns sync report; 404 on unknown

No auth on write routes for now — Epic 3 will gate them.

Design note — clone-before-create

The registration flow clones first, only persisting the source record after a successful clone. This strictly satisfies the spec's "no source record is created" guarantee on clone failure.

Test plan

  • npm run build:server — zero TypeScript errors
  • npm test — 74/74 passing; 11 new sources route tests

Related

  • OpenSpec tasks: 6.1, 6.2, 6.3, 6.4
  • Jira: RHIDP-14978

Made with Cursor

johnmcollier and others added 3 commits June 19, 2026 16:22
Add POST /api/v1/sources, DELETE /api/v1/sources/:id, and POST
/api/v1/sources/:id/sync endpoints with full validation, error handling,
atomic cleanup on clone failure, concurrent sync guard, and 11 API tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ks 5.1-5.3, 5.5-5.6)

Add GET /api/v1/skills (paginated listing), GET /api/v1/skills/search
(Fuse.js fuzzy search), and GET /api/v1/skills/:source/:slug (full skill
detail) endpoints. Implement FuseSearchProvider backed by Fuse.js with
index rebuild wired into the server. Add global Fastify error handler
returning structured {error: {code, message}} for all 4xx/5xx. Guard
top-level buildServer() call so index.ts is safe to import in tests.
Add 17 API tests covering pagination bounds, 404 on unknown skill, and
fuzzy search. Mark tasks 5.1-5.3, 5.5-5.7 complete.

Co-authored-by: Cursor <cursoragent@cursor.com>
…sponse

- POST /api/v1/sources now clones before writing to DB so a clone failure
  never creates an orphaned source record (spec: 'no source record is created')
- Response field renamed createdAt → created_at to match spec contract

Co-authored-by: Cursor <cursoragent@cursor.com>
@qodo-code-review

Copy link
Copy Markdown

PR Summary by Qodo

Implement sources CRUD/sync API, skills catalog endpoints, and Fuse.js search
✨ Enhancement 🧪 Tests 📝 Documentation 🕐 40+ Minutes

Grey Divider

Description

• Add REST endpoints for sources register/delete/sync with validation and concurrency guards.
• Add skills catalog read endpoints and Fuse.js-backed search with index rebuild on sync.
• Introduce global Fastify error envelope and expand route-level API test coverage.
Diagram

graph TD
  C((Client)) --> A["Fastify server"] --> SR["routes/sources.ts"] --> ING["ingestion/ingest.ts"] --> G{{"Git repo"}}
  ING --> R[("SQLite repos")]
  A --> KR["routes/skills.ts"] --> R --> SP["FuseSearchProvider"]
  SR --> SP
  subgraph Legend
    direction LR
    _cli((Client)) ~~~ _svc["Service/module"] ~~~ _db[(Database)] ~~~ _ext{{External}}
  end
Loading
High-Level Assessment

The following are alternative approaches to this PR:

1. Incremental search index updates per source
  • ➕ Avoids O(N) full index rebuild on every sync/register/delete
  • ➕ Enables better performance as catalog grows
  • ➖ More complexity: need per-source diffing and delete/update semantics
  • ➖ Harder to reason about correctness vs rebuild-from-DB
2. Async ingestion (job queue) with eventual consistency
  • ➕ Registration/sync endpoints return quickly; work happens in background
  • ➕ Better resilience/retry for clone/ingestion failures
  • ➖ Spec may require synchronous syncReport response semantics
  • ➖ Requires job runner, persistence for job state, and additional operational complexity
3. DB transaction with 'pending' source state (instead of clone-before-create)
  • ➕ Keeps a durable audit trail of attempted registrations
  • ➕ Allows tracking failures without losing request context
  • ➖ Violates/complicates the spec guarantee of 'no source record is created' on clone failure unless explicitly excluded
  • ➖ Requires schema changes and lifecycle handling

Recommendation: The current clone-before-create flow is the best fit for the stated spec guarantee and keeps failure handling straightforward (no orphan sources on clone failure). The main future improvement to consider is moving from full search index rebuilds to incremental updates once the skills catalog is large; for now, rebuilding from DB is simpler and safer.

Files changed (9) +797 / -16

Enhancement (6) +377 / -6
SqliteSkillRepository.tsAdd updated-at sorting for skill listings +2/-1

Add updated-at sorting for skill listings

• Introduces an "updatedAt" sort option and maps it to SQL ordering by updated_at DESC. Extends the repository list API to accept the new sort variant.

src/server/db/SqliteSkillRepository.ts

types.tsExtend SkillRepository findAll sort options +1/-1

Extend SkillRepository findAll sort options

• Updates the SkillRepository interface to allow sorting by updatedAt in addition to name and createdAt.

src/server/db/types.ts

index.tsRegister skills/sources plugins, Fuse.js index, and global error handler +50/-4

Register skills/sources plugins, Fuse.js index, and global error handler

• Adds a global Fastify error handler that returns a consistent {error:{code,message}} envelope for 4xx/5xx responses. Builds a Fuse.js search index from DB skills at startup, registers the new skills and sources route plugins, and returns {app, searchProvider}; also guards top-level listen() so index.ts can be imported in tests.

src/server/index.ts

skills.tsImplement skills catalog read and search endpoints +135/-0

Implement skills catalog read and search endpoints

• Adds GET / (paginated listing with validation), GET /search (Fuse.js-backed fuzzy search), and GET /:source/:slug (skill detail including files array). Ensures /search is registered before the dynamic route to avoid path conflicts.

src/server/routes/skills.ts

sources.tsImplement source register/delete/sync endpoints with clone-first semantics +162/-0

Implement source register/delete/sync endpoints with clone-first semantics

• Adds POST / (validates slug + URL, rejects duplicates, clones to temp dir before DB create, ingests and returns sync report), DELETE /:id (validates id, deletes source; relies on FK CASCADE for skills), and POST /:id/sync (sync-in-progress guard, status updates, ingestion, and optional search index rebuild). Includes index rebuild helper and consistent error codes (e.g., INVALID_SLUG, SLUG_CONFLICT, SYNC_IN_PROGRESS, SOURCE_NOT_FOUND).

src/server/routes/sources.ts

FuseSearchProvider.tsAdd Fuse.js-backed SearchProvider implementation +27/-0

Add Fuse.js-backed SearchProvider implementation

• Implements SearchProvider using Fuse.js with configured keys, scoring, and threshold. Supports full index rebuild and query search with optional result limiting.

src/server/search/FuseSearchProvider.ts

Tests (2) +410 / -0
skills.test.tsAdd API tests for skills listing, detail, and search +212/-0

Add API tests for skills listing, detail, and search

• Creates an in-memory SQLite-backed Fastify test server to validate pagination bounds, sort validation, 404 behavior for unknown skills, and Fuse.js fuzzy search behavior. Also asserts that /search is not shadowed by the dynamic /:source/:slug route.

test/server/routes/skills.test.ts

sources.test.tsAdd API tests for source register/delete/sync behaviors +198/-0

Add API tests for source register/delete/sync behaviors

• Adds tests covering invalid slug, duplicate slug conflict, clone failure without orphan DB records, delete unknown/existing source (including FK cascade), and sync concurrency/unknown source/id validation cases.

test/server/routes/sources.test.ts

Documentation (1) +10 / -10
tasks.mdMark Skills Catalog and Source Management tasks complete +10/-10

Mark Skills Catalog and Source Management tasks complete

• Checks off tasks 5.1–5.3, 5.5–5.7 and 6.1–6.4 in the OpenSpec task tracker to reflect the implemented endpoints and tests.

openspec/changes/rhess-enterprise-skills-server/tasks.md

@qodo-code-review

qodo-code-review Bot commented Jun 19, 2026

Copy link
Copy Markdown

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX issues (0) 🔗 Cross-repo conflicts (0) 📜 Skill insights (0)

Grey Divider


Action required

1. Sync guard not atomic ✓ Resolved 🐞 Bug ☼ Reliability
Description
POST /api/v1/sources/:id/sync uses a non-atomic check-then-set (`if (source.syncStatus ===
"syncing") then updateSync("syncing")`), so two concurrent requests can both pass the check and
start ingestion.
Code

src/server/routes/sources.ts[R139-146]

+    if (source.syncStatus === "syncing") {
+      return reply.code(409).send({
+        error: { code: "SYNC_IN_PROGRESS", message: "A sync is already in progress for this source" },
+      });
+    }
+
+    repos.sources.updateSync({ id, status: "syncing" });
+
Evidence
The route’s concurrency check is a separate read before an unconditional update; the repository
update does not guard against concurrent updates, so two requests can interleave and both start
syncing.

src/server/routes/sources.ts[132-146]
src/server/db/SqliteSourceRepository.ts[52-56]
src/server/db/SqliteSourceRepository.ts[82-91]
openspec/changes/rhess-enterprise-skills-server/specs/skill-source-management/spec.md[60-63]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The sync-in-progress protection is racy: two requests can read `sync_status=idle` and both proceed to set it to `syncing`. This violates the spec scenario “Concurrent sync rejected” (409).

### Issue Context
`SqliteSourceRepository.updateSync()` performs an unconditional UPDATE, so there is no DB-level guard.

### Fix Focus Areas
- src/server/routes/sources.ts[123-158]
- src/server/db/SqliteSourceRepository.ts[52-56]
- src/server/db/SqliteSourceRepository.ts[82-91]
- openspec/changes/rhess-enterprise-skills-server/specs/skill-source-management/spec.md[60-63]

### Implementation notes
- Add a repo method like `tryStartSync(id): boolean` implemented as:
 - `UPDATE sources SET sync_status='syncing', sync_error=NULL WHERE id=? AND sync_status!='syncing'`
 - check `changes === 1`; if 0, return 409 `SYNC_IN_PROGRESS`.
- Use the atomic method in the route instead of check-then-set.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Skill files not returned ✓ Resolved 🐞 Bug ≡ Correctness
Description
GET /api/v1/skills/:source/:slug always returns a single files entry for SKILL.md and uses
skill.content as the contents, so supporting files are never returned and archive skills will
return base64 tar.gz data as if it were SKILL.md.
Code

src/server/routes/skills.ts[R117-119]

+      const files: Array<{ path: string; contents: string }> = [
+        { path: "SKILL.md", contents: skill.content },
+      ];
Evidence
The route currently hardcodes a single SKILL.md entry and ignores supporting files, while the spec
requires the complete file tree and ingestion stores base64 tar.gz for archive skills.

src/server/routes/skills.ts[117-130]
openspec/changes/rhess-enterprise-skills-server/specs/skills-catalog-api/spec.md[25-33]
src/server/ingestion/bundle.ts[7-13]
src/server/ingestion/bundle.ts[53-57]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`GET /api/v1/skills/:source/:slug` must return the complete `files` array (SKILL.md + all supporting files) per OpenSpec. Current implementation returns only `SKILL.md` and uses `skill.content` as its contents, which is incorrect for `artifactType: "archive"` because ingestion stores base64-encoded tar.gz.

### Issue Context
- Spec requires `files` to include SKILL.md and all supporting files.
- Ingestion bundles multi-file skills as a tar.gz archive and stores it as base64 in `content`.

### Fix Focus Areas
- src/server/routes/skills.ts[101-131]
- src/server/ingestion/bundle.ts[7-57]
- src/server/db/types.ts[11-27]
- openspec/changes/rhess-enterprise-skills-server/specs/skills-catalog-api/spec.md[25-33]

### Implementation notes
- If `skill.artifactType === "skill-md"`: return `[{path:"SKILL.md", contents: skill.content}]` (as today).
- If `skill.artifactType === "archive"`: decode `skill.content` from base64, extract tar.gz entries, and return `files: [{path, contents}, ...]` (ensure SKILL.md is included).
- Update/clarify the `Skill.content` docstring in `src/server/db/types.ts` since it is not always “Raw SKILL.md content”.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

3. Search can return missing id ✓ Resolved 🐞 Bug ☼ Reliability
Description
GET /api/v1/skills/search sets id: skill?.id after a DB lookup per search hit, so during periods
when the in-memory search index is stale relative to the DB (e.g., after atomicSwap but before
rebuildSearchIndex) the API can emit search results with id undefined.
Code

src/server/routes/skills.ts[R31-41]

+      const results = search.search(q.trim());
+      const data = results.map((r) => {
+        const skill = skills.findBySourceAndSlug(r.sourceSlug, r.slug);
+        return {
+          id: skill?.id,
+          source: r.sourceSlug,
+          slug: r.slug,
+          name: r.name,
+          description: r.description,
+          score: r.score,
+        };
Evidence
The search route makes id optional via skill?.id, and ingestion updates the DB in a transaction
before the index rebuild step runs, creating a real window where index entries may not map to DB
rows.

src/server/routes/skills.ts[31-41]
src/server/ingestion/ingest.ts[95-117]
src/server/routes/sources.ts[147-151]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The search endpoint can return results with missing `id` because it cross-references the DB for every Fuse hit, and the index rebuild happens after DB changes.

### Issue Context
- Search response uses `id: skill?.id`.
- Sync ingestion updates DB via `atomicSwap()`; index rebuild happens only after ingestion completes.

### Fix Focus Areas
- src/server/routes/skills.ts[31-45]
- src/server/routes/sources.ts[147-151]
- src/server/ingestion/ingest.ts[95-117]
- src/server/search/types.ts[1-21]
- src/server/search/FuseSearchProvider.ts[4-26]

### Implementation notes
Preferred: include `id` in the search index items/results (extend `SearchIndexItem`/`SearchResult`) so the search route doesn’t need a DB lookup.
Alternative: if keeping the DB lookup, filter out hits where `skill` is missing to avoid returning `id: undefined`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


4. Sync timestamps become incorrect ✓ Resolved 🐞 Bug ≡ Correctness
Description
updateSync() sets last_synced_at for any status other than "syncing" (including "error") and
sets it to null when entering "syncing", so this route will overwrite the last successful sync
timestamp on both sync start and sync failure.
Code

src/server/routes/sources.ts[R145-155]

+    repos.sources.updateSync({ id, status: "syncing" });
+
+    try {
+      const syncReport = await ingestSource(source.id, source.slug, source.url, repos);
+      repos.sources.updateSync({ id, status: "idle", error: null });
+      if (searchProvider) rebuildSearchIndex(repos, searchProvider);
+      return reply.code(200).send(syncReport);
+    } catch (err) {
+      const message = err instanceof Error ? err.message : String(err);
+      repos.sources.updateSync({ id, status: "error", error: message });
+      return reply.code(422).send({
Evidence
The route explicitly sets statuses syncing/error, and the repository implementation writes
last_synced_at on error and clears it on syncing, causing incorrect timestamps.

src/server/routes/sources.ts[145-157]
src/server/db/SqliteSourceRepository.ts[82-91]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Source sync metadata is incorrect:
- entering `syncing` clears `last_synced_at`
- failing a sync (`status='error'`) updates `last_synced_at` even though the sync was not successful

### Issue Context
The sync route uses `repos.sources.updateSync({status:'syncing'|'idle'|'error'})` and the repository sets `last_synced_at` based only on status!=syncing.

### Fix Focus Areas
- src/server/db/SqliteSourceRepository.ts[82-91]
- src/server/routes/sources.ts[145-157]

### Implementation notes
- Preserve existing `last_synced_at` when transitioning to `syncing`.
- Update `last_synced_at` only when a sync succeeds (i.e., when setting status back to `idle` after successful ingestion).
- Do not update `last_synced_at` when setting status to `error`.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Ingest errors mislabeled ✓ Resolved 🐞 Bug ◔ Observability
Description
POST /api/v1/sources returns {code:"CLONE_FAILED"} even when the clone succeeded but ingestion
failed, which makes client error handling and alerting ambiguous.
Code

src/server/routes/sources.ts[R90-96]

+    } catch (err) {
+      // Ingestion itself failed — roll back the source record.
+      repos.sources.delete(source.id);
+      const message = err instanceof Error ? err.message : String(err);
+      return reply.code(422).send({
+        error: { code: "CLONE_FAILED", message },
+      });
Evidence
The code path explicitly handles ingestion failure after a successful clone, but still returns the
CLONE_FAILED error code.

src/server/routes/sources.ts[78-97]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The registration endpoint conflates ingestion failures with clone failures by always returning `CLONE_FAILED` from the ingestion catch block.

### Issue Context
This catch block is specifically labeled as an ingestion failure rollback.

### Fix Focus Areas
- src/server/routes/sources.ts[78-97]

### Implementation notes
- Return a distinct code for ingestion failures (e.g., `INGEST_FAILED`) and keep `CLONE_FAILED` strictly for clone errors.
- Consider preserving the rollback behavior but improving error classification.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (1)
6. Clone test is non-hermetic ✓ Resolved 🐞 Bug ☼ Reliability
Description
The new clone-failure test triggers a real git clone to an invalid domain, making the test suite
dependent on external networking/DNS behavior and potentially slow/flaky in CI.
Code

test/server/routes/sources.test.ts[R93-113]

+  it("6.4.3 — clone failure → 422 CLONE_FAILED, no orphan source in DB", async () => {
+    const countBefore = repos.sources.findAll().length;
+
+    const res = await app.inject({
+      method: "POST",
+      url: "/api/v1/sources",
+      headers: { "content-type": "application/json" },
+      payload: {
+        url: "https://rhess-test-nonexistent.invalid/repo.git",
+        slug: "clone-fail-test",
+      },
+    });
+
+    expect(res.statusCode).toBe(422);
+    const body = res.json();
+    expect(body.error.code).toBe("CLONE_FAILED");
+
+    // Source record must have been deleted — no orphan
+    expect(repos.sources.findAll().length).toBe(countBefore);
+    expect(repos.sources.findBySlug("clone-fail-test")).toBeUndefined();
+  }, 30_000);
Evidence
The test sends a real remote URL and the clone implementation invokes simple-git clone; without
mocking, test behavior depends on network environment.

test/server/routes/sources.test.ts[93-113]
src/server/ingestion/clone.ts[6-19]
src/server/routes/sources.ts[65-76]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The test `6.4.3 — clone failure` depends on making a real clone attempt to a network URL, which can flake or time out in restricted CI environments.

### Issue Context
The route calls `clone()` which calls `simpleGit().clone()`; the test does not stub/mocking this call.

### Fix Focus Areas
- test/server/routes/sources.test.ts[93-113]
- src/server/ingestion/clone.ts[6-19]
- src/server/routes/sources.ts[65-76]

### Implementation notes
- Use `vi.mock()` to mock `../ingestion/clone.js` and force it to throw a deterministic error.
- Alternatively, refactor `sourcesPlugin` to accept a `cloneFn` dependency for easier testing.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment thread src/server/routes/skills.ts Outdated
Comment thread src/server/routes/sources.ts Outdated
johnmcollier and others added 2 commits June 19, 2026 16:43
…ic sync, timestamps, ingest label, hermetic test

- findAllUnpaged(): new SkillRepository method for index builds; bypasses the
  100-row public pagination cap that made findAll({perPage:10000}) silently
  incomplete. Used in startup and every rebuildSearchIndex() call.

- id in SearchIndexItem/SearchResult: embed the skill DB id directly in the
  Fuse index so the search route no longer does a secondary DB lookup per hit
  (which could return undefined when the index is stale). Both search types.ts
  and FuseSearchProvider updated.

- Archive detail: GET /skills/:source/:slug now correctly expands archive-type
  skills (base64 tar.gz) into files:[{path,contents}] on demand instead of
  returning the raw base64 blob as SKILL.md content.

- Atomic sync guard: replace non-atomic check-then-set with
  trySetSyncing() — a single SQL UPDATE...WHERE sync_status != 'syncing' that
  returns the affected row count. Two concurrent requests can no longer both
  win the race and start ingestion.

- Sync timestamps: updateSync now uses a CASE expression so last_synced_at is
  only written on successful completion (status='idle'); entering 'syncing' or
  transitioning to 'error' no longer overwrites or clears the last good timestamp.

- INGEST_FAILED: ingestion failures after a successful clone now return
  {code:"INGEST_FAILED"} instead of CLONE_FAILED, making client error handling
  and alerting unambiguous.

- Hermetic clone test: vi.mock() replaces the real git clone with a
  deterministic rejection so the test suite doesn't require external networking.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@johnmcollier johnmcollier merged commit 88550a9 into redhat-ai-dev:main Jun 19, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant