From 35ccfa7e87e7f2d4091bf806ea53a7b28c0bd270 Mon Sep 17 00:00:00 2001
From: Karina Barbara Kalicka-Molin
Date: Thu, 14 May 2026 18:14:18 +0200
Subject: [PATCH 1/5] feat(skills): add manage-skills skill (RES-793)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

CRUD workflow for the orq.ai Skills entity (list/get/create/update/delete via list_skills, get_skill, create_skill, update_skill, delete_skill — backed by /v2/skills REST endpoints), plus authoring guidance, governance, and workarounds for known platform caveats.

Includes:
- SKILL.md with 5 phases (list / get / create / update / delete + orphan cleanup). Phase 5 uses a warn-then-offer flow with two explicit consents (one for delete, one for the orphan-cleanup pass) — never auto-prunes.
- resources/authoring-guide.md — naming, description, tags, project scoping
- resources/governance-guide.md — wiring Skills to agents via agent.skills[], ownership, lifecycle, audit checklist
- resources/known-caveats.md — INN-2861 (orphaned skill ids in agent.skills[] after DELETE /v2/skills/{id}), INN-2836 (empty skill.version AND unstamped skill.doc after snippet→skill migration), ENG-1604 (+NEVER+ prose constraints treated as soft suggestions; recommends MCP tool gates)
- /manage-skills slash command — routes to phases by argument
- AGENTS.md, README.md skills table, and tests/skills.md + tests/commands.md smoke-test scenarios

Version bumped 0.0.2 → 0.1.0 across all 4 plugin manifests (MINOR per CLAUDE.md rules for a new skill). CHANGELOG.md entry added.
--- .claude-plugin/plugin.json | 2 +- .codex-plugin/plugin.json | 2 +- .cursor-plugin/plugin.json | 2 +- CHANGELOG.md | 9 + README.md | 1 + agents/AGENTS.md | 3 + commands/manage-skills.md | 41 ++++ skills/manage-skills/SKILL.md | 200 ++++++++++++++++++ .../resources/authoring-guide.md | 98 +++++++++ .../resources/governance-guide.md | 104 +++++++++ .../manage-skills/resources/known-caveats.md | 124 +++++++++++ tests/commands.md | 8 + tests/skills.md | 41 ++++ 13 files changed, 632 insertions(+), 3 deletions(-) create mode 100644 commands/manage-skills.md create mode 100644 skills/manage-skills/SKILL.md create mode 100644 skills/manage-skills/resources/authoring-guide.md create mode 100644 skills/manage-skills/resources/governance-guide.md create mode 100644 skills/manage-skills/resources/known-caveats.md diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index 59afe96..b7684a9 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -1,6 +1,6 @@ { "name": "orq", - "version": "0.0.2", + "version": "0.1.0", "description": "Agent skills for building, deploying, evaluating, and monitoring LLM pipelines on the orq.ai platform.", "author": { "name": "orq.ai", diff --git a/.codex-plugin/plugin.json b/.codex-plugin/plugin.json index 519d7c3..38a41b4 100644 --- a/.codex-plugin/plugin.json +++ b/.codex-plugin/plugin.json @@ -1,6 +1,6 @@ { "name": "orq", - "version": "0.0.2", + "version": "0.1.0", "description": "Agent skills for building, deploying, evaluating, and monitoring LLM pipelines on the orq.ai platform.", "author": { "name": "orq.ai", diff --git a/.cursor-plugin/plugin.json b/.cursor-plugin/plugin.json index 3a175c1..99c7f64 100644 --- a/.cursor-plugin/plugin.json +++ b/.cursor-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "orq", "displayName": "orq.ai", - "version": "0.0.2", + "version": "0.1.0", "description": "Agent skills for building, deploying, evaluating, and monitoring LLM pipelines on the orq.ai platform.", "author": { 
"name": "orq.ai", diff --git a/CHANGELOG.md b/CHANGELOG.md index de563f9..6a6ebe7 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,15 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.1.0] - 2026-05-14 + +### Added +- `manage-skills` skill — CRUD workflow for the orq.ai Skills entity (list, get, create, update, delete) plus authoring guidance (naming, description, tags, project scoping), governance (wiring Skills to agents via `agent.skills[]`), and platform-caveat workarounds. +- `manage-skills`: warn-then-offer flow for the post-delete orphan-reference cleanup pass (INN-2861 workaround) — never auto-prunes `agent.skills[]`, always asks for separate explicit consent before writing to referencing agents. +- `manage-skills`: defensive handling for INN-2836 — treats empty/missing `skill.version` *and* unstamped `skill.doc` (both side-effects of the Snippet→Skill migration) as valid states (surfaced as `(unset)`). +- `manage-skills`: anti-pattern guidance against `+NEVER+` / "you MUST refuse" prose constraints in Skill bodies (ENG-1604) — recommends MCP tool gates for hard guardrails. +- `/manage-skills` slash command — routes to list/get/create/update/delete phases. + ## [0.0.2] - 2026-04-21 ### Added diff --git a/README.md b/README.md index d6e4b17..d2b4ab6 100644 --- a/README.md +++ b/README.md @@ -171,6 +171,7 @@ Skills are triggered by describing what you need. 
Claude picks the right skill a | **compare-agents** | Run cross-framework agent comparisons using evaluatorq — compare orq.ai, LangGraph, CrewAI, OpenAI Agents SDK, and others | [SKILL.md](skills/compare-agents/SKILL.md) | | **generate-synthetic-dataset** | Generate and curate evaluation datasets — structured generation, quick from description, expansion, and dataset maintenance | [SKILL.md](skills/generate-synthetic-dataset/SKILL.md) | | **optimize-prompt** | Analyze and optimize system prompts using a structured prompting guidelines framework | [SKILL.md](skills/optimize-prompt/SKILL.md) | +| **manage-skills** | Manage orq.ai Skills (the platform entity) — list/get/create/update/delete, authoring guidance, governance (`agent.skills[]` wiring), and platform-caveat workarounds | [SKILL.md](skills/manage-skills/SKILL.md) | --- diff --git a/agents/AGENTS.md b/agents/AGENTS.md index e31b6da..ee22a73 100644 --- a/agents/AGENTS.md +++ b/agents/AGENTS.md @@ -14,6 +14,7 @@ These skills are: - compare-agents -> "skills/compare-agents/SKILL.md" - generate-synthetic-dataset -> "skills/generate-synthetic-dataset/SKILL.md" - invoke-deployment -> "skills/invoke-deployment/SKILL.md" + - manage-skills -> "skills/manage-skills/SKILL.md" - optimize-prompt -> "skills/optimize-prompt/SKILL.md" - run-experiment -> "skills/run-experiment/SKILL.md" - setup-observability -> "skills/setup-observability/SKILL.md" @@ -40,6 +41,8 @@ compare-agents: `Run cross-framework agent comparisons using evaluatorq — comp setup-observability: `Set up orq.ai observability for LLM applications — AI Router proxy, OpenTelemetry, tracing setup, and trace enrichment. Use when setting up tracing, adding the AI Router proxy, integrating OpenTelemetry, auditing existing instrumentation, or enriching traces with metadata. 
Do NOT use when traces already exist and you need to debug failures (use analyze-trace-failures).` +manage-skills: `Manage orq.ai Skills (the platform entity) end-to-end — list, get, create, update, and delete Skills, plus authoring guidance (naming, description, tags, project scoping), governance (wiring Skills to agents via agent.skills[]), and workarounds for known platform caveats (orphaned references, empty version, +NEVER+ prose anti-pattern). Use when the user wants to create, audit, edit, retire, or wire up orq.ai Skills.` + Paths referenced within SKILL folders are relative to that SKILL. For example the build-evaluator `resources/judge-prompt-template.md` would be referenced as `skills/build-evaluator/resources/judge-prompt-template.md`. diff --git a/commands/manage-skills.md b/commands/manage-skills.md new file mode 100644 index 0000000..f8891e7 --- /dev/null +++ b/commands/manage-skills.md @@ -0,0 +1,41 @@ +--- +description: Manage orq.ai Skills — list, get, create, update, or delete Skills (the platform entity) and wire them to agents +argument-hint: [list|get|create|update|delete] [name-or-id] +allowed-tools: AskUserQuestion, orq* +--- + +# Manage Skills + +Quick entry point into the `manage-skills` skill. Routes to the right phase based on the first argument, or asks if no argument is given. + +## Instructions + +### 1. Parse arguments + +`$ARGUMENTS` may contain an action and optionally a Skill name/id: + +- `list` — Phase 1 (list / audit) +- `get ` — Phase 2 (inspect a Skill) +- `create` — Phase 3 (create a new Skill) +- `update ` — Phase 4 (edit an existing Skill) +- `delete ` — Phase 5 (delete + orphan cleanup) + +If `$ARGUMENTS` is empty, ask the user which action they want via `AskUserQuestion` and offer the five choices above. + +If `$ARGUMENTS` contains an action that requires a name/id but none was provided (e.g., `get`, `update`, `delete`), call `list_skills` first and ask the user to pick. + +### 2. 
Delegate to `manage-skills` + +Read `skills/manage-skills/SKILL.md` and execute the matching phase. Pass the parsed name/id along. + +### 3. Safety rails + +- **Never** auto-execute `delete_skill` from this command — always route through Phase 5's two-step warn-then-confirm flow. +- **Never** auto-prune `agent.skills[]` after a delete — Phase 5 offers it as a separate consent. +- **Always** confirm project scope before `create_skill`. + +### 4. Error handling + +- **Auth errors** — "Authentication failed. Check that your `ORQ_API_KEY` is valid." +- **Skill-tool unavailable** — "The orq MCP server doesn't expose `*_skill` tools in this workspace. Falling back to REST `/v2/skills` — confirm before proceeding." +- **MCP unreachable** — "Could not reach the orq.ai MCP server. Make sure it's configured: `claude mcp add --transport http orq-workspace https://my.orq.ai/v2/mcp --header 'Authorization: Bearer ${ORQ_API_KEY}'`" diff --git a/skills/manage-skills/SKILL.md b/skills/manage-skills/SKILL.md new file mode 100644 index 0000000..57ef97e --- /dev/null +++ b/skills/manage-skills/SKILL.md @@ -0,0 +1,200 @@ +--- +name: manage-skills +description: > + Manage orq.ai Skills (the platform entity) end-to-end — list, get, create, + update, and delete Skills, plus authoring guidance (naming, description, + tags, project scoping), governance (wiring Skills to agents via + `agent.skills[]`), and workarounds for known platform caveats. Use when + the user wants to create, audit, edit, retire, or wire up orq.ai Skills. +allowed-tools: Bash, Read, Write, Edit, Grep, Glob, WebFetch, Task, AskUserQuestion, orq* +--- + +# Manage Skills + +You are an **orq.ai Skills lifecycle specialist**. Your job is the full CRUD workflow for the **Skills entity on the orq.ai platform** — not the SKILL.md files in this repo, but the user-authored Skills that live in their orq.ai workspace and get attached to agents via `agent.skills[]`. 
+ +A well-managed Skill is: +- **Discoverable** — name and description make it obvious when to apply it +- **Scoped** — tagged and assigned to the right project (workspace-wide only when truly reusable) +- **Versioned** — changes go through update flows, not silent overwrites +- **Wired** — referenced from `agent.skills[]` on every agent that should use it +- **Pruned** — orphan references and stale versions get cleaned up + +## When to use + +- User wants to list, audit, or search Skills in their workspace +- User wants to create a new Skill on orq.ai +- User wants to edit an existing Skill's description, tags, or body +- User wants to delete a Skill and clean up references on agents +- User asks how to attach a Skill to an agent (`agent.skills[]`) +- User asks for naming, tagging, or scoping guidance for Skills +- User hits the orphaned-reference bug (INN-2861) after deleting a Skill +- User reads a Skill programmatically and gets an empty `version` or unstamped `doc` (INN-2836) + +## When NOT to use + +- **Need to build the agent itself?** → `build-agent` +- **Need to invoke a deployment or agent?** → `invoke-deployment` +- **Need to evaluate the agent that uses the Skill?** → `run-experiment` +- **Need to optimize the prompt/instructions inside a Skill?** → `optimize-prompt` +- **Debugging why an agent ignored a Skill?** → `analyze-trace-failures` + +## Companion Skills + +- `build-agent` — create or edit the agents that reference these Skills +- `optimize-prompt` — improve a Skill's body/instructions before saving +- `run-experiment` — verify a Skill change improves agent behavior +- `analyze-trace-failures` — diagnose Skills that aren't firing in production + +## Constraints + +- **ALWAYS** confirm the project scope (workspace-wide vs project-scoped) before `create_skill`. Default to project-scoped unless the user is explicit. +- **ALWAYS** read the current Skill with `get_skill` before `update_skill` — never blind-overwrite tags, description, or body. 
+- **ALWAYS** warn the user about orphaned references after `delete_skill`, then **offer** the orphan-cleanup pass — never auto-prune without explicit consent (the pass writes to other agents and is not the user's literal delete request). Tracked upstream as INN-2861. +- **ALWAYS** treat empty `skill.version` and unstamped `skill.doc` as valid (INN-2836) — fall back to `null`/sentinel for both, never crash. +- **NEVER** rely on `+NEVER+` (or any prose negation) inside a Skill body as a hard guardrail. Skill bodies are *soft* hints to the model (tracked upstream as ENG-1604). Hard constraints belong in **MCP tool gates** (refuse the call at the tool layer). See [resources/known-caveats.md](resources/known-caveats.md). +- **NEVER** delete a Skill before listing which agents reference it — the user needs that list to decide whether to proceed. + +**Why these constraints:** Skills are shared infrastructure. A bad name pollutes the workspace, a missing project tag leaks Skills across teams, an unscoped overwrite loses someone else's edits, and an undeleted reference produces broken agents that fail at runtime — not at delete time. 
+ +## orq.ai Documentation + +> **Skills overview:** https://docs.orq.ai/docs/skills/overview +> **Skills API:** https://docs.orq.ai/reference/skills +> **Wiring Skills to Agents:** https://docs.orq.ai/docs/agents/skills + +### orq MCP Tools + +| Tool | Purpose | +|------|---------| +| `list_skills` | List Skills, filterable by project and tags | +| `get_skill` | Fetch a single Skill by id or key (with version, body, tags) | +| `create_skill` | Create a new Skill (name, description, tags, project, body) | +| `update_skill` | Patch an existing Skill (description, tags, body, version) | +| `delete_skill` | Delete a Skill — **does NOT prune `agent.skills[]` references** (see INN-2861) | +| `search_entities` | Cross-entity search; use `type: "agent"` to find agents that reference a Skill | +| `get_agent` / `update_agent` | Required for the post-delete orphan-cleanup workflow | + +> **Tool discovery:** Before the first run, list the connected MCP server's tools (e.g., `/mcp` in Claude Code, or inspect via the client) and confirm the `*_skill` tools above exist. Tool names sometimes vary by workspace or MCP server version. +> +> **REST fallback:** All five tools are backed by REST endpoints (verified against `/openapi/openapi.json`): `GET /v2/skills` (SkillList), `GET /v2/skills/{skill_id}` (SkillGet), `POST /v2/skills` (SkillCreate), `PATCH /v2/skills/{skill_id}` (SkillUpdate), `DELETE /v2/skills/{skill_id}` (SkillDelete). Use these directly with `Authorization: Bearer ${ORQ_API_KEY}` if the MCP tools aren't exposed in the connected workspace. 
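The REST fallback above can be sketched as a small request builder. This is a minimal illustration, not the platform's client: the method/path pairs come from the note above, while `ORQ_BASE_URL`, its placeholder default, and the helper name `build_skill_request` are assumptions introduced here. The function only constructs the request; nothing is sent.

```python
import json
import os
import urllib.request

# Assumption: the REST base URL is not stated in this skill. The default below
# is a placeholder; set ORQ_BASE_URL to the host your workspace actually uses.
ORQ_BASE_URL = os.environ.get("ORQ_BASE_URL", "https://my.orq.ai")

# Operation -> (HTTP method, path template), as listed in the REST fallback note.
SKILL_ENDPOINTS = {
    "list": ("GET", "/v2/skills"),
    "get": ("GET", "/v2/skills/{skill_id}"),
    "create": ("POST", "/v2/skills"),
    "update": ("PATCH", "/v2/skills/{skill_id}"),
    "delete": ("DELETE", "/v2/skills/{skill_id}"),
}


def build_skill_request(operation, api_key, skill_id=None, payload=None,
                        base_url=ORQ_BASE_URL):
    """Build (but do not send) a urllib Request for one Skills operation."""
    method, template = SKILL_ENDPOINTS[operation]
    if "{skill_id}" in template and not skill_id:
        raise ValueError(f"{operation} requires a skill_id")
    url = base_url.rstrip("/") + template.format(skill_id=skill_id)
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if payload is not None:
        # Only create/update carry a JSON body.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Passing the result to `urllib.request.urlopen` would perform the call. Keeping construction separate from sending makes it easy to show the user the exact method and URL before confirming a destructive operation such as `delete`.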
+ +## Resources + +- **Authoring guide** (naming, description, tags, project scoping): See [resources/authoring-guide.md](resources/authoring-guide.md) +- **Governance** (wiring `agent.skills[]`, ownership, lifecycle): See [resources/governance-guide.md](resources/governance-guide.md) +- **Known caveats** (INN-2861, INN-2836, ENG-1604 anti-pattern): See [resources/known-caveats.md](resources/known-caveats.md) + +## Prerequisites + +- The orq.ai MCP server is connected (`/orq:quickstart` to verify) +- `ORQ_API_KEY` is set +- The user knows which **project** the Skill belongs to (run `search_directories` if not) + +--- + +## Workflow + +Pick the phase that matches the user's intent. Most sessions are a single phase; the **delete** phase always pairs with the orphan-cleanup workflow. + +### Phase 1: List / audit + +Use when the user wants visibility into existing Skills. + +1. Call `list_skills`. Optional filters from the user: + - `project` — narrow to a single project's Skills + - `tags` — filter by one or more tags + - `q` / `name` — substring match on name +2. Present a scannable table: + ``` + Skills (12) + - customer-support-tone (project: cs, tags: tone, voice) — v3 + - extract-receipt-fields (project: finance, tags: extraction) — v1 + - refund-policy (workspace-wide, tags: policy, cs) — v2 ⚠ used by 4 agents + ... + ``` +3. For each Skill, surface: name, project (or "workspace-wide"), tags, latest version, and reference count (run `search_entities` with `type: "agent"` and filter agents whose `skills[]` includes this Skill's id — cache the result for the session). +4. If the user asks "which Skill should I edit?" — show the list with usage counts and let them pick. + +### Phase 2: Get / inspect + +Use before any update or delete, and whenever the user asks "what does Skill X do?" + +1. Call `get_skill(id_or_key=...)`. +2. 
Display: name, description, tags, project, version, body (truncated), and the list of agents that reference it (`search_entities` + filter on `skills[]`). +3. **Empty version / doc handling (INN-2836):** if `skill.version` is missing/empty, display `version: (unset)`. If `skill.doc` is missing/empty (common for Skills migrated from snippets), display `doc: (unset)`. Do not error on either. + +### Phase 3: Create + +Use when the user wants a new Skill. + +1. **Gather inputs** via `AskUserQuestion`: + - **Name** (kebab-case, ≤50 chars, verb-noun preferred — see [authoring-guide](resources/authoring-guide.md)) + - **Description** — one sentence describing *when the model should use the Skill*, not what it does internally (model uses this for retrieval/selection) + - **Tags** — at least one functional tag; reuse existing tags where possible (`list_skills` to see in-use tags) + - **Project scope** — project key OR workspace-wide. Default to **project-scoped**; confirm before going workspace-wide. + - **Body** — the actual instructions/content. Keep it focused on one capability. +2. **Validate** before submitting: + - Name is unique within the chosen scope (check via `list_skills`) + - Description starts with "Use when…" or describes a trigger condition + - Body does NOT rely on `+NEVER+` / "always refuse" prose for hard guardrails — link the user to [known-caveats](resources/known-caveats.md) (ENG-1604) and recommend an MCP tool gate instead +3. Call `create_skill` with the validated payload. +4. Echo back the new Skill's id, version, and a one-line summary. Ask whether to wire it into any agents now (jumps to [governance](resources/governance-guide.md)). + +### Phase 4: Update + +Use when the user wants to edit an existing Skill. + +1. **Always `get_skill` first.** Show the current state and confirm the diff the user is about to apply. +2. 
**Patch fields explicitly.** Only send the fields being changed (`update_skill` is a patch — don't echo back unchanged tags or body unless you have to). +3. **Body changes:** if the user is rewriting the instructions, route through `optimize-prompt` first to catch unclear language and remove `+NEVER+`-style soft constraints (ENG-1604). +4. **Version bumps:** `update_skill` typically increments `version` automatically. If the workspace handles versioning manually, ask the user whether this is a patch/minor/major change. +5. **Verify** by calling `get_skill` post-update and confirming the change landed. +6. If any agent references this Skill, mention that the next agent run will pick up the new version automatically — no `update_agent` needed. + +### Phase 5: Delete + orphan cleanup + +Use when the user wants to retire a Skill. **This is the most error-prone phase — follow every step.** + +The delete itself is one action; the orphan-cleanup pass is a separate, opt-in action that writes to other agents. Confirm them independently. + +1. **List referencing agents.** Run `search_entities` with `type: "agent"`, then filter to agents whose `skills[]` includes the target Skill's id. Capture this list now — once the Skill is deleted, resolving its id back to a name gets harder. +2. **Warn and confirm the delete.** Show the user: + - The Skill's name, id, and project scope + - The list of N agents that reference it (or "no referencing agents found") + - The INN-2861 caveat: agents will retain a dangling id until pruned + Ask: *"Delete this Skill? (You'll be asked separately whether to prune orphan references on the N agents.)"* Default to **cancel** if the user hesitates. +3. **Delete.** On confirmation, call `delete_skill(id=...)`. +4. **Offer the orphan-cleanup pass.** Only if step 1 found ≥1 referencing agents, ask: *"Prune the deleted Skill's id from these N agents' `skills[]` arrays now?"* This is a **second, explicit consent** — never auto-prune. 
+ - If the user says **yes**, run the cleanup: + ``` + skill_id = + for agent in referencing_agents: + current = get_agent(key=agent.key) + pruned = [s for s in current.skills if id_of(s) != skill_id] + update_agent(key=agent.key, skills=pruned) + # verify: re-get and confirm skill_id is gone + ``` + Verify each `update_agent` returns success before moving on. + - If the user says **no** or wants to defer, summarize the orphan list and recommend they prune later (or hand off to whoever owns those agents). +5. **Report.** Summarize: Skill deleted; orphan cleanup either completed (N agents pruned) or skipped (N agents still reference the deleted id — list them). +6. Note the INN-2861 workaround in your reply so the user understands *why* the extra step exists. + +See [resources/known-caveats.md](resources/known-caveats.md) for the full caveat context. + +--- + +## Done When + +- The user's intent (list / get / create / update / delete) is fully resolved +- Any `delete_skill` was followed by an offered (and either completed or explicitly deferred) orphan-cleanup pass — never silently skipped +- Body changes routed through `optimize-prompt` if they introduce or remove `+NEVER+`-style prose constraints (ENG-1604) +- New or updated Skills have a non-empty description, at least one tag, and an explicit project scope +- The user has a clear pointer to where the Skill is wired in (or a follow-up step to wire it) + +## Open in orq.ai + +- **Skills index:** [my.orq.ai](https://my.orq.ai/) → Skills +- **Agent skill bindings:** [my.orq.ai](https://my.orq.ai/) → Agents → (select agent) → Skills tab + +When this skill conflicts with live API responses or docs.orq.ai, trust the API. 
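The Phase 5 cleanup loop can be sketched in Python as follows. This is an illustration under stated assumptions, not the MCP tool interface: `get_agent` and `update_agent` are passed in as plain callables standing in for the real tools, agents are modelled as dicts with a `skills` list, and entries are assumed to be either bare id strings or objects with an `id` field. Entries that reference a Skill by `key` would need a different comparison.

```python
def id_of(entry):
    """Return the Skill id from a skills[] entry (bare string or object with 'id')."""
    if isinstance(entry, str):
        return entry
    return entry.get("id")


def prune_orphans(skill_id, agent_keys, get_agent, update_agent):
    """Remove skill_id from each listed agent's skills[].

    get_agent(key) -> dict with a 'skills' list; update_agent(key, skills) writes it back.
    Both are hypothetical stand-ins for the MCP tools. Returns the keys of agents
    whose prune was written and then verified by a second read.
    """
    verified = []
    for key in agent_keys:
        current = get_agent(key)
        kept = [s for s in current["skills"] if id_of(s) != skill_id]
        if len(kept) == len(current["skills"]):
            continue  # no reference found; leave this agent untouched
        update_agent(key, kept)
        # Verify: re-read the agent and confirm the id is gone.
        after = get_agent(key)
        if all(id_of(s) != skill_id for s in after["skills"]):
            verified.append(key)
    return verified
```

Agents that never held the reference are skipped rather than rewritten, so the pass only writes where it has to. Any referencing agent missing from the returned list is one whose prune did not verify and should be reported to the user.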
diff --git a/skills/manage-skills/resources/authoring-guide.md b/skills/manage-skills/resources/authoring-guide.md new file mode 100644 index 0000000..94f028c --- /dev/null +++ b/skills/manage-skills/resources/authoring-guide.md @@ -0,0 +1,98 @@ +# Authoring Guide: Naming, Description, Tags, Project Scoping + +How to author an orq.ai Skill so it's discoverable, scoped correctly, and picked up by the right agents. + +--- + +## Naming + +The Skill name is the primary handle agents and humans use to refer to the Skill. It should be unambiguous on its own. + +**Rules:** +- **kebab-case**, lowercase, ASCII only — e.g., `extract-receipt-fields` +- **≤50 characters** — long names get truncated in agent configs and UI tables +- **Verb-noun preferred** — `summarize-ticket`, `classify-intent`, `extract-pii` +- **Avoid generic verbs** alone — `handle-thing`, `do-task`, `process` say nothing +- **No version suffixes in the name** — `summarize-ticket-v2` is an anti-pattern; the platform tracks versions on the Skill itself +- **Unique within scope** — names must be unique within a project (and across the workspace for workspace-wide Skills) + +**Good:** +- `extract-invoice-line-items` +- `redact-pii-from-transcript` +- `format-currency-eur` + +**Bad:** +- `helper` (too vague) +- `MySkill_v2` (camelCase + version suffix) +- `the-skill-that-handles-customer-support-emails-with-tone-checking` (too long) + +--- + +## Description + +The description is what the **model** reads when deciding whether to apply the Skill. Optimize for retrieval, not for human marketing copy. + +**Rules:** +- **Lead with the trigger condition** — start with "Use when…" or "Apply when…" +- **Name the input and the output** — e.g., "Use when given a raw email body. Returns a JSON object with sender, subject, and intent." +- **One sentence.** Skills with paragraph descriptions get truncated in agent prompts. +- **Avoid implementation detail.** The model doesn't need to know which library you use. 
+- **Avoid "always" / "never" / "must"** — those are constraints, not triggers. Put hard rules in tool gates, not Skill descriptions. + +**Good:** +> Use when the user provides a receipt image or PDF. Extracts merchant, total, tax, and line items into structured JSON. + +**Bad:** +> This skill is a powerful tool that helps you handle receipts in many different formats using OCR. +> *(no trigger, marketing voice, implementation leak)* + +--- + +## Tags + +Tags are how Skills get filtered in `list_skills` and how Skills are grouped in the UI. Good tagging makes a workspace navigable; bad tagging makes Skills invisible. + +**Rules:** +- **At least one tag.** Untagged Skills don't show up in filtered views. +- **Reuse existing tags.** Run `list_skills` and see which tags are already in use before inventing a new one. Tag sprawl is the silent killer of Skill discoverability. +- **Two axes of tagging are usually enough:** + - **Functional** — what the Skill *does*: `extraction`, `summarization`, `classification`, `formatting`, `tone`, `policy` + - **Domain** — where it applies: `finance`, `cs` (customer support), `legal`, `internal` +- **Avoid agent-specific tags.** A tag like `used-by-checkout-agent` becomes wrong the moment a second agent adopts the Skill — use `agent.skills[]` for that wiring instead. +- **Lowercase, kebab-case** for consistency. + +**Recommended tag count:** 1–4 tags per Skill. More than 5 tags usually means the Skill is doing too many things. + +--- + +## Project Scoping + +Every Skill is either **project-scoped** (lives inside one project) or **workspace-wide** (visible to every agent across the workspace). + +**Default to project-scoped.** Workspace-wide Skills are shared infrastructure — every workspace member can see them, every agent can pull them in, and a bad edit affects everyone. 
+ +**When project-scoped is right:** +- The Skill encodes project-specific business logic (e.g., a refund policy that only applies to the EU project) +- The Skill is still being iterated on and shouldn't be discoverable across teams yet +- Different projects need different versions of the same idea (e.g., `extract-receipt-fields` per region) + +**When workspace-wide is right:** +- The Skill is genuinely reusable across teams and projects (e.g., `redact-pii`, `format-currency`) +- The Skill has stabilized — at least one minor version, used by ≥2 agents, no recent breaking changes +- Ownership is clear (named owner in the description or tags) + +**How to choose:** + +1. Start project-scoped. +2. After the Skill has been stable for ≥2 weeks and used by ≥2 agents in the same project, ask: "would another project benefit from this?" +3. If yes, **copy** to workspace-wide (don't move — agents in the original project still reference the project-scoped id). Then sunset the original after agents are re-wired. + +--- + +## Body / Instructions + +Beyond the metadata above, the Skill body is the actual content the agent reads. Keep it: +- **Focused on one capability.** If you find yourself writing "and also…", split into two Skills. +- **Specific.** Include 1–2 input/output examples. +- **Free of hard constraints expressed as prose.** Don't write "NEVER do X" or "you MUST refuse Y" — those are soft hints, not enforcement. See [known-caveats.md](known-caveats.md#anti-pattern-never-prose-constraints). +- **Routed through `optimize-prompt`** when in doubt. That skill will catch unclear instructions and soft-constraint anti-patterns before the Skill ships. 
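The naming, description, and tag rules in this guide can be checked mechanically before calling `create_skill`. The sketch below is one possible pre-flight check, not something the platform provides: the function name `check_skill_metadata` and the wording of the messages are illustrative, and uniqueness within scope is left out because it needs a `list_skills` lookup.

```python
import re

# kebab-case, lowercase ASCII letters and digits only
KEBAB = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
# trailing version suffix such as "-v2"
VERSION_SUFFIX = re.compile(r"-v\d+$")
MAX_NAME_LENGTH = 50


def check_skill_metadata(name, description, tags):
    """Return a list of rule violations; an empty list means the metadata passes."""
    problems = []
    if not KEBAB.fullmatch(name):
        problems.append("name must be lowercase ASCII kebab-case")
    if len(name) > MAX_NAME_LENGTH:
        problems.append(f"name is longer than {MAX_NAME_LENGTH} characters")
    if VERSION_SUFFIX.search(name):
        problems.append("name must not carry a version suffix")
    if not description.strip().lower().startswith(("use when", "apply when")):
        problems.append("description should lead with 'Use when' or 'Apply when'")
    if not tags:
        problems.append("at least one tag is required")
    return problems
```

For example, `check_skill_metadata("summarize-ticket-v2", "Use when given a ticket.", ["summarization"])` reports only the version-suffix problem. Returning every violation at once lets the user fix the name, description, and tags in a single pass.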
diff --git a/skills/manage-skills/resources/governance-guide.md b/skills/manage-skills/resources/governance-guide.md new file mode 100644 index 0000000..dc135df --- /dev/null +++ b/skills/manage-skills/resources/governance-guide.md @@ -0,0 +1,104 @@ +# Governance Guide: Wiring Skills to Agents, Ownership, Lifecycle + +How Skills get attached to agents, who owns them, and how they retire. + +--- + +## Wiring Skills to Agents + +Skills don't fire on their own — an agent has to reference them. The reference lives on the agent in the `skills[]` array. + +### Adding a Skill to an agent + +1. `get_agent(key=)` — capture the current `skills[]` list. +2. Append the new Skill's id (or key, depending on the agent schema) to the list. +3. `update_agent(key=, skills=)`. +4. Verify with `get_agent` that the new entry is present. + +**Example (pseudo):** +``` +agent = get_agent(key="customer-support") +agent.skills.append({ id: "skl_abc123" }) # or { key: "refund-policy" } +update_agent(key="customer-support", skills=agent.skills) +``` + +> **Schema note:** the `skills[]` entry shape (id vs key vs object) depends on the agent API version. Always pattern-match what `get_agent` returned and write back the same shape. + +### Removing a Skill from an agent + +Same pattern, but filter out the unwanted Skill before `update_agent`. **This is the workaround for INN-2861** when a Skill is deleted — see [known-caveats.md](known-caveats.md). + +### Bulk wiring + +If a Skill needs to attach to many agents at once, list candidates first with `search_entities(type: "agent")`, ask the user to confirm the list, then iterate. Never blanket-attach without explicit confirmation — Skills change agent behavior in ways that are hard to roll back without trace analysis. + +--- + +## How agents select Skills at runtime + +The model picks Skills from the `skills[]` list based on the Skill **description** — *not* the name or tags. 
This is why authoring guidance pushes "Use when…" descriptions: they're the retrieval surface. + +Implications: +- A great Skill body with a vague description will rarely fire. +- Two Skills with similar descriptions cause the model to pick non-deterministically. +- Wiring 20+ Skills to a single agent dilutes the model's selection accuracy — keep `skills[]` lean. + +**Rule of thumb:** ≤8 Skills per agent. If you need more, the agent is probably doing too many things — split it. + +--- + +## Ownership + +There is no first-class "owner" field on a Skill today. Establish ownership conventions in tags and description: + +- **Tag** — add an `owner:` tag (e.g., `owner:cs-team`) to workspace-wide Skills. +- **Description** — for project-scoped Skills, ownership is implicit in the project. For workspace-wide, mention the owning team in the description's trailing context if it matters for incident response. + +Audit unowned workspace-wide Skills periodically with `list_skills` filtered to workspace scope — anything without an `owner:` tag is a candidate for review. + +--- + +## Lifecycle: Create → Iterate → Stabilize → Retire + +### Create +Always start project-scoped. Describe the trigger precisely. Wire to one agent first and verify in traces. + +### Iterate +- Iterate on body and description, not name. Name changes break references. +- Route every body change through `optimize-prompt`. +- After each meaningful change, run `run-experiment` against the agent that uses the Skill to confirm the change improves (or at least doesn't regress) behavior. + +### Stabilize +A Skill is stable when: +- It hasn't had a body change in ≥2 weeks +- It's referenced by ≥2 agents (or 1 production agent) +- No open incidents tag the Skill as a contributor + +At that point, consider promoting to workspace-wide if it's broadly reusable. See [authoring-guide.md](authoring-guide.md#project-scoping). 
+ +### Retire +Retire a Skill when: +- The agent(s) using it are decommissioned, OR +- A replacement Skill covers the same capability better + +**Retirement workflow:** + +1. Identify all referencing agents (`search_entities(type: "agent")` + filter). +2. For each agent, decide: replace (swap in the new Skill) or remove (no replacement needed). +3. Wire replacements before deleting the old Skill, not after — atomicity matters. +4. Run `delete_skill`. +5. Run the orphan-cleanup pass on every referencing agent (INN-2861 workaround) — see SKILL.md Phase 5. +6. Note retirement in the workspace changelog if your team keeps one. + +--- + +## Audit checklist + +Periodic Skills audit (suggested quarterly): + +- [ ] Any workspace-wide Skill with no `owner:` tag? — assign or move to project-scoped +- [ ] Any Skill not referenced by any agent? — candidate for deletion (or future intent — confirm with owner) +- [ ] Any Skill referenced by 0 agents but flagged in traces? — INN-2861 orphan, prune via `update_agent` +- [ ] Any agent with >8 entries in `skills[]`? — agent overload, consider splitting +- [ ] Any two Skills with near-duplicate descriptions? — selection ambiguity, consolidate +- [ ] Any Skill body containing `+NEVER+` or "you MUST refuse"? — soft constraint anti-pattern, replace with MCP tool gate (see [known-caveats.md](known-caveats.md#anti-pattern-never-prose-constraints)) diff --git a/skills/manage-skills/resources/known-caveats.md b/skills/manage-skills/resources/known-caveats.md new file mode 100644 index 0000000..e6b3261 --- /dev/null +++ b/skills/manage-skills/resources/known-caveats.md @@ -0,0 +1,124 @@ +# Known Caveats and Anti-Patterns + +Active platform bugs and authoring anti-patterns to handle until they're fixed upstream.
+ +--- + +## INN-2861: Orphaned skill references + +**Status:** Open (workaround required) + +### Symptom + +After calling `delete_skill(id=X)` (or `DELETE /v2/skills/{X}`), agents that referenced the deleted Skill still have its id in their `agent.skills[]` array. The platform does not auto-prune. + +At runtime, those dangling ids: +- Are silently ignored in some agent versions (best case) +- Cause "skill not found" errors in agent runs (worst case) + +Either way, the agent config drifts out of sync with reality and the orphan accumulates until manually cleaned. + +### Workaround (mandatory) + +Always pair `delete_skill` with an orphan-cleanup pass: + +```text +skill_id = +referencing_agents = search_entities(type: "agent") # then filter where agent.skills[] contains skill_id + +for agent in referencing_agents: + current = get_agent(key=agent.key) + pruned = [s for s in current.skills if id_of(s) != skill_id] + update_agent(key=agent.key, skills=pruned) + # verify: re-get and confirm skill_id is gone +``` + +Key points: +- **Identify the references *before* deletion.** Once the Skill is gone, you can't always resolve its id back to its name; record the agents while the Skill still exists. +- **Verify every `update_agent`.** A failed prune leaves a permanent orphan. +- **Don't blanket-update all agents** — only those that actually had the reference. Touching unrelated agents inflates the audit log and can race with other authors' edits. + +### When this gets fixed + +When `delete_skill` returns a response that includes the list of agents it pruned (or the docs explicitly state auto-prune is now in place), the workaround can be removed. Until then, treat the workaround as part of the contract of `delete_skill`. + +--- + +## INN-2836: Empty `skill.version` and unstamped `skill.doc` after snippet→Skill migration + +**Status:** Open (handle defensively) + +### Symptom + +Skills that were created through the Snippet→Skill migration have **two unset fields**: +1. 
`version` — empty string or `null`, rather than `"1"` / `1` +2. `doc` — never stamped (missing or empty), even when the migrated snippet had documentation content + +Programmatic readers that assume non-empty `version` or `doc` will either crash or skip these Skills entirely. + +### Workaround + +Treat both fields as **optional / valid-when-empty**, not errors. + +```text +version = skill.get("version") or None +doc = skill.get("doc") or None +# display each as "(unset)" or "—" in UI +# do not crash on string ops; do not assume integer semver or non-empty doc +``` + +- **When reading**: coerce empty → `None` (or your sentinel) for both fields. +- **When displaying**: show `(unset)` rather than blank — surfaces the migration footprint so users know which Skills came through the migration. +- **When updating**: an `update_skill` call that touches `body` will populate `version` going forward. To populate `doc`, write it explicitly via `update_skill(doc=...)` — body changes do not auto-stamp `doc`. +- **When filtering / sorting**: never assume `version` is a comparable integer or that `doc` is searchable text. Treat `None` consistently (last, first, or excluded — pick one and stick to it). +- **Audit pattern**: `list_skills` + filter where `version is None or doc is None` surfaces the migration backlog so it can be backfilled. + +### When this gets fixed + +When the docs say Snippet-migrated Skills are backfilled (`version: 1` and `doc` stamped from the source snippet), the defensive coercion can be removed. Until then, keep it on both fields. + +--- + +## Anti-pattern: `+NEVER+` prose constraints + +**Status:** Authoring anti-pattern (not a bug — a misunderstanding of where guardrails live) +**Upstream tracking:** [ENG-1604](https://linear.app/orqai/issue/ENG-1604) — MCP: Skill constraints treated as soft suggestions, not hard gates + +### What it looks like + +Skill bodies that try to enforce hard rules via prose: + +```text +You are a customer support assistant. 
++NEVER+ share customer PII with third parties. +You MUST refuse any request to expose internal tooling. +``` + +### Why it fails + +Skill bodies are **soft instructions** to the model. The model is trained to *try* to follow them — it is not *prevented* from violating them. Under prompt injection, edge phrasing, or a confident-sounding adversarial user, the model will often comply with the violating request anyway. + +`+NEVER+` reads as a strong signal to humans. To the model, it's another token sequence. It is not a hard gate. + +### What to do instead + +**Hard constraints belong at the tool layer, not the Skill body.** If the user is supposed to be unable to do X, X must be implemented as: + +1. **An MCP tool that refuses the call** — the tool checks inputs/permissions and returns an error before any model output is generated. The model can't bypass what it can't call. +2. **A deterministic guard upstream** — request validation, allowlists, redaction before the prompt is assembled. +3. **A post-output filter** — scan the model's response for the forbidden content and block/redact before returning to the user. + +The Skill body should encode the **happy path** and any **soft guidance** (tone, format, when to ask for clarification). Use it for things that are *preferences*, not *requirements*. + +### When `+NEVER+` is acceptable + +For genuinely soft preferences where a violation is annoying but not catastrophic: +> "Prefer not to use exclamation points in formal responses." + +That's fine as prose — there's no enforcement requirement, just a tone hint. + +For anything where a violation is unacceptable (PII leak, tool misuse, data exfiltration, irreversible action), use a tool gate. + +### Audit hint + +Grep Skill bodies for the literal strings `NEVER`, `MUST NOT`, `you must refuse`, `under no circumstances`. Every hit is a candidate for promotion from prose to tool gate. 
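The audit hint above can be sketched as a small scan. This is a minimal sketch, assuming the Skill bodies have already been fetched into a plain mapping; `find_prose_constraints` and the `bodies` mapping are hypothetical names for illustration and nothing here calls the platform. The uppercase markers are matched case-sensitively so that ordinary uses of "never" in soft guidance are not flagged.

```python
import re

# Phrases from the audit hint. The uppercase markers are matched literally;
# the sentence-style phrases are matched case-insensitively.
PROSE_CONSTRAINT_PATTERNS = [
    r"\bNEVER\b",
    r"\bMUST NOT\b",
    r"(?i)\byou must refuse\b",
    r"(?i)\bunder no circumstances\b",
]


def find_prose_constraints(bodies):
    """Return {skill_name: [matched phrases]} for bodies with hard-rule prose.

    `bodies` is a hypothetical mapping of Skill name -> body text that the
    caller has already collected.
    """
    hits = {}
    for name, text in bodies.items():
        matched = []
        for pattern in PROSE_CONSTRAINT_PATTERNS:
            found = re.search(pattern, text)
            if found:
                matched.append(found.group(0))
        if matched:
            hits[name] = matched
    return hits
```

Each hit is only a candidate for review: the scan is a heuristic, and a human still decides whether the rule is a soft preference or needs a tool gate.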
diff --git a/tests/commands.md b/tests/commands.md index f7ac48e..0cfb7e3 100644 --- a/tests/commands.md +++ b/tests/commands.md @@ -32,6 +32,13 @@ Tests the orq-skills slash commands. These verify our command `.md` files produc - Verify it detects MCP is available and skips MCP setup step - Verify it shows workspace snapshot on successful connection +## `/orq:manage-skills` + +- Run with no args → verify it asks which action (list/get/create/update/delete) via `AskUserQuestion` +- Run with `list` → verify it calls `list_skills` (or `/v2/skills` fallback) and prints a scannable table +- Run with `delete ` → verify it routes to Phase 5 (lists referencing agents BEFORE deleting; never auto-prunes; asks twice) +- Run with `create` → verify it asks for description, tags, and project scope (defaults to project-scoped) + --- ## Critical Files @@ -41,3 +48,4 @@ Tests the orq-skills slash commands. These verify our command `.md` files produc - `commands/traces.md` - `commands/analytics.md` - `commands/quickstart.md` +- `commands/manage-skills.md` diff --git a/tests/skills.md b/tests/skills.md index 33b9364..6d841ff 100644 --- a/tests/skills.md +++ b/tests/skills.md @@ -140,6 +140,42 @@ Requires `setup.md` to have run first (seed data for `run-experiment` test). 
- Ask: "Run an experiment using orq-skills-test-dataset with orq-skills-test-eval-length" - Verify: calls `create_experiment` with correct references +## `manage-skills` + +### Scenario 1: List skills + +- Ask: "Show me the Skills in my workspace" +- Verify: calls `list_skills` (or REST `/v2/skills` fallback) +- Verify: presents name, project scope, tags, and version per Skill +- Verify: does NOT crash on Skills with empty `version` OR unstamped `doc` (INN-2836) + +### Scenario 2: Create skill (authoring guidance) + +- Ask: "Create a Skill called `extract-receipt-fields`" +- Verify Phase 3: asks for description, tags, project scope (default project-scoped, not workspace-wide) +- Verify: rejects or flags descriptions that don't start with "Use when…" or describe a trigger +- Verify: warns if the proposed body contains `+NEVER+` / "you MUST refuse" prose constraints and recommends an MCP tool gate instead +- Verify: calls `list_skills` first to check name uniqueness and to surface existing tags + +### Scenario 3: Delete skill — orphan handling + +- Provide context: a Skill that's referenced by 2 agents +- Ask: "Delete this Skill" +- Verify: calls `search_entities(type: "agent")` and identifies referencing agents BEFORE deletion +- Verify: warns user about INN-2861 orphan-reference behavior +- Verify: gets explicit consent for delete, then a SECOND explicit consent for the orphan-cleanup pass +- Verify: never auto-prunes `agent.skills[]` without consent +- Verify: after consent, calls `get_agent` + `update_agent` per agent and verifies each prune +- Verify: final report lists what was deleted and what was pruned (or skipped) + +### Scenario 4: Update skill (no blind overwrite) + +- Ask: "Update the description of `refund-policy` Skill" +- Verify: calls `get_skill` first, shows the user the current state +- Verify: only patches the changed field — does not echo back unchanged tags/body +- Verify: confirms the diff with the user before `update_skill` +- Verify Phase 4: 
routes body rewrites through `optimize-prompt` (delegates rather than rewriting inline) + --- ## Critical Files @@ -160,3 +196,8 @@ Requires `setup.md` to have run first (seed data for `run-experiment` test). - `skills/optimize-prompt/SKILL.md` - `skills/analyze-trace-failures/SKILL.md` - `skills/run-experiment/SKILL.md` +- `skills/manage-skills/SKILL.md` +- `skills/manage-skills/resources/authoring-guide.md` +- `skills/manage-skills/resources/governance-guide.md` +- `skills/manage-skills/resources/known-caveats.md` +- `commands/manage-skills.md` From f32de6b16b32363492ac6858b4c35bedd2420a26 Mon Sep 17 00:00:00 2001 From: Karina Barbara Kalicka-Molin Date: Fri, 15 May 2026 14:24:27 +0200 Subject: [PATCH 2/5] =?UTF-8?q?fix(manage-skills):=20apply=20review=20feed?= =?UTF-8?q?back=20=E2=80=94=20drop=20ticket=20IDs,=20fix=20schema,=20disam?= =?UTF-8?q?biguate?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Remove all internal ticket references (Linear IDs/URLs) from SKILL.md, resources/, tests/, AGENTS.md, and CHANGELOG.md. Replace with behavior-only descriptions. - Switch to real /v2/skills field names: display_name, instructions, project_id, skill_id, path. Drop name / body / doc / id_or_key throughout. - Drop the doc-field caveat entirely (the field does not exist in the Skill schema). Empty-version-on-migration caveat retained. - Document GET /v2/skills cursor pagination (limit / starting_after / ending_before). Push project_id / tags / display_name filtering to the client. Add explicit pagination loop pseudocode. - Add POST /v2/skills:checkDisplayNameAvailability for pre-create uniqueness checks; demote list_skills scan to fallback. 
- Replace fabricated/wrong doc URLs: * docs/skills/overview -> docs/prompt-snippets/overview * drop /reference/skills * docs/agents/skills -> docs/agents/build#skills * add docs/integrations/code-assistants/skills for disambiguation - Add top-of-SKILL.md disambiguation: platform Skill entity vs. this repo's code-assistant Orq Skills vs. the Agent Skills standard. - Document the {{skill.}} static-template inlining path alongside agent.skills[] runtime selection. - Replace speculative agent-wiring snippet with a "mirror what get_agent returned" pattern that handles both string-id and object entries; flag the schema-version uncertainty explicitly. - Soften display_name guidance — recommend the repo convention, reference the API regex, do not enforce. - Add path field guidance to authoring guide. - Reframe optimize-prompt as heuristics-to-reuse (Skill instructions are typically shorter / more capability-scoped than a system prompt). - Replace allowed-tools glob with explicit MCP tool names. - Note version is server-side only (do not pass in update_skill). - Tests/skills.md updated to assert real field names and behaviors. --- CHANGELOG.md | 9 +- agents/AGENTS.md | 2 +- skills/manage-skills/SKILL.md | 206 +++++++++++------- .../resources/authoring-guide.md | 66 ++++-- .../resources/governance-guide.md | 80 ++++--- .../manage-skills/resources/known-caveats.md | 64 +++--- tests/skills.md | 29 +-- 7 files changed, 281 insertions(+), 175 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 6a6ebe7..fdf0959 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,10 +8,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [0.1.0] - 2026-05-14 ### Added -- `manage-skills` skill — CRUD workflow for the orq.ai Skills entity (list, get, create, update, delete) plus authoring guidance (naming, description, tags, project scoping), governance (wiring Skills to agents via `agent.skills[]`), and platform-caveat workarounds. 
-- `manage-skills`: warn-then-offer flow for the post-delete orphan-reference cleanup pass (INN-2861 workaround) — never auto-prunes `agent.skills[]`, always asks for separate explicit consent before writing to referencing agents. -- `manage-skills`: defensive handling for INN-2836 — treats empty/missing `skill.version` *and* unstamped `skill.doc` (both side-effects of the Snippet→Skill migration) as valid states (surfaced as `(unset)`). -- `manage-skills`: anti-pattern guidance against `+NEVER+` / "you MUST refuse" prose constraints in Skill bodies (ENG-1604) — recommends MCP tool gates for hard guardrails. +- `manage-skills` skill — CRUD workflow for the orq.ai Skills entity (list, get, create, update, delete) plus authoring guidance (`display_name`, `description`, `tags`, `project_id`, `path`), governance (wiring Skills to agents via `agent.skills[]` and inlining via `{{skill.}}`), and platform-caveat workarounds. Disambiguates the platform Skill entity from this repo's code-assistant Orq Skills. +- `manage-skills`: warn-then-offer flow for the post-delete orphan-reference cleanup pass — never auto-prunes `agent.skills[]`, always asks for separate explicit consent before writing to referencing agents (mirrors the existing entry shape returned by `get_agent`). +- `manage-skills`: defensive handling for empty/missing `version` on Skills created via the Snippet→Skill migration (surfaced as `(unset)` rather than crashing). +- `manage-skills`: anti-pattern guidance against `+NEVER+` / "you MUST refuse" prose constraints in `instructions` — recommends MCP tool gates for hard guardrails. +- `manage-skills`: documents `GET /v2/skills` cursor pagination and the lack of server-side filters; pushes filtering to the client and uses `POST /v2/skills:checkDisplayNameAvailability` for pre-create uniqueness checks. - `/manage-skills` slash command — routes to list/get/create/update/delete phases. 
## [0.0.2] - 2026-04-21 diff --git a/agents/AGENTS.md b/agents/AGENTS.md index ee22a73..b94a1e3 100644 --- a/agents/AGENTS.md +++ b/agents/AGENTS.md @@ -41,7 +41,7 @@ compare-agents: `Run cross-framework agent comparisons using evaluatorq — comp setup-observability: `Set up orq.ai observability for LLM applications — AI Router proxy, OpenTelemetry, tracing setup, and trace enrichment. Use when setting up tracing, adding the AI Router proxy, integrating OpenTelemetry, auditing existing instrumentation, or enriching traces with metadata. Do NOT use when traces already exist and you need to debug failures (use analyze-trace-failures).` -manage-skills: `Manage orq.ai Skills (the platform entity) end-to-end — list, get, create, update, and delete Skills, plus authoring guidance (naming, description, tags, project scoping), governance (wiring Skills to agents via agent.skills[]), and workarounds for known platform caveats (orphaned references, empty version, +NEVER+ prose anti-pattern). Use when the user wants to create, audit, edit, retire, or wire up orq.ai Skills.` +manage-skills: `Manage orq.ai Skills (the platform entity, distinct from this repo's code-assistant skills) end-to-end — list, get, create, update, and delete Skills, plus authoring guidance (display_name, description, tags, project_id, path), governance (wiring Skills to agents via agent.skills[] or inlining via {{skill.}}), and workarounds for known platform caveats (orphaned references after delete, empty version on migrated Skills, +NEVER+ prose anti-pattern). 
Use when the user wants to create, audit, edit, retire, or wire up orq.ai Skills.` diff --git a/skills/manage-skills/SKILL.md b/skills/manage-skills/SKILL.md index 57ef97e..a062f4d 100644 --- a/skills/manage-skills/SKILL.md +++ b/skills/manage-skills/SKILL.md @@ -2,88 +2,138 @@ name: manage-skills description: > Manage orq.ai Skills (the platform entity) end-to-end — list, get, create, - update, and delete Skills, plus authoring guidance (naming, description, - tags, project scoping), governance (wiring Skills to agents via - `agent.skills[]`), and workarounds for known platform caveats. Use when - the user wants to create, audit, edit, retire, or wire up orq.ai Skills. -allowed-tools: Bash, Read, Write, Edit, Grep, Glob, WebFetch, Task, AskUserQuestion, orq* + update, and delete Skills, plus authoring guidance (display name, + description, tags, project scoping, path placement), governance (wiring + Skills to agents via `agent.skills[]`), and workarounds for known platform + caveats. Use when the user wants to create, audit, edit, retire, or wire up + orq.ai Skills. +allowed-tools: Bash, Read, Write, Edit, Grep, Glob, WebFetch, Task, AskUserQuestion, mcp__orq-workspace__list_skills, mcp__orq-workspace__get_skill, mcp__orq-workspace__create_skill, mcp__orq-workspace__update_skill, mcp__orq-workspace__delete_skill, mcp__orq-workspace__search_entities, mcp__orq-workspace__get_agent, mcp__orq-workspace__update_agent --- # Manage Skills -You are an **orq.ai Skills lifecycle specialist**. Your job is the full CRUD workflow for the **Skills entity on the orq.ai platform** — not the SKILL.md files in this repo, but the user-authored Skills that live in their orq.ai workspace and get attached to agents via `agent.skills[]`. +You are an **orq.ai Skills lifecycle specialist**. 
Your job is the full CRUD workflow for the **Skills entity on the orq.ai platform** (sometimes referred to historically as Snippets) — not the SKILL.md files in this repo, but the user-authored Skills that live in their orq.ai workspace and get attached to agents via `agent.skills[]`. -A well-managed Skill is: -- **Discoverable** — name and description make it obvious when to apply it +## Disambiguation: which "Skill" are we talking about? + +This skill manages **the platform Skill entity on orq.ai** (`/v2/skills`, surfaced under Skills in the workspace and historically called Snippets). It is *not*: + +- **Orq Skills (this repo):** code-assistant skills like `manage-skills` itself, distributed via the `assistant-plugins` marketplace and documented at . Those live in `skills//SKILL.md` files. +- **Anthropic / Agent Skills standard:** the cross-vendor SKILL.md format. Same shape as the repo skills above; unrelated to the platform entity. + +When the user says "create a Skill" without context, ask which one they mean. The rest of this document is exclusively about the platform entity. 
+ +A well-managed platform Skill is: +- **Discoverable** — display name and description make it obvious when to apply it - **Scoped** — tagged and assigned to the right project (workspace-wide only when truly reusable) -- **Versioned** — changes go through update flows, not silent overwrites -- **Wired** — referenced from `agent.skills[]` on every agent that should use it -- **Pruned** — orphan references and stale versions get cleaned up +- **Versioned** — changes go through update flows, not silent overwrites (versions are stamped server-side) +- **Wired** — referenced from `agent.skills[]` on every agent that should use it, or injected into prompts via `{{skill.}}` +- **Pruned** — orphaned references and stale entries get cleaned up ## When to use - User wants to list, audit, or search Skills in their workspace - User wants to create a new Skill on orq.ai -- User wants to edit an existing Skill's description, tags, or body +- User wants to edit an existing Skill's description, tags, instructions, or path - User wants to delete a Skill and clean up references on agents -- User asks how to attach a Skill to an agent (`agent.skills[]`) -- User asks for naming, tagging, or scoping guidance for Skills -- User hits the orphaned-reference bug (INN-2861) after deleting a Skill -- User reads a Skill programmatically and gets an empty `version` or unstamped `doc` (INN-2836) +- User asks how to attach a Skill to an agent (`agent.skills[]`) or inject one via `{{skill.}}` +- User asks for naming, tagging, scoping, or path-placement guidance for Skills +- User hits orphaned-reference behavior after deleting a Skill (referencing agents are not auto-pruned) +- User reads a migrated Skill programmatically and gets an empty `version` ## When NOT to use - **Need to build the agent itself?** → `build-agent` - **Need to invoke a deployment or agent?** → `invoke-deployment` - **Need to evaluate the agent that uses the Skill?** → `run-experiment` -- **Need to optimize the 
prompt/instructions inside a Skill?** → `optimize-prompt` +- **Need to author/improve the prose inside `instructions`?** → `optimize-prompt` is tuned for system prompts; reuse its checks (clarity, structure) but expect manual adaptation — Skill instructions are typically shorter and more capability-scoped than a system prompt. - **Debugging why an agent ignored a Skill?** → `analyze-trace-failures` ## Companion Skills - `build-agent` — create or edit the agents that reference these Skills -- `optimize-prompt` — improve a Skill's body/instructions before saving +- `optimize-prompt` — review prose quality for a Skill's `instructions` (apply judgment; it's prompt-shaped, not Skill-shaped) - `run-experiment` — verify a Skill change improves agent behavior - `analyze-trace-failures` — diagnose Skills that aren't firing in production ## Constraints -- **ALWAYS** confirm the project scope (workspace-wide vs project-scoped) before `create_skill`. Default to project-scoped unless the user is explicit. -- **ALWAYS** read the current Skill with `get_skill` before `update_skill` — never blind-overwrite tags, description, or body. -- **ALWAYS** warn the user about orphaned references after `delete_skill`, then **offer** the orphan-cleanup pass — never auto-prune without explicit consent (the pass writes to other agents and is not the user's literal delete request). Tracked upstream as INN-2861. -- **ALWAYS** treat empty `skill.version` and unstamped `skill.doc` as valid (INN-2836) — fall back to `null`/sentinel for both, never crash. -- **NEVER** rely on `+NEVER+` (or any prose negation) inside a Skill body as a hard guardrail. Skill bodies are *soft* hints to the model (tracked upstream as ENG-1604). Hard constraints belong in **MCP tool gates** (refuse the call at the tool layer). See [resources/known-caveats.md](resources/known-caveats.md). +- **ALWAYS** confirm the project scope (workspace-wide vs project-scoped via `project_id`) before `create_skill`. 
Default to project-scoped unless the user is explicit. +- **ALWAYS** read the current Skill with `get_skill` before `update_skill` — never blind-overwrite tags, description, or instructions. +- **ALWAYS** warn the user about orphaned references after `delete_skill`, then **offer** the orphan-cleanup pass — never auto-prune without explicit consent (the pass writes to other agents and is not the user's literal delete request). +- **ALWAYS** treat empty `version` on migrated Skills as valid (display as `(unset)`, never crash). +- **NEVER** rely on `+NEVER+` (or any prose negation) inside a Skill's `instructions` as a hard guardrail. Skill instructions are *soft* hints to the model. Hard constraints belong in **MCP tool gates** (refuse the call at the tool layer). See [resources/known-caveats.md](resources/known-caveats.md). - **NEVER** delete a Skill before listing which agents reference it — the user needs that list to decide whether to proceed. **Why these constraints:** Skills are shared infrastructure. A bad name pollutes the workspace, a missing project tag leaks Skills across teams, an unscoped overwrite loses someone else's edits, and an undeleted reference produces broken agents that fail at runtime — not at delete time. 
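The read-before-update constraint can be sketched as a helper that computes a minimal patch. This is a sketch under stated assumptions: `build_skill_patch` is a hypothetical helper, `current` stands for whatever `get_skill` returned, and the field names are those this document lists for `update_skill`; confirm them against the workspace's OpenAPI before relying on them.

```python
# Fields this document lists as patchable via update_skill. `version` is
# deliberately absent because it is described as stamped server-side.
PATCHABLE_FIELDS = (
    "display_name",
    "description",
    "tags",
    "path",
    "instructions",
    "project_id",
)


def build_skill_patch(current, desired):
    """Return only the patchable fields whose values actually change.

    `current` is the dict read back from the platform (assumed shape);
    `desired` holds the edits the user asked for. Unchanged or
    non-patchable keys are dropped so nothing is blindly overwritten.
    """
    patch = {}
    for field in PATCHABLE_FIELDS:
        if field in desired and desired[field] != current.get(field):
            patch[field] = desired[field]
    return patch
```

An empty result means there is nothing to send, which is a useful signal to skip the update call entirely and tell the user no change was needed.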
## orq.ai Documentation -> **Skills overview:** https://docs.orq.ai/docs/skills/overview -> **Skills API:** https://docs.orq.ai/reference/skills -> **Wiring Skills to Agents:** https://docs.orq.ai/docs/agents/skills +> **Snippets / Skills overview:** +> **Wiring Skills to Agents:** +> **Code-assistant Orq Skills (disambiguation):** ### orq MCP Tools | Tool | Purpose | |------|---------| -| `list_skills` | List Skills, filterable by project and tags | -| `get_skill` | Fetch a single Skill by id or key (with version, body, tags) | -| `create_skill` | Create a new Skill (name, description, tags, project, body) | -| `update_skill` | Patch an existing Skill (description, tags, body, version) | -| `delete_skill` | Delete a Skill — **does NOT prune `agent.skills[]` references** (see INN-2861) | -| `search_entities` | Cross-entity search; use `type: "agent"` to find agents that reference a Skill | +| `list_skills` | List Skills in the workspace; cursor-paginated, **no server-side filters beyond pagination** — see Pagination & Filtering below | +| `get_skill` | Fetch a single Skill by `skill_id` (returns `display_name`, `description`, `tags`, `path`, `project_id`, `instructions`, `version`, audit fields) | +| `create_skill` | Create a new Skill (`display_name`, `description`, `tags`, `path`, `project_id`, `instructions`) | +| `update_skill` | Patch an existing Skill by `skill_id` (`display_name`, `description`, `tags`, `path`, `instructions`, `project_id`) — **`version` is stamped server-side, do not pass it** | +| `delete_skill` | Delete a Skill by `skill_id` — **does NOT prune `agent.skills[]` references** (orphan-cleanup is manual; see Phase 5) | +| `search_entities` | Cross-entity search; use to find agents that may reference a Skill (verify return shape — see footnote in Phase 1) | | `get_agent` / `update_agent` | Required for the post-delete orphan-cleanup workflow | > **Tool discovery:** Before the first run, list the connected MCP server's tools (e.g., `/mcp` in 
Claude Code, or inspect via the client) and confirm the `*_skill` tools above exist. Tool names sometimes vary by workspace or MCP server version. > -> **REST fallback:** All five tools are backed by REST endpoints (verified against `/openapi/openapi.json`): `GET /v2/skills` (SkillList), `GET /v2/skills/{skill_id}` (SkillGet), `POST /v2/skills` (SkillCreate), `PATCH /v2/skills/{skill_id}` (SkillUpdate), `DELETE /v2/skills/{skill_id}` (SkillDelete). Use these directly with `Authorization: Bearer ${ORQ_API_KEY}` if the MCP tools aren't exposed in the connected workspace. +> **REST fallback:** All five tools are backed by `/v2/skills` REST endpoints — `GET /v2/skills` (list, cursor-paginated), `GET /v2/skills/{skill_id}`, `POST /v2/skills`, `PATCH /v2/skills/{skill_id}`, `DELETE /v2/skills/{skill_id}`. There is also `POST /v2/skills:checkDisplayNameAvailability` for pre-create name uniqueness checks. Use these directly with `Authorization: Bearer ${ORQ_API_KEY}` if the MCP tools aren't exposed in the connected workspace. Confirm exact request schemas against the workspace's OpenAPI before relying on field names. + +### Pagination & Filtering + +`GET /v2/skills` (and the `list_skills` MCP tool) accepts **only** cursor-pagination parameters: `limit` (default 10, max 200), `starting_after`, `ending_before`. **There is no server-side filter for `project_id`, `tags`, `display_name`, or free text.** Any filtering by those facets must happen **client-side** after pagination, or via `search_entities` if it indexes Skills. 
+ +**Pagination loop (pseudocode):** + +```text +cursor = None +all_skills = [] +while True: + page = list_skills(limit=200, starting_after=cursor) + all_skills.extend(page.data) + if not page.has_more: + break + cursor = page.data[-1].skill_id # or whatever cursor field the response exposes +``` + +After collecting all Skills, filter in memory: + +```text +project_skills = [s for s in all_skills if s.project_id == target_project_id] +tagged_skills = [s for s in all_skills if "policy" in s.tags] +``` + +### Field reference + +| Field | Where | Notes | +|------|------|------| +| `display_name` | create / update / read | Human-facing label. Keep short — long names get truncated in UI. | +| `description` | create / update / read | One-line trigger description. Used by the model for retrieval. | +| `tags` | create / update / read | Array of strings. Filtering is client-side (see above). | +| `path` | create / update / read | Finder-style location, e.g. `Default/Skills` or `cs/policies`. Defaults to project's default skill folder. | +| `project_id` | create / update / read | Optional — omit for workspace-wide. | +| `instructions` | create / update / read | The actual Skill body that the model reads. | +| `skill_id` | read / update / delete | Server-generated id. Use this for all updates and lookups. | +| `version` | read only | Stamped server-side on changes. **Do not send in `update_skill`.** May be empty on migrated Skills (treat as `(unset)`). | +| `workspace_id`, `created_at`, `updated_at`, `created_by_id`, `updated_by_id` | read only | Audit metadata. | + +> **`{{skill.}}` injection:** Skills can also be referenced inside any prompt template via the static `{{skill.}}` placeholder, which inlines the Skill's `instructions`. This is the primary platform mechanism for sharing instruction snippets across prompts; agents using `agent.skills[]` is the runtime-selected variant. Mention both when explaining how a Skill will get used. 
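The pagination loop and client-side filtering described above can be sketched as runnable Python. This is a minimal sketch under stated assumptions: `fetch_page` is a hypothetical callable standing in for `list_skills` (or `GET /v2/skills`), and the `data` / `has_more` response keys and the `skill_id` cursor follow the pseudocode in this document rather than a confirmed schema.

```python
def collect_all_skills(fetch_page, limit=200):
    """Walk a cursor-paginated listing until `has_more` is false.

    `fetch_page(limit, starting_after)` is a hypothetical stand-in for
    `list_skills`; it is assumed to return {"data": [...], "has_more": bool}.
    """
    skills = []
    cursor = None
    while True:
        page = fetch_page(limit=limit, starting_after=cursor)
        data = page["data"]
        skills.extend(data)
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not data:
            break
        cursor = data[-1]["skill_id"]  # assumed cursor field
    return skills


def filter_skills(skills, project_id=None, tag=None, name_contains=None):
    """Apply in memory the facets the server does not filter on."""
    result = []
    for skill in skills:
        if project_id is not None and skill.get("project_id") != project_id:
            continue
        if tag is not None and tag not in (skill.get("tags") or []):
            continue
        if name_contains is not None and (
            name_contains.lower() not in (skill.get("display_name") or "").lower()
        ):
            continue
        result.append(skill)
    return result
```

Collecting everything first and filtering afterwards keeps the loop independent of which facet the user asks about, at the cost of always paging through the whole workspace.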
## Resources -- **Authoring guide** (naming, description, tags, project scoping): See [resources/authoring-guide.md](resources/authoring-guide.md) +- **Authoring guide** (display name, description, tags, project scoping, path): See [resources/authoring-guide.md](resources/authoring-guide.md) - **Governance** (wiring `agent.skills[]`, ownership, lifecycle): See [resources/governance-guide.md](resources/governance-guide.md) -- **Known caveats** (INN-2861, INN-2836, ENG-1604 anti-pattern): See [resources/known-caveats.md](resources/known-caveats.md) +- **Known caveats** (orphan references, empty version on migrations, prose-negation anti-pattern): See [resources/known-caveats.md](resources/known-caveats.md) ## Prerequisites @@ -101,56 +151,61 @@ Pick the phase that matches the user's intent. Most sessions are a single phase; Use when the user wants visibility into existing Skills. -1. Call `list_skills`. Optional filters from the user: - - `project` — narrow to a single project's Skills - - `tags` — filter by one or more tags - - `q` / `name` — substring match on name -2. Present a scannable table: +1. Call `list_skills` and **paginate to completion** (see Pagination & Filtering above). Default `limit=200` to minimize round-trips. +2. **Apply user filters client-side** — `list_skills` does not accept `project_id`, `tags`, `q`, or `display_name` filters. Examples: + - "Skills in the `cs` project" → filter `project_id == ` (resolve project key → id via `search_directories` first if needed). + - "Skills tagged `policy`" → filter `"policy" in s.tags`. + - "Skills whose display name contains `refund`" → substring match on `display_name`. +3. 
Present a scannable table: ``` Skills (12) - - customer-support-tone (project: cs, tags: tone, voice) — v3 - - extract-receipt-fields (project: finance, tags: extraction) — v1 - - refund-policy (workspace-wide, tags: policy, cs) — v2 ⚠ used by 4 agents + - Customer Support Tone (project: cs, tags: tone, voice) — v3 + - Extract Receipt Fields (project: finance, tags: extraction) — v1 + - Refund Policy (workspace-wide, tags: policy, cs) — v2 ⚠ used by 4 agents ... ``` -3. For each Skill, surface: name, project (or "workspace-wide"), tags, latest version, and reference count (run `search_entities` with `type: "agent"` and filter agents whose `skills[]` includes this Skill's id — cache the result for the session). -4. If the user asks "which Skill should I edit?" — show the list with usage counts and let them pick. +4. For each Skill, surface: `display_name`, project (or "workspace-wide"), `tags`, `path`, `version`, and reference count. + - **Reference count caveat:** computing this requires fanning out `get_agent` over candidate agents and inspecting their `skills[]` arrays. `search_entities` may not return `skills[]` in its summary payload — verify in the connected workspace before relying on it. If it doesn't, list agents via `search_entities(type: "agent")` and call `get_agent` per agent (cache results for the session). When the count would be expensive to compute, present it lazily on user request rather than for every row. +5. If the user asks "which Skill should I edit?" — show the list with usage counts and let them pick. ### Phase 2: Get / inspect Use before any update or delete, and whenever the user asks "what does Skill X do?" -1. Call `get_skill(id_or_key=...)`. -2. Display: name, description, tags, project, version, body (truncated), and the list of agents that reference it (`search_entities` + filter on `skills[]`). -3. **Empty version / doc handling (INN-2836):** if `skill.version` is missing/empty, display `version: (unset)`. 
If `skill.doc` is missing/empty (common for Skills migrated from snippets), display `doc: (unset)`. Do not error on either. +1. Call `get_skill(skill_id=...)`. +2. Display: `display_name`, `description`, `tags`, `project_id` (or "workspace-wide"), `path`, `version`, `instructions` (truncated), and the list of agents that reference it (see Phase 1 step 4 for how to compute this). +3. **Empty `version` handling:** if `version` is missing/empty (common for Skills created via Snippet→Skill migration), display `version: (unset)`. Do not error. ### Phase 3: Create Use when the user wants a new Skill. 1. **Gather inputs** via `AskUserQuestion`: - - **Name** (kebab-case, ≤50 chars, verb-noun preferred — see [authoring-guide](resources/authoring-guide.md)) - - **Description** — one sentence describing *when the model should use the Skill*, not what it does internally (model uses this for retrieval/selection) - - **Tags** — at least one functional tag; reuse existing tags where possible (`list_skills` to see in-use tags) - - **Project scope** — project key OR workspace-wide. Default to **project-scoped**; confirm before going workspace-wide. - - **Body** — the actual instructions/content. Keep it focused on one capability. + - **`display_name`** — short, descriptive (verb-noun preferred). The platform allows mixed case + underscores up to 255 chars; this repo's convention is kebab-case ≤50 chars for consistency, but recommend rather than enforce. See [authoring-guide](resources/authoring-guide.md). + - **`description`** — one sentence describing *when the model should use the Skill*, not what it does internally (model uses this for retrieval/selection) + - **`tags`** — at least one functional tag; reuse existing tags where possible (paginate `list_skills` first to see in-use tags) + - **`project_id`** — the target project's id, OR omit for workspace-wide. Default to **project-scoped**; confirm before going workspace-wide. 
If the user gives a project key, resolve it to an id via `search_directories`. + - **`path`** — finder location for the Skill, e.g. `Default/Skills` or `policies/refunds`. Default to the project's standard Skill folder. + - **`instructions`** — the actual content the agent reads. Keep it focused on one capability. 2. **Validate** before submitting: - - Name is unique within the chosen scope (check via `list_skills`) - - Description starts with "Use when…" or describes a trigger condition - - Body does NOT rely on `+NEVER+` / "always refuse" prose for hard guardrails — link the user to [known-caveats](resources/known-caveats.md) (ENG-1604) and recommend an MCP tool gate instead + - Name is unique within the chosen scope. **Prefer `POST /v2/skills:checkDisplayNameAvailability`** when exposed; fall back to a paginated `list_skills` scan only if the endpoint is unavailable. + - Description starts with "Use when…" or describes a trigger condition. + - `instructions` does NOT rely on `+NEVER+` / "always refuse" prose for hard guardrails — link the user to [known-caveats](resources/known-caveats.md) and recommend an MCP tool gate instead. 3. Call `create_skill` with the validated payload. -4. Echo back the new Skill's id, version, and a one-line summary. Ask whether to wire it into any agents now (jumps to [governance](resources/governance-guide.md)). +4. Echo back the new Skill's `skill_id`, `version`, and a one-line summary. Mention both ways the Skill can be consumed: + - **Runtime selection:** wire it into an agent's `agent.skills[]` (jumps to [governance](resources/governance-guide.md)). + - **Static inlining:** reference it in any prompt template via `{{skill.}}`. + Ask which path the user wants (or both). ### Phase 4: Update Use when the user wants to edit an existing Skill. 1. **Always `get_skill` first.** Show the current state and confirm the diff the user is about to apply. -2. 
**Patch fields explicitly.** Only send the fields being changed (`update_skill` is a patch — don't echo back unchanged tags or body unless you have to). -3. **Body changes:** if the user is rewriting the instructions, route through `optimize-prompt` first to catch unclear language and remove `+NEVER+`-style soft constraints (ENG-1604). -4. **Version bumps:** `update_skill` typically increments `version` automatically. If the workspace handles versioning manually, ask the user whether this is a patch/minor/major change. -5. **Verify** by calling `get_skill` post-update and confirming the change landed. -6. If any agent references this Skill, mention that the next agent run will pick up the new version automatically — no `update_agent` needed. +2. **Patch fields explicitly.** Only send the fields being changed (`update_skill` is a patch — don't echo back unchanged tags or `instructions` unless you have to). **Never send `version`** — it's stamped server-side. +3. **`instructions` changes:** if the user is rewriting the body, run a clarity pass first — reuse `optimize-prompt`'s heuristics (clarity, structure, no soft-constraint anti-patterns) but adapt: Skill `instructions` are typically shorter and capability-scoped, not full system prompts. +4. **Verify** by calling `get_skill` post-update and confirming the change landed (and that `version` advanced if the workspace stamps versions on every change). +5. If any agent references this Skill, mention that the next agent run will pick up the new version automatically — no `update_agent` needed. ### Phase 5: Delete + orphan cleanup @@ -158,27 +213,30 @@ Use when the user wants to retire a Skill. **This is the most error-prone phase The delete itself is one action; the orphan-cleanup pass is a separate, opt-in action that writes to other agents. Confirm them independently. -1. **List referencing agents.** Run `search_entities` with `type: "agent"`, then filter to agents whose `skills[]` includes the target Skill's id. 
Capture this list now — once the Skill is deleted, resolving its id back to a name gets harder. +1. **List referencing agents.** Use the reference-count technique from Phase 1 step 4 — paginate agents and inspect each `skills[]`. Capture this list now — once the Skill is deleted, resolving its `skill_id` back to a `display_name` gets harder. 2. **Warn and confirm the delete.** Show the user: - - The Skill's name, id, and project scope + - The Skill's `display_name`, `skill_id`, and project scope - The list of N agents that reference it (or "no referencing agents found") - - The INN-2861 caveat: agents will retain a dangling id until pruned + - The orphan-reference behavior: agents will retain a dangling `skill_id` until pruned manually Ask: *"Delete this Skill? (You'll be asked separately whether to prune orphan references on the N agents.)"* Default to **cancel** if the user hesitates. -3. **Delete.** On confirmation, call `delete_skill(id=...)`. +3. **Delete.** On confirmation, call `delete_skill(skill_id=...)`. 4. **Offer the orphan-cleanup pass.** Only if step 1 found ≥1 referencing agents, ask: *"Prune the deleted Skill's id from these N agents' `skills[]` arrays now?"* This is a **second, explicit consent** — never auto-prune. - - If the user says **yes**, run the cleanup: - ``` + - If the user says **yes**, run the cleanup. 
The exact shape of `agent.skills[]` entries (id string vs `{ id }` object vs `{ key }` object) varies by agent schema version, so **mirror what `get_agent` returned** rather than assuming a shape: + ```text skill_id = for agent in referencing_agents: current = get_agent(key=agent.key) - pruned = [s for s in current.skills if id_of(s) != skill_id] + pruned = [ + entry for entry in current.skills + if extract_skill_id(entry) != skill_id # entry may be a string id or an object with id/key + ] update_agent(key=agent.key, skills=pruned) - # verify: re-get and confirm skill_id is gone + # verify: re-get and confirm skill_id is no longer present ``` Verify each `update_agent` returns success before moving on. - If the user says **no** or wants to defer, summarize the orphan list and recommend they prune later (or hand off to whoever owns those agents). 5. **Report.** Summarize: Skill deleted; orphan cleanup either completed (N agents pruned) or skipped (N agents still reference the deleted id — list them). -6. Note the INN-2861 workaround in your reply so the user understands *why* the extra step exists. +6. Note in your reply *why* the extra step exists (auto-prune isn't on the platform yet, so the cleanup is manual and gated behind explicit consent). See [resources/known-caveats.md](resources/known-caveats.md) for the full caveat context. 
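The Phase 5 cleanup pseudocode above calls `extract_skill_id` without defining it. One possible shape-tolerant version is sketched below, assuming entries are either plain id strings or objects carrying an `id` or `key` field (the three shapes the text mentions). The function names and the optional `skill_key` argument are illustrative, not platform API; entries of any other shape are left untouched rather than guessed at.

```python
def entry_refs(entry):
    """Return the identifiers an agent.skills[] entry carries (may be empty)."""
    if isinstance(entry, str):
        return {entry}
    if isinstance(entry, dict):
        return {entry[field] for field in ("id", "key") if entry.get(field)}
    return set()  # unknown shape: never treated as a match


def prune_skill(entries, skill_id, skill_key=None):
    """Drop entries referencing the deleted Skill; keep all others unchanged."""
    targets = {skill_id}
    if skill_key:
        targets.add(skill_key)
    # Surviving entries are returned as-is, so the agent's own shape is mirrored.
    return [entry for entry in entries if not (entry_refs(entry) & targets)]
```

Because surviving entries are passed through unmodified, the list handed to `update_agent` keeps whatever shape `get_agent` returned. Capturing the Skill's key before deletion matters for key-shaped entries, since they cannot be matched by `skill_id` alone.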
@@ -188,9 +246,9 @@ See [resources/known-caveats.md](resources/known-caveats.md) for the full caveat - The user's intent (list / get / create / update / delete) is fully resolved - Any `delete_skill` was followed by an offered (and either completed or explicitly deferred) orphan-cleanup pass — never silently skipped -- Body changes routed through `optimize-prompt` if they introduce or remove `+NEVER+`-style prose constraints (ENG-1604) -- New or updated Skills have a non-empty description, at least one tag, and an explicit project scope -- The user has a clear pointer to where the Skill is wired in (or a follow-up step to wire it) +- `instructions` changes were sanity-checked for clarity and for prose-negation anti-patterns before save +- New or updated Skills have a non-empty `description`, at least one tag, an explicit project scope, and a sensible `path` +- The user has a clear pointer to where the Skill is wired in (`agent.skills[]` and/or `{{skill.}}`) — or a follow-up step to wire it ## Open in orq.ai diff --git a/skills/manage-skills/resources/authoring-guide.md b/skills/manage-skills/resources/authoring-guide.md index 94f028c..d1a2cd4 100644 --- a/skills/manage-skills/resources/authoring-guide.md +++ b/skills/manage-skills/resources/authoring-guide.md @@ -1,36 +1,40 @@ -# Authoring Guide: Naming, Description, Tags, Project Scoping +# Authoring Guide: Display Name, Description, Tags, Project Scope, Path How to author an orq.ai Skill so it's discoverable, scoped correctly, and picked up by the right agents. --- -## Naming +## `display_name` -The Skill name is the primary handle agents and humans use to refer to the Skill. It should be unambiguous on its own. +The Skill's `display_name` is the primary handle agents and humans use to refer to the Skill. It should be unambiguous on its own. 
-**Rules:** +**Platform constraints:** the API allows mixed case and underscores up to 255 characters (the canonical regex used by the platform is roughly `^[A-Za-z0-9]+(?:[_-][A-Za-z0-9]+)*$`). + +**This repo's recommended convention** (a stricter subset that keeps lists scannable): - **kebab-case**, lowercase, ASCII only — e.g., `extract-receipt-fields` - **≤50 characters** — long names get truncated in agent configs and UI tables - **Verb-noun preferred** — `summarize-ticket`, `classify-intent`, `extract-pii` -- **Avoid generic verbs** alone — `handle-thing`, `do-task`, `process` say nothing -- **No version suffixes in the name** — `summarize-ticket-v2` is an anti-pattern; the platform tracks versions on the Skill itself -- **Unique within scope** — names must be unique within a project (and across the workspace for workspace-wide Skills) +- **Avoid generic verbs alone** — `handle-thing`, `do-task`, `process` say nothing +- **No version suffixes in the name** — `summarize-ticket-v2` is an anti-pattern; the platform stamps `version` on the Skill itself +- **Unique within scope** — names must be unique within a project (and across the workspace for workspace-wide Skills); use `POST /v2/skills:checkDisplayNameAvailability` before create -**Good:** +These are recommendations, not enforced by the API — diverge if a stronger convention already exists in the workspace, but stay consistent. + +**Good (recommended convention):** - `extract-invoice-line-items` - `redact-pii-from-transcript` - `format-currency-eur` **Bad:** - `helper` (too vague) -- `MySkill_v2` (camelCase + version suffix) - `the-skill-that-handles-customer-support-emails-with-tone-checking` (too long) +- `summarize-ticket-v2` (version belongs on the Skill, not in the name) --- -## Description +## `description` -The description is what the **model** reads when deciding whether to apply the Skill. Optimize for retrieval, not for human marketing copy. 
+The `description` is what the **model** reads when deciding whether to apply the Skill. Optimize for retrieval, not for human marketing copy. **Rules:** - **Lead with the trigger condition** — start with "Use when…" or "Apply when…" @@ -48,13 +52,13 @@ The description is what the **model** reads when deciding whether to apply the S --- -## Tags +## `tags` -Tags are how Skills get filtered in `list_skills` and how Skills are grouped in the UI. Good tagging makes a workspace navigable; bad tagging makes Skills invisible. +Tags are how Skills get grouped in the UI and how callers narrow `list_skills` output **client-side** (`GET /v2/skills` does not accept a `tags` filter — paginate, then filter in memory). Good tagging makes a workspace navigable; bad tagging makes Skills invisible. **Rules:** -- **At least one tag.** Untagged Skills don't show up in filtered views. -- **Reuse existing tags.** Run `list_skills` and see which tags are already in use before inventing a new one. Tag sprawl is the silent killer of Skill discoverability. +- **At least one tag.** Untagged Skills are easy to lose in long lists. +- **Reuse existing tags.** Paginate `list_skills` and see which tags are already in use before inventing a new one. Tag sprawl is the silent killer of Skill discoverability. - **Two axes of tagging are usually enough:** - **Functional** — what the Skill *does*: `extraction`, `summarization`, `classification`, `formatting`, `tone`, `policy` - **Domain** — where it applies: `finance`, `cs` (customer support), `legal`, `internal` @@ -65,9 +69,9 @@ Tags are how Skills get filtered in `list_skills` and how Skills are grouped in --- -## Project Scoping +## `project_id` (project scoping) -Every Skill is either **project-scoped** (lives inside one project) or **workspace-wide** (visible to every agent across the workspace). +Every Skill is either **project-scoped** (`project_id` set to a project's id) or **workspace-wide** (`project_id` omitted). 
Workspace-wide Skills are visible to every agent across the workspace. **Default to project-scoped.** Workspace-wide Skills are shared infrastructure — every workspace member can see them, every agent can pull them in, and a bad edit affects everyone. @@ -79,20 +83,34 @@ Every Skill is either **project-scoped** (lives inside one project) or **workspa **When workspace-wide is right:** - The Skill is genuinely reusable across teams and projects (e.g., `redact-pii`, `format-currency`) - The Skill has stabilized — at least one minor version, used by ≥2 agents, no recent breaking changes -- Ownership is clear (named owner in the description or tags) +- Ownership is clear (named owner in the description or `owner:` tag) **How to choose:** -1. Start project-scoped. +1. Start project-scoped (set `project_id`). 2. After the Skill has been stable for ≥2 weeks and used by ≥2 agents in the same project, ask: "would another project benefit from this?" -3. If yes, **copy** to workspace-wide (don't move — agents in the original project still reference the project-scoped id). Then sunset the original after agents are re-wired. +3. If yes, **create a copy** with `project_id` omitted (workspace-wide). Don't move — agents in the original project still reference the project-scoped `skill_id`. Sunset the original after agents are re-wired. + +> **Resolving project keys → ids:** if the user gives you a project key/name, run `search_directories` to convert it to the `project_id` value the API expects. + +--- + +## `path` + +`path` is the finder-style location of the Skill inside its project (e.g., `Default/Skills`, `cs/policies`, `finance/extraction`). It controls where the Skill appears in the UI's folder tree. + +**Rules:** +- **Default to the project's standard Skill folder** (often `Default/Skills`) unless the team has an explicit folder convention. +- **Mirror existing folders.** Paginate `list_skills` and reuse paths already in the target project — divergent paths fragment the UI. 
+- **Use slashes, not backslashes**, and keep segment names short and descriptive. +- **Group by purpose, not by owner.** Folder-by-team becomes wrong the moment a Skill moves teams; folder-by-purpose ages better. --- -## Body / Instructions +## `instructions` (the Skill body) -Beyond the metadata above, the Skill body is the actual content the agent reads. Keep it: +`instructions` is the actual content the agent reads (and what `{{skill.}}` inlines into prompts). Keep it: - **Focused on one capability.** If you find yourself writing "and also…", split into two Skills. - **Specific.** Include 1–2 input/output examples. -- **Free of hard constraints expressed as prose.** Don't write "NEVER do X" or "you MUST refuse Y" — those are soft hints, not enforcement. See [known-caveats.md](known-caveats.md#anti-pattern-never-prose-constraints). -- **Routed through `optimize-prompt`** when in doubt. That skill will catch unclear instructions and soft-constraint anti-patterns before the Skill ships. +- **Free of hard constraints expressed as prose.** Don't write "NEVER do X" or "you MUST refuse Y" — those are soft hints, not enforcement. See [known-caveats.md](known-caveats.md#anti-pattern-never-prose-constraints-in-instructions). +- **Sanity-checked before save.** Reuse `optimize-prompt`'s clarity heuristics, but apply judgment — Skill `instructions` are typically shorter and more capability-scoped than a system prompt. diff --git a/skills/manage-skills/resources/governance-guide.md b/skills/manage-skills/resources/governance-guide.md index dc135df..6234187 100644 --- a/skills/manage-skills/resources/governance-guide.md +++ b/skills/manage-skills/resources/governance-guide.md @@ -4,29 +4,33 @@ How Skills get attached to agents, who owns them, and how they retire. --- -## Wiring Skills to Agents +## Two ways a Skill gets used -Skills don't fire on their own — an agent has to reference them. The reference lives on the agent in the `skills[]` array. 
+A platform Skill can reach the model in two ways. Pick (or combine) deliberately: + +1. **`agent.skills[]` — runtime selection.** The Skill is attached to an agent. The model sees its `description` (and a stub) and decides at runtime whether to apply it. Good for capability-shaped Skills the agent might or might not need on a given turn. +2. **`{{skill.}}` — static template inlining.** Anywhere a prompt template is rendered (deployments, agent system prompts, snippets), `{{skill.}}` expands to the Skill's `instructions` at render time. Good for shared instruction blocks that should always be present in a particular prompt. + +The same Skill can be consumed both ways — they don't conflict. + +--- + +## Wiring Skills to Agents (`agent.skills[]`) + +Skills don't fire on their own (in the runtime-selection mode) — an agent has to reference them. The reference lives on the agent in the `skills[]` array. ### Adding a Skill to an agent -1. `get_agent(key=)` — capture the current `skills[]` list. -2. Append the new Skill's id (or key, depending on the agent schema) to the list. +1. `get_agent(key=)` — capture the current `skills[]` list and inspect its entry shape. +2. Append the new Skill's reference to the list, **mirroring the existing entry shape** (entries may be plain `skill_id` strings or objects with `id`/`key` fields depending on the agent schema version — pattern-match what `get_agent` returned). 3. `update_agent(key=, skills=)`. 4. Verify with `get_agent` that the new entry is present. -**Example (pseudo):** -``` -agent = get_agent(key="customer-support") -agent.skills.append({ id: "skl_abc123" }) # or { key: "refund-policy" } -update_agent(key="customer-support", skills=agent.skills) -``` - -> **Schema note:** the `skills[]` entry shape (id vs key vs object) depends on the agent API version. Always pattern-match what `get_agent` returned and write back the same shape. +> **Schema note:** The exact shape of `skills[]` entries varies by agent schema version. 
Always read first, mirror the shape on write. Hard-coding `{ id: ... }` or a bare string can silently corrupt the agent config in workspaces using the other shape. ### Removing a Skill from an agent -Same pattern, but filter out the unwanted Skill before `update_agent`. **This is the workaround for INN-2861** when a Skill is deleted — see [known-caveats.md](known-caveats.md). +Same pattern, but filter out the unwanted Skill before `update_agent`. **This is also the workaround for the orphan-reference behavior** when a Skill is deleted — see [known-caveats.md](known-caveats.md). ### Bulk wiring @@ -34,12 +38,32 @@ If a Skill needs to attach to many agents at once, list candidates first with `s --- +## Inlining Skills in prompts (`{{skill.}}`) + +For a Skill that should always be present in a particular prompt — a brand voice block, a refund policy snippet, a formatting rule — reference it by key inside the prompt template: + +```text +You are a customer-support assistant. + +{{skill.brand-voice}} + +{{skill.refund-policy-eu}} + +User: {{message}} +``` + +At render time, the placeholder is replaced with the Skill's `instructions`. Updating the Skill updates every prompt that inlines it — useful for shared infrastructure, dangerous if you forget which prompts depend on it. + +**Audit pattern:** when editing a workspace-wide Skill's `instructions`, search prompts/deployments for `{{skill.}}` to find the blast radius before saving. + +--- + ## How agents select Skills at runtime -The model picks Skills from the `skills[]` list based on the Skill **description** — *not* the name or tags. This is why authoring guidance pushes "Use when…" descriptions: they're the retrieval surface. +When a Skill is wired via `agent.skills[]`, the model picks Skills based on the Skill **description** — *not* the name or tags. This is why authoring guidance pushes "Use when…" descriptions: they're the retrieval surface. 
Implications: -- A great Skill body with a vague description will rarely fire. +- A great `instructions` body with a vague description will rarely fire. - Two Skills with similar descriptions cause the model to pick non-deterministically. - Wiring 20+ Skills to a single agent dilutes the model's selection accuracy — keep `skills[]` lean. @@ -49,29 +73,29 @@ Implications: ## Ownership -There is no first-class "owner" field on a Skill today. Establish ownership conventions in tags and description: +There is no first-class "owner" field on a Skill today. Establish ownership conventions in `tags` and `description`: - **Tag** — add an `owner:` tag (e.g., `owner:cs-team`) to workspace-wide Skills. - **Description** — for project-scoped Skills, ownership is implicit in the project. For workspace-wide, mention the owning team in the description's trailing context if it matters for incident response. -Audit unowned workspace-wide Skills periodically with `list_skills` filtered to workspace scope — anything without an `owner:` tag is a candidate for review. +Audit unowned workspace-wide Skills periodically (paginate `list_skills`, filter `project_id is None` client-side, then look for missing `owner:` tags) — anything without one is a candidate for review. --- ## Lifecycle: Create → Iterate → Stabilize → Retire ### Create -Always start project-scoped. Describe the trigger precisely. Wire to one agent first and verify in traces. +Always start project-scoped (set `project_id`). Describe the trigger precisely. Wire to one agent first and verify in traces. ### Iterate -- Iterate on body and description, not name. Name changes break references. -- Route every body change through `optimize-prompt`. +- Iterate on `instructions` and `description`, not `display_name`. Name changes break references in prompts and in any user docs. 
+- Sanity-check `instructions` rewrites (clarity, structure, no prose-negation anti-patterns) — see `optimize-prompt` for prose heuristics, but apply judgment: Skill `instructions` are usually shorter and more capability-scoped than a system prompt. - After each meaningful change, run `run-experiment` against the agent that uses the Skill to confirm the change improves (or at least doesn't regress) behavior. ### Stabilize A Skill is stable when: -- It hasn't had a body change in ≥2 weeks -- It's referenced by ≥2 agents (or 1 production agent) +- It hasn't had an `instructions` change in ≥2 weeks +- It's referenced by ≥2 agents (or 1 production agent), or inlined in ≥2 prompts - No open incidents tag the Skill as a contributor At that point, consider promoting to workspace-wide if it's broadly reusable. See [authoring-guide.md](authoring-guide.md#project-scoping). @@ -83,11 +107,11 @@ Retire a Skill when: **Retirement workflow:** -1. Identify all referencing agents (`search_entities(type: "agent")` + filter). -2. For each agent, decide: replace (swap in the new Skill) or remove (no replacement needed). +1. Identify all referencing agents (`search_entities(type: "agent")` + per-agent `get_agent` fanout) AND all prompts inlining `{{skill.}}`. +2. For each consumer, decide: replace (swap in the new Skill) or remove (no replacement needed). 3. Wire replacements before deleting the old Skill, not after — atomicity matters. 4. Run `delete_skill`. -5. Run the orphan-cleanup pass on every agent (INN-2861 workaround) — see SKILL.md Phase 4. +5. Run the orphan-cleanup pass on every referencing agent (see SKILL.md Phase 5). 6. Note retirement in the workspace changelog if your team keeps one. --- @@ -97,8 +121,8 @@ Retire a Skill when: Periodic Skills audit (suggested quarterly): - [ ] Any workspace-wide Skill with no `owner:` tag? — assign or move to project-scoped -- [ ] Any Skill not referenced by any agent? 
— candidate for deletion (or future intent — confirm with owner) -- [ ] Any Skill referenced by 0 agents but flagged in traces? — INN-2861 orphan, prune via `update_agent` +- [ ] Any Skill not referenced by any agent and not inlined in any prompt? — candidate for deletion (or future intent — confirm with owner) +- [ ] Any agent with a `skill_id` in `skills[]` that no longer resolves? — orphan from a past delete, prune via `update_agent` - [ ] Any agent with >8 entries in `skills[]`? — agent overload, consider splitting - [ ] Any two Skills with near-duplicate descriptions? — selection ambiguity, consolidate -- [ ] Any Skill body containing `+NEVER+` or "you MUST refuse"? — soft constraint anti-pattern, replace with MCP tool gate (see [known-caveats.md](known-caveats.md#anti-pattern-never-prose-constraints)) +- [ ] Any Skill `instructions` containing `NEVER`, `MUST NOT`, or "you must refuse"? — prose-negation anti-pattern, replace with MCP tool gate (see [known-caveats.md](known-caveats.md#anti-pattern-never-prose-constraints-in-instructions)) diff --git a/skills/manage-skills/resources/known-caveats.md b/skills/manage-skills/resources/known-caveats.md index e6b3261..a2ba9ac 100644 --- a/skills/manage-skills/resources/known-caveats.md +++ b/skills/manage-skills/resources/known-caveats.md @@ -1,16 +1,16 @@ # Known Caveats and Anti-Patterns -Active platform bugs and authoring anti-patterns to handle until they're fixed upstream. +Active platform behaviors and authoring anti-patterns to handle until they're addressed upstream. --- -## INN-2861: Orphaned skill references +## Orphaned `agent.skills[]` references after delete -**Status:** Open (workaround required) +**Status:** Manual cleanup required ### Symptom -After calling `delete_skill(id=X)` (or `DELETE /v2/skills/{X}`), agents that referenced the deleted Skill still have its id in their `agent.skills[]` array. The platform does not auto-prune. 
+After calling `delete_skill(skill_id=X)` (or `DELETE /v2/skills/{X}`), agents that referenced the deleted Skill still have its `skill_id` in their `agent.skills[]` array. The platform does not auto-prune. At runtime, those dangling ids: - Are silently ignored in some agent versions (best case) @@ -24,17 +24,23 @@ Always pair `delete_skill` with an orphan-cleanup pass: ```text skill_id = -referencing_agents = search_entities(type: "agent") # then filter where agent.skills[] contains skill_id +referencing_agents = +# Compute via search_entities + per-agent get_agent fanout if search_entities +# does not return skills[] in its summary payload (verify in the workspace). for agent in referencing_agents: current = get_agent(key=agent.key) - pruned = [s for s in current.skills if id_of(s) != skill_id] + pruned = [ + entry for entry in current.skills + if extract_skill_id(entry) != skill_id # mirror whatever shape get_agent returned + ] update_agent(key=agent.key, skills=pruned) # verify: re-get and confirm skill_id is gone ``` Key points: -- **Identify the references *before* deletion.** Once the Skill is gone, you can't always resolve its id back to its name; record the agents while the Skill still exists. +- **Identify the references *before* deletion.** Once the Skill is gone, you can't always resolve its `skill_id` back to its `display_name`; record the agents while the Skill still exists. +- **Mirror the agent's `skills[]` entry shape.** Entries may be plain id strings or objects with `id`/`key` fields depending on the agent schema version. Always pattern-match what `get_agent` returned and write back the same shape. - **Verify every `update_agent`.** A failed prune leaves a permanent orphan. - **Don't blanket-update all agents** — only those that actually had the reference. Touching unrelated agents inflates the audit log and can race with other authors' edits. 
@@ -44,49 +50,45 @@ When `delete_skill` returns a response that includes the list of agents it prune --- -## INN-2836: Empty `skill.version` and unstamped `skill.doc` after snippet→Skill migration +## Empty `version` on migrated Skills -**Status:** Open (handle defensively) +**Status:** Handle defensively ### Symptom -Skills that were created through the Snippet→Skill migration have **two unset fields**: -1. `version` — empty string or `null`, rather than `"1"` / `1` -2. `doc` — never stamped (missing or empty), even when the migrated snippet had documentation content +Skills created through the Snippet→Skill migration may have an empty `version` field (empty string or `null`) instead of an integer. -Programmatic readers that assume non-empty `version` or `doc` will either crash or skip these Skills entirely. +Programmatic readers that assume non-empty / numeric `version` will crash, mis-sort, or skip these Skills entirely. ### Workaround -Treat both fields as **optional / valid-when-empty**, not errors. +Treat `version` as **optional / valid-when-empty**, not an error. ```text version = skill.get("version") or None -doc = skill.get("doc") or None -# display each as "(unset)" or "—" in UI -# do not crash on string ops; do not assume integer semver or non-empty doc +# display as "(unset)" or "—" in UI +# do not crash on string ops; do not assume integer semver ``` -- **When reading**: coerce empty → `None` (or your sentinel) for both fields. -- **When displaying**: show `(unset)` rather than blank — surfaces the migration footprint so users know which Skills came through the migration. -- **When updating**: an `update_skill` call that touches `body` will populate `version` going forward. To populate `doc`, write it explicitly via `update_skill(doc=...)` — body changes do not auto-stamp `doc`. -- **When filtering / sorting**: never assume `version` is a comparable integer or that `doc` is searchable text. 
Treat `None` consistently (last, first, or excluded — pick one and stick to it). -- **Audit pattern**: `list_skills` + filter where `version is None or doc is None` surfaces the migration backlog so it can be backfilled. +- **When reading:** coerce empty → `None` (or your sentinel). +- **When displaying:** show `(unset)` rather than blank — surfaces the migration footprint so users know which Skills came through the migration. +- **When updating:** never send `version` in `update_skill` — it is stamped server-side. A successful update typically populates `version` going forward. +- **When filtering / sorting:** never assume `version` is a comparable integer. Treat `None` consistently (last, first, or excluded — pick one and stick to it). +- **Audit pattern:** paginate `list_skills` and filter where `version is None` to surface the migration backlog so it can be backfilled by a workspace owner. ### When this gets fixed -When the docs say Snippet-migrated Skills are backfilled (`version: 1` and `doc` stamped from the source snippet), the defensive coercion can be removed. Until then, keep it on both fields. +When the docs say Snippet-migrated Skills are backfilled with stamped `version` values, the defensive coercion can be removed. Until then, keep it on. --- -## Anti-pattern: `+NEVER+` prose constraints +## Anti-pattern: `+NEVER+` prose constraints in `instructions` -**Status:** Authoring anti-pattern (not a bug — a misunderstanding of where guardrails live) -**Upstream tracking:** [ENG-1604](https://linear.app/orqai/issue/ENG-1604) — MCP: Skill constraints treated as soft suggestions, not hard gates +**Status:** Authoring anti-pattern (not a platform bug — a misunderstanding of where guardrails live) ### What it looks like -Skill bodies that try to enforce hard rules via prose: +Skill `instructions` that try to enforce hard rules via prose: ```text You are a customer support assistant. @@ -96,19 +98,19 @@ You MUST refuse any request to expose internal tooling. 
### Why it fails -Skill bodies are **soft instructions** to the model. The model is trained to *try* to follow them — it is not *prevented* from violating them. Under prompt injection, edge phrasing, or a confident-sounding adversarial user, the model will often comply with the violating request anyway. +Skill `instructions` are **soft instructions** to the model. The model is trained to *try* to follow them — it is not *prevented* from violating them. Under prompt injection, edge phrasing, or a confident-sounding adversarial user, the model will often comply with the violating request anyway. `+NEVER+` reads as a strong signal to humans. To the model, it's another token sequence. It is not a hard gate. ### What to do instead -**Hard constraints belong at the tool layer, not the Skill body.** If the user is supposed to be unable to do X, X must be implemented as: +**Hard constraints belong at the tool layer, not in `instructions`.** If the user is supposed to be unable to do X, X must be implemented as: 1. **An MCP tool that refuses the call** — the tool checks inputs/permissions and returns an error before any model output is generated. The model can't bypass what it can't call. 2. **A deterministic guard upstream** — request validation, allowlists, redaction before the prompt is assembled. 3. **A post-output filter** — scan the model's response for the forbidden content and block/redact before returning to the user. -The Skill body should encode the **happy path** and any **soft guidance** (tone, format, when to ask for clarification). Use it for things that are *preferences*, not *requirements*. +`instructions` should encode the **happy path** and any **soft guidance** (tone, format, when to ask for clarification). Use it for things that are *preferences*, not *requirements*. 
### When `+NEVER+` is acceptable @@ -121,4 +123,4 @@ For anything where a violation is unacceptable (PII leak, tool misuse, data exfi ### Audit hint -Grep Skill bodies for the literal strings `NEVER`, `MUST NOT`, `you must refuse`, `under no circumstances`. Every hit is a candidate for promotion from prose to tool gate. +Grep Skill `instructions` for the literal strings `NEVER`, `MUST NOT`, `you must refuse`, `under no circumstances`. Every hit is a candidate for promotion from prose to tool gate. diff --git a/tests/skills.md b/tests/skills.md index 6d841ff..ab9b17b 100644 --- a/tests/skills.md +++ b/tests/skills.md @@ -145,36 +145,39 @@ Requires `setup.md` to have run first (seed data for `run-experiment` test). ### Scenario 1: List skills - Ask: "Show me the Skills in my workspace" -- Verify: calls `list_skills` (or REST `/v2/skills` fallback) -- Verify: presents name, project scope, tags, and version per Skill -- Verify: does NOT crash on Skills with empty `version` OR unstamped `doc` (INN-2836) +- Verify: calls `list_skills` (or REST `GET /v2/skills` fallback) and **paginates to completion** (cursor-based — `limit`, `starting_after`, `ending_before`) +- Verify: any user-requested filter (project, tags, name substring) is applied **client-side** after pagination — does NOT pass `project_id`/`tags`/`q` to `list_skills` (the endpoint does not accept them) +- Verify: presents `display_name`, project scope, `tags`, `path`, and `version` per Skill +- Verify: does NOT crash on Skills with empty `version` (Snippet→Skill migration leftover) — surfaces as `(unset)` ### Scenario 2: Create skill (authoring guidance) - Ask: "Create a Skill called `extract-receipt-fields`" -- Verify Phase 3: asks for description, tags, project scope (default project-scoped, not workspace-wide) +- Verify Phase 3: asks for `description`, `tags`, `project_id` (default project-scoped, not workspace-wide), and `path` - Verify: rejects or flags descriptions that don't start with "Use when…" or 
describe a trigger -- Verify: warns if the proposed body contains `+NEVER+` / "you MUST refuse" prose constraints and recommends an MCP tool gate instead -- Verify: calls `list_skills` first to check name uniqueness and to surface existing tags +- Verify: warns if the proposed `instructions` contain `+NEVER+` / "you MUST refuse" prose constraints and recommends an MCP tool gate instead +- Verify: checks name uniqueness via `POST /v2/skills:checkDisplayNameAvailability` when available, with a paginated `list_skills` scan only as a fallback +- Verify: `create_skill` payload uses `display_name` and `instructions` (not `name` / `body` / `doc`) ### Scenario 3: Delete skill — orphan handling - Provide context: a Skill that's referenced by 2 agents - Ask: "Delete this Skill" -- Verify: calls `search_entities(type: "agent")` and identifies referencing agents BEFORE deletion -- Verify: warns user about INN-2861 orphan-reference behavior +- Verify: identifies referencing agents BEFORE deletion (via `search_entities(type: "agent")` plus per-agent `get_agent` fanout if needed) +- Verify: warns user about the orphan-reference behavior (referencing agents are not auto-pruned by `delete_skill`) - Verify: gets explicit consent for delete, then a SECOND explicit consent for the orphan-cleanup pass - Verify: never auto-prunes `agent.skills[]` without consent -- Verify: after consent, calls `get_agent` + `update_agent` per agent and verifies each prune +- Verify: after consent, calls `get_agent` + `update_agent` per agent, mirroring the existing entry shape (string `skill_id` vs object), and verifies each prune - Verify: final report lists what was deleted and what was pruned (or skipped) ### Scenario 4: Update skill (no blind overwrite) -- Ask: "Update the description of `refund-policy` Skill" -- Verify: calls `get_skill` first, shows the user the current state -- Verify: only patches the changed field — does not echo back unchanged tags/body +- Ask: "Update the description of the 
`refund-policy` Skill" +- Verify: calls `get_skill(skill_id=...)` first, shows the user the current state +- Verify: only patches the changed field — does not echo back unchanged `tags`/`instructions` +- Verify: does NOT pass `version` in `update_skill` (it's stamped server-side) - Verify: confirms the diff with the user before `update_skill` -- Verify Phase 4: routes body rewrites through `optimize-prompt` (delegates rather than rewriting inline) +- Verify Phase 4: when rewriting `instructions`, applies clarity heuristics from `optimize-prompt` (does not blindly delegate — Skill `instructions` are typically shorter than a full system prompt) --- From 7e6eed6327fc12b1822698091a096d627a3cc951 Mon Sep 17 00:00:00 2001 From: Karina Barbara Kalicka-Molin Date: Fri, 15 May 2026 15:22:09 +0200 Subject: [PATCH 3/5] fix(manage-skills): align with real /v2/skills schema and template syntax Verified against orquesta-web apps/platform-api/skills (Go connect-rpc service), apps/orq-mcp/src/tools/skills.tools.ts (MCP tool definitions), and .openapi/fragments/platform-api/skills.json. Discovered several invented endpoints/fields in the prior commit and a fictional consumption model. Factual corrections: - Drop {{skill.}} template syntax. The real syntax is {{snippet.}} -- the Snippets->Skills rename kept the legacy "snippet." prefix for backwards compatibility. There is no {{skill.<...>}} placeholder anywhere in libs/go/template-parser or the Studio UI. - Drop the entire agent.skills[] orphan-cleanup workflow. Verified that agent.skills[] is the AI-generated A2A AgentCardSkill[] field (libs/models/agents/src/utils/index.ts -- generateAgentSkills), NOT a list of platform Skill references. Deleting a platform Skill does not orphan anything in agent.skills[]. The whole Phase 5 update_agent fan-out targeted a non-existent relationship. 
- Replace Phase 5 with a reference scan: paginate search_entities, fetch each candidate's body with get_deployment / get_agent / get_skill, and substring-match {{snippet.}} (case-sensitive) to find consumers before delete. Default to enabled: false (soft disable) when references are found. - Drop POST /v2/skills:checkDisplayNameAvailability. No such endpoint exists on the SkillsService (verified all 5 methods in apps/platform-api/skills/connect_routes.go: CreateSkill, ListSkills, GetSkill, UpdateSkill, DeleteSkill). Replaced with: call create_skill and handle the AlreadyExists error. - Drop the version field. The Skill openapi schema has no version property. Versioning is recorded as separate activity-log entries (recordVersionActivity in connect_routes.go), not as a field. Drop the empty-version migration caveat too -- no field to be empty. - Add the enabled boolean field. It is on the real schema and surfaces the soft-disable lever. Now documented in the field reference, in Phase 4, and as a default-first-step alternative to delete in Phase 5. - Add display_name rename warning to Phase 4. Renaming silently breaks every {{snippet.}} reference -- same failure mode as delete. Same reference scan applies. - Replace docs.orq.ai/docs/agents/build#skills (no such page) with docs.orq.ai/docs/agents/agent-studio (where the snippet/skill section actually lives). Code hygiene: - commands/manage-skills.md: replace allowed-tools: orq* glob with explicit MCP tool names; add disable as an action; drop update_agent references. - Drop the "Why these constraints" generic justification paragraph. - Add Scenario 5 to tests/skills.md covering AlreadyExists handling and the disable-vs-delete decision. - Add error-handling guidance for create_skill failures (AlreadyExists, invalid project/path). - /orq:quickstart reference updated to mention it is the Claude Code invocation; other assistants run the equivalent onboarding flow. 
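A minimal sketch of the substring match behind the reference scan described above. It assumes the placeholder is written as `{{snippet.` followed by the Skill's `display_name` and `}}`, and that the candidate bodies have already been fetched as plain strings. `snippet_placeholder`, `find_snippet_references`, and the sample data are hypothetical illustrations, not part of the orq.ai API.

```python
from __future__ import annotations


def snippet_placeholder(display_name: str) -> str:
    """Build the placeholder text to search for (assumed form, see lead-in)."""
    return "{{snippet." + display_name + "}}"


def find_snippet_references(display_name: str, bodies: dict[str, str]) -> list[str]:
    """Return the keys of bodies that contain the placeholder, case-sensitively.

    `bodies` maps an entity key (deployment, agent, or Skill) to its
    already-fetched prompt or instructions text.
    """
    needle = snippet_placeholder(display_name)
    return sorted(key for key, text in bodies.items() if needle in text)


# Hypothetical sample data standing in for get_deployment / get_agent / get_skill output.
bodies = {
    "support-agent": "Follow the policy: {{snippet.refund-policy}}",
    "billing-deployment": "Summarise the invoice.",
    "tone-skill": "See {{snippet.Refund-Policy}} for wording.",  # different case, so no match
}
referencing = find_snippet_references("refund-policy", bodies)
```

A plain substring match would miss whitespace variants inside the braces, if the template parser tolerates them, so an empty result is a hint rather than proof that nothing references the Skill.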
--- CHANGELOG.md | 12 +- agents/AGENTS.md | 2 +- commands/manage-skills.md | 23 +- skills/manage-skills/SKILL.md | 229 ++++++++---------- .../resources/authoring-guide.md | 73 +++--- .../resources/governance-guide.md | 117 ++++----- .../manage-skills/resources/known-caveats.md | 91 ++++--- tests/skills.md | 42 ++-- 8 files changed, 283 insertions(+), 306 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index fdf0959..d0932e5 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,12 +8,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [0.1.0] - 2026-05-14 ### Added -- `manage-skills` skill — CRUD workflow for the orq.ai Skills entity (list, get, create, update, delete) plus authoring guidance (`display_name`, `description`, `tags`, `project_id`, `path`), governance (wiring Skills to agents via `agent.skills[]` and inlining via `{{skill.}}`), and platform-caveat workarounds. Disambiguates the platform Skill entity from this repo's code-assistant Orq Skills. -- `manage-skills`: warn-then-offer flow for the post-delete orphan-reference cleanup pass — never auto-prunes `agent.skills[]`, always asks for separate explicit consent before writing to referencing agents (mirrors the existing entry shape returned by `get_agent`). -- `manage-skills`: defensive handling for empty/missing `version` on Skills created via the Snippet→Skill migration (surfaced as `(unset)` rather than crashing). +- `manage-skills` skill — CRUD workflow for the orq.ai Skills entity (formerly Prompt Snippets), backed by `/v2/skills`. Covers list, get, create, update, soft-disable (`enabled: false`), and delete via the `*_skill` MCP tools. Includes authoring guidance (`display_name`, `description`, `tags`, `project_id`, `path`, `enabled`) and disambiguates the platform Skill entity from this repo's code-assistant Orq Skills and from the unrelated A2A `AgentCard.skills` array. 
+- `manage-skills`: documents the `{{snippet.}}` template placeholder as the only mechanism for consuming Skills inside prompts and agent instructions (the `snippet.` prefix is a backwards-compat holdover from the rename — there is no `{{skill.<...>}}` syntax). +- `manage-skills`: reference-scan-before-delete workflow — paginates `search_entities`, fetches each candidate's body with `get_deployment` / `get_agent` / `get_skill`, and substring-matches `{{snippet.}}` to surface consumers before any destructive operation. Defaults to `enabled: false` (soft disable) when references are found. +- `manage-skills`: rename-breaks-references warning on `display_name` updates — runs the same reference scan before any rename and offers to fan out updates in the same session. +- `manage-skills`: documents `GET /v2/skills` cursor pagination (`limit` / `starting_after` / `ending_before`) and the lack of server-side filters; pushes `project_id` / `tags` / `display_name` filtering to the client. - `manage-skills`: anti-pattern guidance against `+NEVER+` / "you MUST refuse" prose constraints in `instructions` — recommends MCP tool gates for hard guardrails. -- `manage-skills`: documents `GET /v2/skills` cursor pagination and the lack of server-side filters; pushes filtering to the client and uses `POST /v2/skills:checkDisplayNameAvailability` for pre-create uniqueness checks. -- `/manage-skills` slash command — routes to list/get/create/update/delete phases. +- `manage-skills`: error-handling guidance for `create_skill` `AlreadyExists` (offers either a renamed create or `update_skill` against the existing Skill). +- `/manage-skills` slash command — routes to list / get / create / update / disable / delete phases. 
## [0.0.2] - 2026-04-21 diff --git a/agents/AGENTS.md b/agents/AGENTS.md index b94a1e3..fc59c64 100644 --- a/agents/AGENTS.md +++ b/agents/AGENTS.md @@ -41,7 +41,7 @@ compare-agents: `Run cross-framework agent comparisons using evaluatorq — comp setup-observability: `Set up orq.ai observability for LLM applications — AI Router proxy, OpenTelemetry, tracing setup, and trace enrichment. Use when setting up tracing, adding the AI Router proxy, integrating OpenTelemetry, auditing existing instrumentation, or enriching traces with metadata. Do NOT use when traces already exist and you need to debug failures (use analyze-trace-failures).` -manage-skills: `Manage orq.ai Skills (the platform entity, distinct from this repo's code-assistant skills) end-to-end — list, get, create, update, and delete Skills, plus authoring guidance (display_name, description, tags, project_id, path), governance (wiring Skills to agents via agent.skills[] or inlining via {{skill.}}), and workarounds for known platform caveats (orphaned references after delete, empty version on migrated Skills, +NEVER+ prose anti-pattern). Use when the user wants to create, audit, edit, retire, or wire up orq.ai Skills.` +manage-skills: `Manage orq.ai Skills (the platform entity, formerly Snippets — distinct from this repo's code-assistant skills) end-to-end: list, get, create, update, enable/disable, and delete Skills via the /v2/skills API. Covers authoring guidance (display_name, description, tags, project_id, path, enabled), how Skills get consumed via the {{snippet.}} placeholder in prompts and agent instructions, the reference-scan-before-delete workflow, the rename-breaks-references warning, and the +NEVER+ prose anti-pattern. 
Use when the user wants to create, audit, edit, soft-disable, or retire orq.ai Skills.` diff --git a/commands/manage-skills.md b/commands/manage-skills.md index f8891e7..4bd26c0 100644 --- a/commands/manage-skills.md +++ b/commands/manage-skills.md @@ -1,7 +1,7 @@ --- -description: Manage orq.ai Skills — list, get, create, update, or delete Skills (the platform entity) and wire them to agents -argument-hint: [list|get|create|update|delete] [name-or-id] -allowed-tools: AskUserQuestion, orq* +description: Manage orq.ai Skills — list, get, create, update, disable, or delete Skills (the platform entity, formerly Snippets) and find the prompts/agents that reference them +argument-hint: [list|get|create|update|disable|delete] [name-or-id] +allowed-tools: AskUserQuestion, mcp__orq-workspace__list_skills, mcp__orq-workspace__get_skill, mcp__orq-workspace__create_skill, mcp__orq-workspace__update_skill, mcp__orq-workspace__delete_skill, mcp__orq-workspace__search_entities, mcp__orq-workspace__get_deployment, mcp__orq-workspace__get_agent --- # Manage Skills @@ -12,17 +12,18 @@ Quick entry point into the `manage-skills` skill. Routes to the right phase base ### 1. Parse arguments -`$ARGUMENTS` may contain an action and optionally a Skill name/id: +`$ARGUMENTS` may contain an action and optionally a Skill `display_name` or `skill_id`: - `list` — Phase 1 (list / audit) - `get ` — Phase 2 (inspect a Skill) - `create` — Phase 3 (create a new Skill) -- `update ` — Phase 4 (edit an existing Skill) -- `delete ` — Phase 5 (delete + orphan cleanup) +- `update ` — Phase 4 (edit, including `enabled` / `display_name` / `instructions`) +- `disable ` — Phase 4 shortcut: flip `enabled: false` (soft-retire) +- `delete ` — Phase 5 (reference scan + delete) -If `$ARGUMENTS` is empty, ask the user which action they want via `AskUserQuestion` and offer the five choices above. +If `$ARGUMENTS` is empty, ask the user which action they want via `AskUserQuestion` and offer the six choices above. 
-If `$ARGUMENTS` contains an action that requires a name/id but none was provided (e.g., `get`, `update`, `delete`), call `list_skills` first and ask the user to pick. +If `$ARGUMENTS` contains an action that requires a name/id but none was provided (e.g., `get`, `update`, `disable`, `delete`), call `list_skills` first and ask the user to pick. ### 2. Delegate to `manage-skills` @@ -30,12 +31,14 @@ Read `skills/manage-skills/SKILL.md` and execute the matching phase. Pass the pa ### 3. Safety rails -- **Never** auto-execute `delete_skill` from this command — always route through Phase 5's two-step warn-then-confirm flow. -- **Never** auto-prune `agent.skills[]` after a delete — Phase 5 offers it as a separate consent. +- **Never** auto-execute `delete_skill` from this command — always route through Phase 5's reference-scan + warn-then-confirm flow. +- **Always** offer `enabled: false` (soft disable) as the default first step when the reference scan finds consumers. - **Always** confirm project scope before `create_skill`. +- **Always** warn before sending a `display_name` rename — it silently breaks every `{{snippet.}}` reference. ### 4. Error handling - **Auth errors** — "Authentication failed. Check that your `ORQ_API_KEY` is valid." +- **`AlreadyExists` on create** — surface the conflicting Skill (paginate `list_skills`, find by `display_name`) and offer either a renamed create or `update_skill` against the existing one. - **Skill-tool unavailable** — "The orq MCP server doesn't expose `*_skill` tools in this workspace. Falling back to REST `/v2/skills` — confirm before proceeding." - **MCP unreachable** — "Could not reach the orq.ai MCP server. 
Make sure it's configured: `claude mcp add --transport http orq-workspace https://my.orq.ai/v2/mcp --header 'Authorization: Bearer ${ORQ_API_KEY}'`" diff --git a/skills/manage-skills/SKILL.md b/skills/manage-skills/SKILL.md index a062f4d..23a3a93 100644 --- a/skills/manage-skills/SKILL.md +++ b/skills/manage-skills/SKILL.md @@ -1,76 +1,65 @@ --- name: manage-skills description: > - Manage orq.ai Skills (the platform entity) end-to-end — list, get, create, - update, and delete Skills, plus authoring guidance (display name, - description, tags, project scoping, path placement), governance (wiring - Skills to agents via `agent.skills[]`), and workarounds for known platform - caveats. Use when the user wants to create, audit, edit, retire, or wire up - orq.ai Skills. -allowed-tools: Bash, Read, Write, Edit, Grep, Glob, WebFetch, Task, AskUserQuestion, mcp__orq-workspace__list_skills, mcp__orq-workspace__get_skill, mcp__orq-workspace__create_skill, mcp__orq-workspace__update_skill, mcp__orq-workspace__delete_skill, mcp__orq-workspace__search_entities, mcp__orq-workspace__get_agent, mcp__orq-workspace__update_agent + Manage orq.ai Skills (the platform entity, formerly called Snippets) end-to-end — + list, get, create, update, enable/disable, and delete Skills, plus authoring + guidance (display name, description, tags, project scoping, path placement), + and how Skills get consumed (the `{{snippet.}}` template placeholder + inside prompts and agent instructions). Use when the user wants to create, + audit, edit, retire, or hook up orq.ai Skills. +allowed-tools: Bash, Read, Write, Edit, Grep, Glob, WebFetch, Task, AskUserQuestion, mcp__orq-workspace__list_skills, mcp__orq-workspace__get_skill, mcp__orq-workspace__create_skill, mcp__orq-workspace__update_skill, mcp__orq-workspace__delete_skill, mcp__orq-workspace__search_entities, mcp__orq-workspace__get_deployment, mcp__orq-workspace__get_agent --- # Manage Skills -You are an **orq.ai Skills lifecycle specialist**. 
Your job is the full CRUD workflow for the **Skills entity on the orq.ai platform** (sometimes referred to historically as Snippets) — not the SKILL.md files in this repo, but the user-authored Skills that live in their orq.ai workspace and get attached to agents via `agent.skills[]`. +You are an **orq.ai Skills lifecycle specialist**. Your job is the full CRUD workflow for the **Skills entity on the orq.ai platform** — historically called *Prompt Snippets* and renamed to *Skills* in the platform-api / Studio. Skills are modular, reusable instruction blocks that get inlined into prompts and agent instructions via the `{{snippet.}}` template placeholder. (The placeholder kept the legacy `snippet.` prefix for backwards compatibility.) ## Disambiguation: which "Skill" are we talking about? -This skill manages **the platform Skill entity on orq.ai** (`/v2/skills`, surfaced under Skills in the workspace and historically called Snippets). It is *not*: +This skill manages the **platform Skill entity on orq.ai** (`/v2/skills`, surfaced as Skills in the Studio, formerly Prompt Snippets). It is *not*: -- **Orq Skills (this repo):** code-assistant skills like `manage-skills` itself, distributed via the `assistant-plugins` marketplace and documented at . Those live in `skills//SKILL.md` files. -- **Anthropic / Agent Skills standard:** the cross-vendor SKILL.md format. Same shape as the repo skills above; unrelated to the platform entity. +- **Orq Skills (this repo):** code-assistant skills like `manage-skills` itself, distributed via the `assistant-plugins` marketplace and documented at . Those live in `skills//SKILL.md` files in this repo. +- **Anthropic / Agent Skills standard:** the cross-vendor SKILL.md format (same shape as the repo skills above; unrelated to the platform entity). +- **The A2A `AgentCard.skills` array on agents:** that field is AI-generated capability metadata, not a list of platform-Skill references. 
Deleting a platform Skill does **not** orphan anything in `AgentCard.skills`. When the user says "create a Skill" without context, ask which one they mean. The rest of this document is exclusively about the platform entity. -A well-managed platform Skill is: -- **Discoverable** — display name and description make it obvious when to apply it -- **Scoped** — tagged and assigned to the right project (workspace-wide only when truly reusable) -- **Versioned** — changes go through update flows, not silent overwrites (versions are stamped server-side) -- **Wired** — referenced from `agent.skills[]` on every agent that should use it, or injected into prompts via `{{skill.}}` -- **Pruned** — orphaned references and stale entries get cleaned up - ## When to use -- User wants to list, audit, or search Skills in their workspace -- User wants to create a new Skill on orq.ai -- User wants to edit an existing Skill's description, tags, instructions, or path -- User wants to delete a Skill and clean up references on agents -- User asks how to attach a Skill to an agent (`agent.skills[]`) or inject one via `{{skill.}}` -- User asks for naming, tagging, scoping, or path-placement guidance for Skills -- User hits orphaned-reference behavior after deleting a Skill (referencing agents are not auto-pruned) -- User reads a migrated Skill programmatically and gets an empty `version` +- "List the Skills in my workspace" / "audit my Skills" +- "Create a Skill called X" / "make a snippet for Y" +- "Update / rename / re-tag this Skill" +- "Disable this Skill" (soft retire) or "delete this Skill" +- "How do I reference a Skill from a prompt or agent instruction?" +- "I deleted a Skill — what breaks?" 
## When NOT to use -- **Need to build the agent itself?** → `build-agent` -- **Need to invoke a deployment or agent?** → `invoke-deployment` -- **Need to evaluate the agent that uses the Skill?** → `run-experiment` -- **Need to author/improve the prose inside `instructions`?** → `optimize-prompt` is tuned for system prompts; reuse its checks (clarity, structure) but expect manual adaptation — Skill instructions are typically shorter and more capability-scoped than a system prompt. -- **Debugging why an agent ignored a Skill?** → `analyze-trace-failures` +- **Build the agent itself?** → `build-agent` +- **Invoke a deployment or agent?** → `invoke-deployment` +- **Evaluate an agent that uses the Skill?** → `run-experiment` +- **Improve the prose inside `instructions`?** → `optimize-prompt` is tuned for system prompts; reuse its clarity heuristics but apply judgment — Skill `instructions` are typically shorter and more capability-scoped. +- **Debug why a referenced Skill isn't rendering?** → `analyze-trace-failures` ## Companion Skills -- `build-agent` — create or edit the agents that reference these Skills -- `optimize-prompt` — review prose quality for a Skill's `instructions` (apply judgment; it's prompt-shaped, not Skill-shaped) -- `run-experiment` — verify a Skill change improves agent behavior -- `analyze-trace-failures` — diagnose Skills that aren't firing in production +- `build-agent` — author the agents whose instructions reference these Skills via `{{snippet.}}` +- `optimize-prompt` — review prose quality for `instructions` +- `run-experiment` — verify a Skill change improves downstream behavior +- `analyze-trace-failures` — diagnose Skills that aren't producing the expected output in production ## Constraints -- **ALWAYS** confirm the project scope (workspace-wide vs project-scoped via `project_id`) before `create_skill`. Default to project-scoped unless the user is explicit. +- **ALWAYS** confirm the project scope (`project_id` set vs. 
workspace-wide) before `create_skill`. Default to project-scoped unless the user is explicit. - **ALWAYS** read the current Skill with `get_skill` before `update_skill` — never blind-overwrite tags, description, or instructions. -- **ALWAYS** warn the user about orphaned references after `delete_skill`, then **offer** the orphan-cleanup pass — never auto-prune without explicit consent (the pass writes to other agents and is not the user's literal delete request). -- **ALWAYS** treat empty `version` on migrated Skills as valid (display as `(unset)`, never crash). -- **NEVER** rely on `+NEVER+` (or any prose negation) inside a Skill's `instructions` as a hard guardrail. Skill instructions are *soft* hints to the model. Hard constraints belong in **MCP tool gates** (refuse the call at the tool layer). See [resources/known-caveats.md](resources/known-caveats.md). -- **NEVER** delete a Skill before listing which agents reference it — the user needs that list to decide whether to proceed. - -**Why these constraints:** Skills are shared infrastructure. A bad name pollutes the workspace, a missing project tag leaks Skills across teams, an unscoped overwrite loses someone else's edits, and an undeleted reference produces broken agents that fail at runtime — not at delete time. +- **ALWAYS** before `delete_skill`, find places that may reference the Skill via `{{snippet.}}` (other Skills' `instructions`, deployment prompts, agent instructions) and warn the user — those references will silently render to empty/missing content after the Skill is gone. +- **ALWAYS** offer `enabled: false` (soft disable) as an alternative to `delete_skill`. A disabled Skill is still resolvable and is a safer first step when you're not sure who depends on it. +- **NEVER** rely on `+NEVER+` (or any prose negation) inside `instructions` as a hard guardrail. Skill instructions are *soft* hints to the model; hard constraints belong in **MCP tool gates** (refuse the call at the tool layer). 
See [resources/known-caveats.md](resources/known-caveats.md). ## orq.ai Documentation -> **Snippets / Skills overview:** -> **Wiring Skills to Agents:** +> **Snippets (the entity, now also called Skills) overview:** +> **Using snippets in agent instructions:** (see the Snippets section) > **Code-assistant Orq Skills (disambiguation):** ### orq MCP Tools @@ -78,20 +67,19 @@ A well-managed platform Skill is: | Tool | Purpose | |------|---------| | `list_skills` | List Skills in the workspace; cursor-paginated, **no server-side filters beyond pagination** — see Pagination & Filtering below | -| `get_skill` | Fetch a single Skill by `skill_id` (returns `display_name`, `description`, `tags`, `path`, `project_id`, `instructions`, `version`, audit fields) | -| `create_skill` | Create a new Skill (`display_name`, `description`, `tags`, `path`, `project_id`, `instructions`) | -| `update_skill` | Patch an existing Skill by `skill_id` (`display_name`, `description`, `tags`, `path`, `instructions`, `project_id`) — **`version` is stamped server-side, do not pass it** | -| `delete_skill` | Delete a Skill by `skill_id` — **does NOT prune `agent.skills[]` references** (orphan-cleanup is manual; see Phase 5) | -| `search_entities` | Cross-entity search; use to find agents that may reference a Skill (verify return shape — see footnote in Phase 1) | -| `get_agent` / `update_agent` | Required for the post-delete orphan-cleanup workflow | - -> **Tool discovery:** Before the first run, list the connected MCP server's tools (e.g., `/mcp` in Claude Code, or inspect via the client) and confirm the `*_skill` tools above exist. Tool names sometimes vary by workspace or MCP server version. +| `get_skill` | Fetch a single Skill by `skill_id` (returns full Skill object) | +| `create_skill` | Create a new Skill (`display_name`, `description`, `tags`, `path`, `project_id`, `instructions`, `enabled`). 
Returns `AlreadyExists` if the `display_name` is taken in the workspace — handle that error rather than pre-checking. | +| `update_skill` | Patch an existing Skill by `skill_id` (any of: `display_name`, `description`, `tags`, `path`, `instructions`, `enabled`). PATCH semantics — only sent fields change. | +| `delete_skill` | Permanently delete a Skill by `skill_id`. Does not scrub references in prompts/agent instructions — see Phase 5. | +| `search_entities` | Used to find deployments/agents that may inline the Skill via `{{snippet.}}`; combine with `get_deployment` / `get_agent` for the actual reference scan. | + +> **Tool discovery:** Before the first run, list the connected MCP server's tools (`/mcp` in Claude Code, or inspect via the client) and confirm the `*_skill` tools above exist. Tool names sometimes vary by workspace or MCP server version. > -> **REST fallback:** All five tools are backed by `/v2/skills` REST endpoints — `GET /v2/skills` (list, cursor-paginated), `GET /v2/skills/{skill_id}`, `POST /v2/skills`, `PATCH /v2/skills/{skill_id}`, `DELETE /v2/skills/{skill_id}`. There is also `POST /v2/skills:checkDisplayNameAvailability` for pre-create name uniqueness checks. Use these directly with `Authorization: Bearer ${ORQ_API_KEY}` if the MCP tools aren't exposed in the connected workspace. Confirm exact request schemas against the workspace's OpenAPI before relying on field names. +> **REST fallback:** All five tools are backed by `/v2/skills` REST endpoints — `GET /v2/skills` (list, cursor-paginated), `GET /v2/skills/{skill_id}`, `POST /v2/skills`, `PATCH /v2/skills/{skill_id}`, `DELETE /v2/skills/{skill_id}`. Use these directly with `Authorization: Bearer ${ORQ_API_KEY}` if the MCP tools aren't exposed. ### Pagination & Filtering -`GET /v2/skills` (and the `list_skills` MCP tool) accepts **only** cursor-pagination parameters: `limit` (default 10, max 200), `starting_after`, `ending_before`. 
**There is no server-side filter for `project_id`, `tags`, `display_name`, or free text.** Any filtering by those facets must happen **client-side** after pagination, or via `search_entities` if it indexes Skills. +`GET /v2/skills` (and the `list_skills` MCP tool) accepts **only** cursor-pagination parameters: `limit` (default 10, max 200), `starting_after`, `ending_before`. **There is no server-side filter for `project_id`, `tags`, `display_name`, or free text.** Filter by those facets **client-side** after pagination, or use `search_entities` if it indexes Skills. **Pagination loop (pseudocode):** @@ -103,7 +91,7 @@ while True: all_skills.extend(page.data) if not page.has_more: break - cursor = page.data[-1].skill_id # or whatever cursor field the response exposes + cursor = page.data[-1].id # the response uses "id", not "skill_id" (it's the same value) ``` After collecting all Skills, filter in memory: @@ -115,29 +103,32 @@ tagged_skills = [s for s in all_skills if "policy" in s.tags] ### Field reference -| Field | Where | Notes | +| Field | Direction | Notes | |------|------|------| -| `display_name` | create / update / read | Human-facing label. Keep short — long names get truncated in UI. | -| `description` | create / update / read | One-line trigger description. Used by the model for retrieval. | +| `display_name` | create / update / read | Human-facing label and the **lookup key** used by `{{snippet.}}`. Regex: `^[A-Za-z0-9]+(?:[_-][A-Za-z0-9]+)*$`, max 255 chars. Must be unique within the workspace; `create_skill` returns `AlreadyExists` on conflict. | +| `description` | create / update / read | Short explanation of what the Skill does. Surfaces in the Studio's Skill picker. | | `tags` | create / update / read | Array of strings. Filtering is client-side (see above). | | `path` | create / update / read | Finder-style location, e.g. `Default/Skills` or `cs/policies`. Defaults to project's default skill folder. 
| | `project_id` | create / update / read | Optional — omit for workspace-wide. | -| `instructions` | create / update / read | The actual Skill body that the model reads. | -| `skill_id` | read / update / delete | Server-generated id. Use this for all updates and lookups. | -| `version` | read only | Stamped server-side on changes. **Do not send in `update_skill`.** May be empty on migrated Skills (treat as `(unset)`). | -| `workspace_id`, `created_at`, `updated_at`, `created_by_id`, `updated_by_id` | read only | Audit metadata. | +| `instructions` | create / update / read | The actual Skill body — modular markdown that gets inlined wherever the Skill is referenced. | +| `enabled` | create / update / read | Boolean. When `false`, `{{snippet.}}` references resolve to empty/skipped (verify behavior in your workspace). Useful as a soft-disable before delete. | +| `skill_id` | read / update / delete | Server-generated id. **The list/get response surfaces it as `id`** but the update/delete inputs take it as `skill_id`. Same value. | +| `workspace_id` | read only | Audit. | +| `created_at`, `updated_at`, `created_by_id`, `updated_by_id` | read only | Audit metadata. | + +> **Note on versioning:** The Skill object does **not** carry a `version` field. The platform records a semantic-version *activity log entry* on each create/update (visible in the Skill's history in the Studio), but you cannot read or set a version on the Skill itself. Don't ask the user "is this a major/minor/patch change?" — there's no field to write it to. -> **`{{skill.}}` injection:** Skills can also be referenced inside any prompt template via the static `{{skill.}}` placeholder, which inlines the Skill's `instructions`. This is the primary platform mechanism for sharing instruction snippets across prompts; agents using `agent.skills[]` is the runtime-selected variant. Mention both when explaining how a Skill will get used. 
+> **`{{snippet.}}` template placeholder:** The primary way Skills get consumed is by referencing them inside any prompt template or agent instruction with `{{snippet.}}`. At render time the placeholder is replaced with the Skill's `instructions`. The `snippet.` prefix is a backwards-compatibility holdover from when the entity was called Prompt Snippets — there is no `{{skill.<...>}}` equivalent. Keep this in mind when authoring user-facing copy. ## Resources - **Authoring guide** (display name, description, tags, project scoping, path): See [resources/authoring-guide.md](resources/authoring-guide.md) -- **Governance** (wiring `agent.skills[]`, ownership, lifecycle): See [resources/governance-guide.md](resources/governance-guide.md) -- **Known caveats** (orphan references, empty version on migrations, prose-negation anti-pattern): See [resources/known-caveats.md](resources/known-caveats.md) +- **Governance** (consumption patterns, ownership, lifecycle): See [resources/governance-guide.md](resources/governance-guide.md) +- **Known caveats** (template-reference scrubbing on delete, prose-negation anti-pattern): See [resources/known-caveats.md](resources/known-caveats.md) ## Prerequisites -- The orq.ai MCP server is connected (`/orq:quickstart` to verify) +- The orq.ai MCP server is connected (run the `quickstart` skill / `/orq:quickstart` to verify in Claude Code, or the equivalent onboarding flow in your assistant) - `ORQ_API_KEY` is set - The user knows which **project** the Skill belongs to (run `search_directories` if not) @@ -145,98 +136,81 @@ tagged_skills = [s for s in all_skills if "policy" in s.tags] ## Workflow -Pick the phase that matches the user's intent. Most sessions are a single phase; the **delete** phase always pairs with the orphan-cleanup workflow. +Pick the phase that matches the user's intent. Most sessions are a single phase; the **delete** phase always pairs with a reference scan. 
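The pagination loop and client-side filters from Pagination & Filtering can be sketched as runnable Python. This is a minimal sketch under stated assumptions, not the MCP client's actual interface: `fetch_page` is a hypothetical callable standing in for `list_skills` / `GET /v2/skills`, and the response is assumed to be a dict with `data` and `has_more` keys, as in the pseudocode.

```python
from typing import Any, Callable, Dict, List, Optional

Skill = Dict[str, Any]


def list_all_skills(
    fetch_page: Callable[[int, Optional[str]], Dict[str, Any]],
    limit: int = 200,
) -> List[Skill]:
    """Follow `starting_after` cursors until the server reports no more pages.

    `fetch_page(limit, starting_after)` is a hypothetical wrapper around
    `list_skills`; swap in whatever the connected client exposes.
    """
    skills: List[Skill] = []
    cursor: Optional[str] = None
    while True:
        page = fetch_page(limit, cursor)
        data = page.get("data", [])
        skills.extend(data)
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not data:
            break
        cursor = data[-1]["id"]  # list responses expose the id as "id"
    return skills


def filter_skills(
    skills: List[Skill],
    project_id: Optional[str] = None,
    tag: Optional[str] = None,
    name_contains: Optional[str] = None,
) -> List[Skill]:
    """Apply the facets the API cannot filter on, in memory."""
    matched = []
    for skill in skills:
        if project_id is not None and skill.get("project_id") != project_id:
            continue
        if tag is not None and tag not in (skill.get("tags") or []):
            continue
        if name_contains is not None and name_contains not in skill.get("display_name", ""):
            continue
        matched.append(skill)
    return matched
```

Collecting every page once and filtering afterwards keeps the number of round-trips fixed, whichever facet the user asks about next.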
### Phase 1: List / audit Use when the user wants visibility into existing Skills. 1. Call `list_skills` and **paginate to completion** (see Pagination & Filtering above). Default `limit=200` to minimize round-trips. -2. **Apply user filters client-side** — `list_skills` does not accept `project_id`, `tags`, `q`, or `display_name` filters. Examples: +2. **Apply user filters client-side** — `list_skills` does not accept `project_id` / `tags` / `q` / `display_name` filters. Examples: - "Skills in the `cs` project" → filter `project_id == ` (resolve project key → id via `search_directories` first if needed). - "Skills tagged `policy`" → filter `"policy" in s.tags`. - - "Skills whose display name contains `refund`" → substring match on `display_name`. + - "Skills whose name contains `refund`" → substring match on `display_name`. 3. Present a scannable table: ``` Skills (12) - - Customer Support Tone (project: cs, tags: tone, voice) — v3 - - Extract Receipt Fields (project: finance, tags: extraction) — v1 - - Refund Policy (workspace-wide, tags: policy, cs) — v2 ⚠ used by 4 agents + - customer-support-tone (cs, [tone, voice], path: cs/style) — enabled + - extract-receipt-fields (finance, [extraction], path: Default/Skills) — enabled + - refund-policy (workspace-wide, [policy, cs], path: Default/Skills) — DISABLED ... ``` -4. For each Skill, surface: `display_name`, project (or "workspace-wide"), `tags`, `path`, `version`, and reference count. - - **Reference count caveat:** computing this requires fanning out `get_agent` over candidate agents and inspecting their `skills[]` arrays. `search_entities` may not return `skills[]` in its summary payload — verify in the connected workspace before relying on it. If it doesn't, list agents via `search_entities(type: "agent")` and call `get_agent` per agent (cache results for the session). When the count would be expensive to compute, present it lazily on user request rather than for every row. -5. 
If the user asks "which Skill should I edit?" — show the list with usage counts and let them pick. +4. For each Skill, surface: `display_name`, project (or "workspace-wide"), `tags`, `path`, `enabled` state. **Reference counts are expensive** — they require text-searching prompts/agent instructions for `{{snippet.}}`. Compute them lazily on user request, not for every row. (See Phase 5 for the reference-scan pattern.) ### Phase 2: Get / inspect Use before any update or delete, and whenever the user asks "what does Skill X do?" 1. Call `get_skill(skill_id=...)`. -2. Display: `display_name`, `description`, `tags`, `project_id` (or "workspace-wide"), `path`, `version`, `instructions` (truncated), and the list of agents that reference it (see Phase 1 step 4 for how to compute this). -3. **Empty `version` handling:** if `version` is missing/empty (common for Skills created via Snippet→Skill migration), display `version: (unset)`. Do not error. +2. Display: `display_name`, `description`, `tags`, `project_id` (or "workspace-wide"), `path`, `enabled`, `instructions` (truncated). Mention how it's likely consumed: `{{snippet.}}` inside prompts or agent instructions. +3. If the user asks "where is this used?", run a reference scan (see Phase 5 step 1). ### Phase 3: Create Use when the user wants a new Skill. 1. **Gather inputs** via `AskUserQuestion`: - - **`display_name`** — short, descriptive (verb-noun preferred). The platform allows mixed case + underscores up to 255 chars; this repo's convention is kebab-case ≤50 chars for consistency, but recommend rather than enforce. See [authoring-guide](resources/authoring-guide.md). 
- - **`description`** — one sentence describing *when the model should use the Skill*, not what it does internally (model uses this for retrieval/selection) - - **`tags`** — at least one functional tag; reuse existing tags where possible (paginate `list_skills` first to see in-use tags) + - **`display_name`** — short, descriptive, regex `^[A-Za-z0-9]+(?:[_-][A-Za-z0-9]+)*$`, ≤255 chars on the platform. This repo's recommended convention is kebab-case ≤50 chars (recommend rather than enforce). See [authoring-guide](resources/authoring-guide.md). + - **`description`** — one sentence describing *when to apply the Skill*. Used by humans (Studio picker); not a runtime trigger. + - **`tags`** — at least one functional tag; reuse existing tags where possible (paginate `list_skills` first). - **`project_id`** — the target project's id, OR omit for workspace-wide. Default to **project-scoped**; confirm before going workspace-wide. If the user gives a project key, resolve it to an id via `search_directories`. - **`path`** — finder location for the Skill, e.g. `Default/Skills` or `policies/refunds`. Default to the project's standard Skill folder. - - **`instructions`** — the actual content the agent reads. Keep it focused on one capability. + - **`instructions`** — the actual content that will be inlined wherever the Skill is referenced. Keep it focused on one capability. + - **`enabled`** — defaults to `true` on create. Ask only if the user wants to seed a disabled Skill. 2. **Validate** before submitting: - - Name is unique within the chosen scope. **Prefer `POST /v2/skills:checkDisplayNameAvailability`** when exposed; fall back to a paginated `list_skills` scan only if the endpoint is unavailable. - Description starts with "Use when…" or describes a trigger condition. - `instructions` does NOT rely on `+NEVER+` / "always refuse" prose for hard guardrails — link the user to [known-caveats](resources/known-caveats.md) and recommend an MCP tool gate instead. 3. 
Call `create_skill` with the validated payload. -4. Echo back the new Skill's `skill_id`, `version`, and a one-line summary. Mention both ways the Skill can be consumed: - - **Runtime selection:** wire it into an agent's `agent.skills[]` (jumps to [governance](resources/governance-guide.md)). - - **Static inlining:** reference it in any prompt template via `{{skill.}}`. - Ask which path the user wants (or both). + - **Error: `AlreadyExists`** — the `display_name` is already taken in the workspace. Show the conflicting Skill (paginate `list_skills`, find by `display_name`) and offer either a renamed create or an `update_skill` against the existing one. + - **Error: project / path validation failure** — the API will return a `CodeInvalidArgument`. Re-ask for `project_id` / `path` and retry. +4. Echo back the new Skill's `id`, `path`, and a one-line summary. Tell the user how to consume it: `{{snippet.}}` inside any prompt template or agent instruction. ### Phase 4: Update Use when the user wants to edit an existing Skill. 1. **Always `get_skill` first.** Show the current state and confirm the diff the user is about to apply. -2. **Patch fields explicitly.** Only send the fields being changed (`update_skill` is a patch — don't echo back unchanged tags or `instructions` unless you have to). **Never send `version`** — it's stamped server-side. -3. **`instructions` changes:** if the user is rewriting the body, run a clarity pass first — reuse `optimize-prompt`'s heuristics (clarity, structure, no soft-constraint anti-patterns) but adapt: Skill `instructions` are typically shorter and capability-scoped, not full system prompts. -4. **Verify** by calling `get_skill` post-update and confirming the change landed (and that `version` advanced if the workspace stamps versions on every change). -5. If any agent references this Skill, mention that the next agent run will pick up the new version automatically — no `update_agent` needed. 
- -### Phase 5: Delete + orphan cleanup - -Use when the user wants to retire a Skill. **This is the most error-prone phase — follow every step.** - -The delete itself is one action; the orphan-cleanup pass is a separate, opt-in action that writes to other agents. Confirm them independently. - -1. **List referencing agents.** Use the reference-count technique from Phase 1 step 4 — paginate agents and inspect each `skills[]`. Capture this list now — once the Skill is deleted, resolving its `skill_id` back to a `display_name` gets harder. -2. **Warn and confirm the delete.** Show the user: - - The Skill's `display_name`, `skill_id`, and project scope - - The list of N agents that reference it (or "no referencing agents found") - - The orphan-reference behavior: agents will retain a dangling `skill_id` until pruned manually - Ask: *"Delete this Skill? (You'll be asked separately whether to prune orphan references on the N agents.)"* Default to **cancel** if the user hesitates. -3. **Delete.** On confirmation, call `delete_skill(skill_id=...)`. -4. **Offer the orphan-cleanup pass.** Only if step 1 found ≥1 referencing agents, ask: *"Prune the deleted Skill's id from these N agents' `skills[]` arrays now?"* This is a **second, explicit consent** — never auto-prune. - - If the user says **yes**, run the cleanup. The exact shape of `agent.skills[]` entries (id string vs `{ id }` object vs `{ key }` object) varies by agent schema version, so **mirror what `get_agent` returned** rather than assuming a shape: - ```text - skill_id = - for agent in referencing_agents: - current = get_agent(key=agent.key) - pruned = [ - entry for entry in current.skills - if extract_skill_id(entry) != skill_id # entry may be a string id or an object with id/key - ] - update_agent(key=agent.key, skills=pruned) - # verify: re-get and confirm skill_id is no longer present - ``` - Verify each `update_agent` returns success before moving on. 
- - If the user says **no** or wants to defer, summarize the orphan list and recommend they prune later (or hand off to whoever owns those agents). -5. **Report.** Summarize: Skill deleted; orphan cleanup either completed (N agents pruned) or skipped (N agents still reference the deleted id — list them). -6. Note in your reply *why* the extra step exists (auto-prune isn't on the platform yet, so the cleanup is manual and gated behind explicit consent). +2. **Patch fields explicitly.** Only send the fields being changed (`update_skill` is a patch — don't echo back unchanged tags or `instructions`). +3. **`display_name` rename — DANGER.** The `display_name` IS the lookup key for `{{snippet.}}`. Renaming it silently breaks every prompt or agent instruction that references the old name. Before sending a rename, run the reference scan from Phase 5 step 1 and warn the user. Offer to update the references in the same session. +4. **`enabled: false`** — flipping a Skill to disabled is the soft-retirement path. Existing references stop resolving (verify in your workspace what they render to — empty string, missing, or pass-through). Recommend this as the default first step when retiring; reserve `delete_skill` for actual cleanup. +5. **`instructions` changes:** if the user is rewriting the body, run a clarity pass first — reuse `optimize-prompt`'s heuristics (clarity, structure, no soft-constraint anti-patterns) but adapt; Skill `instructions` are typically shorter and capability-scoped, not full system prompts. +6. **Verify** by calling `get_skill` post-update and confirming the change landed. + +### Phase 5: Delete (with reference scan) + +Use when the user wants to permanently retire a Skill. **`delete_skill` is irreversible** and does not scrub `{{snippet.}}` references elsewhere — those references silently fail to resolve after delete. Always offer `enabled: false` first (Phase 4 step 4) and only proceed to delete when the user is sure. + +1. 
**Reference scan.** Find places that may reference the Skill by its `display_name`: + - Run `search_entities` to enumerate prompts, deployments, agents, and other Skills in the workspace. + - For each candidate, fetch its full body (`get_deployment` for deployments; `get_agent` for agents; `get_skill` for other Skills' `instructions`) and grep the body for `{{snippet.}}` (case-sensitive — match the Skill's exact `display_name`). + - Note: this scan can be expensive in large workspaces. Cache results within the session. + - If the user has a faster way to grep their workspace (e.g., a synced repo of prompts), prefer that. +2. **Warn and confirm.** Show the user: + - The Skill's `display_name`, `id`, project scope, and `enabled` state. + - The list of references found (or "no references found in scanned entities — but the scan only covers prompts/agents/Skills surfaced via `search_entities`; manual checks may be needed"). + - **The two-option choice:** *"(a) Soft-disable now (`enabled: false`) and revisit in N days, or (b) hard-delete and accept that any reference I missed will silently fail to render?"* Default to (a) when the scan found references; default to (b) only when the scan was comprehensive AND empty AND the user has confirmed. +3. **If the user picks delete:** call `delete_skill(skill_id=...)`. Confirm the API success. +4. **Report.** Summarize: Skill deleted (or disabled); references that the user should manually check or update; recommended follow-up if any. See [resources/known-caveats.md](resources/known-caveats.md) for the full caveat context. 
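The "grep the body" step of the reference scan above can be sketched as follows. This is a minimal sketch, not the platform's own resolver: it assumes the entity bodies have already been fetched (via `get_deployment`, `get_agent`, `get_skill`) into a dict, and the function names and the `bodies` mapping are hypothetical. Tolerating whitespace inside the braces is also an assumption; tighten the pattern if your workspace only renders the exact form.

```python
import re
from typing import Dict, List


def snippet_pattern(display_name: str) -> "re.Pattern":
    """Build a case-sensitive pattern for one Skill's placeholder.

    The closing braces are part of the pattern, so scanning for
    `refund-policy` does not match a placeholder for `refund-policy-eu`.
    Optional whitespace inside the braces is an assumption.
    """
    return re.compile(r"\{\{\s*snippet\." + re.escape(display_name) + r"\s*\}\}")


def find_references(display_name: str, bodies: Dict[str, str]) -> List[str]:
    """Return the keys of all bodies that reference the Skill, sorted.

    `bodies` maps a caller-chosen label (e.g. "agent:support") to the
    prompt or instruction text fetched for that entity.
    """
    pattern = snippet_pattern(display_name)
    return sorted(key for key, body in bodies.items() if pattern.search(body))
```

An empty result only means that no reference was found in the bodies passed in; it says nothing about entities the scan never fetched.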
@@ -244,15 +218,16 @@ See [resources/known-caveats.md](resources/known-caveats.md) for the full caveat ## Done When -- The user's intent (list / get / create / update / delete) is fully resolved -- Any `delete_skill` was followed by an offered (and either completed or explicitly deferred) orphan-cleanup pass — never silently skipped -- `instructions` changes were sanity-checked for clarity and for prose-negation anti-patterns before save -- New or updated Skills have a non-empty `description`, at least one tag, an explicit project scope, and a sensible `path` -- The user has a clear pointer to where the Skill is wired in (`agent.skills[]` and/or `{{skill.}}`) — or a follow-up step to wire it +- The user's intent (list / get / create / update / disable / delete) is fully resolved. +- Any `delete_skill` was preceded by a reference scan AND an explicit choice to delete-rather-than-disable. +- `display_name` renames are gated behind a reference scan and the user understands the breakage risk. +- `instructions` changes were sanity-checked for clarity and for prose-negation anti-patterns before save. +- New or updated Skills have a non-empty `description`, at least one tag, an explicit project scope, and a sensible `path`. +- The user has a clear pointer to how the Skill is (or will be) consumed: `{{snippet.}}` inside prompts or agent instructions. ## Open in orq.ai - **Skills index:** [my.orq.ai](https://my.orq.ai/) → Skills -- **Agent skill bindings:** [my.orq.ai](https://my.orq.ai/) → Agents → (select agent) → Skills tab +- **Studio:** [my.orq.ai](https://my.orq.ai/) → Studio (Skills appear in the snippet/skill picker when authoring prompts and agents) When this skill conflicts with live API responses or docs.orq.ai, trust the API. 
diff --git a/skills/manage-skills/resources/authoring-guide.md b/skills/manage-skills/resources/authoring-guide.md index d1a2cd4..27fa052 100644 --- a/skills/manage-skills/resources/authoring-guide.md +++ b/skills/manage-skills/resources/authoring-guide.md @@ -1,60 +1,61 @@ # Authoring Guide: Display Name, Description, Tags, Project Scope, Path -How to author an orq.ai Skill so it's discoverable, scoped correctly, and picked up by the right agents. +How to author an orq.ai Skill so it's discoverable, scoped correctly, and renders cleanly wherever it's referenced. --- -## `display_name` +## `display_name` (the lookup key) -The Skill's `display_name` is the primary handle agents and humans use to refer to the Skill. It should be unambiguous on its own. +`display_name` is both the human-facing label AND the lookup key used by `{{snippet.}}` placeholders. Pick it carefully — renaming it after consumers exist silently breaks every reference. See [known-caveats.md](known-caveats.md). -**Platform constraints:** the API allows mixed case and underscores up to 255 characters (the canonical regex used by the platform is roughly `^[A-Za-z0-9]+(?:[_-][A-Za-z0-9]+)*$`). 
+**Platform constraints (enforced):** +- Regex: `^[A-Za-z0-9]+(?:[_-][A-Za-z0-9]+)*$` (alphanumeric with optional single dash/underscore separators) +- Max 255 characters +- Must be unique within the workspace — `create_skill` returns `AlreadyExists` on conflict -**This repo's recommended convention** (a stricter subset that keeps lists scannable): +**This repo's recommended convention** (a stricter subset that keeps lists scannable and placeholders readable): - **kebab-case**, lowercase, ASCII only — e.g., `extract-receipt-fields` -- **≤50 characters** — long names get truncated in agent configs and UI tables +- **≤50 characters** — long names get truncated in Studio tables and bloat placeholders - **Verb-noun preferred** — `summarize-ticket`, `classify-intent`, `extract-pii` - **Avoid generic verbs alone** — `handle-thing`, `do-task`, `process` say nothing -- **No version suffixes in the name** — `summarize-ticket-v2` is an anti-pattern; the platform stamps `version` on the Skill itself -- **Unique within scope** — names must be unique within a project (and across the workspace for workspace-wide Skills); use `POST /v2/skills:checkDisplayNameAvailability` before create +- **No version suffixes** — `summarize-ticket-v2` is an anti-pattern; treat the Skill itself as the unit of change and rely on the activity log for history -These are recommendations, not enforced by the API — diverge if a stronger convention already exists in the workspace, but stay consistent. +These are recommendations, not enforced by the API. Diverge if a stronger convention already exists in the workspace, but stay consistent. 
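The platform rules and this repo's stricter convention can be checked locally before calling `create_skill`. The sketch below is illustrative only: `check_display_name` is a hypothetical helper, the platform regex and length limit are copied from the constraints above, and the kebab-case and version-suffix checks encode this repo's recommendations, which the API does not enforce.

```python
import re
from typing import List, Tuple

# Copied from the platform constraints listed above.
PLATFORM_NAME_RE = re.compile(r"^[A-Za-z0-9]+(?:[_-][A-Za-z0-9]+)*$")
PLATFORM_MAX_LEN = 255

# This repo's recommended subset: lowercase kebab-case, at most 50 characters.
REPO_NAME_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
REPO_MAX_LEN = 50


def check_display_name(name: str) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for a candidate display_name.

    Errors mirror what the platform is documented to reject; warnings are
    convention-only and should be recommended, not enforced.
    """
    errors: List[str] = []
    warnings: List[str] = []

    if len(name) > PLATFORM_MAX_LEN:
        errors.append(f"longer than {PLATFORM_MAX_LEN} characters")
    # fullmatch avoids accepting a trailing newline.
    if not PLATFORM_NAME_RE.fullmatch(name):
        errors.append("does not match the platform regex")

    if not errors:
        if len(name) > REPO_MAX_LEN:
            warnings.append(f"longer than the recommended {REPO_MAX_LEN} characters")
        if not REPO_NAME_RE.fullmatch(name):
            warnings.append("not lowercase kebab-case")
        if re.search(r"-v\d+$", name):
            warnings.append("looks like a version suffix")

    return errors, warnings
```

Uniqueness cannot be checked this way; that still surfaces as `AlreadyExists` from `create_skill`.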
**Good (recommended convention):** -- `extract-invoice-line-items` +- `extract-invoice-line-items` → referenced as `{{snippet.extract-invoice-line-items}}` - `redact-pii-from-transcript` - `format-currency-eur` **Bad:** - `helper` (too vague) -- `the-skill-that-handles-customer-support-emails-with-tone-checking` (too long) -- `summarize-ticket-v2` (version belongs on the Skill, not in the name) +- `the-skill-that-handles-customer-support-emails-with-tone-checking` (too long; ugly in placeholders) +- `summarize-ticket-v2` (version belongs in the activity log) --- ## `description` -The `description` is what the **model** reads when deciding whether to apply the Skill. Optimize for retrieval, not for human marketing copy. +`description` is human-facing copy shown in the Studio's Skill picker and audit views. **It is not a runtime trigger** — Skills are inlined wherever a `{{snippet.}}` placeholder exists in a prompt/agent instruction; the model doesn't pick them based on description. **Rules:** -- **Lead with the trigger condition** — start with "Use when…" or "Apply when…" -- **Name the input and the output** — e.g., "Use when given a raw email body. Returns a JSON object with sender, subject, and intent." -- **One sentence.** Skills with paragraph descriptions get truncated in agent prompts. -- **Avoid implementation detail.** The model doesn't need to know which library you use. -- **Avoid "always" / "never" / "must"** — those are constraints, not triggers. Put hard rules in tool gates, not Skill descriptions. +- **One sentence.** Keep it scannable. +- **Lead with what the Skill does**, not how. Implementation detail belongs in `instructions`. +- **Mention the intended consumer** if it's not obvious from the name — e.g., "Reusable PII redaction block for customer-support agents." +- **Avoid "always" / "never" / "must"** — those are constraints, not descriptions. Hard rules belong in tool gates, not in description text. 
**Good:** -> Use when the user provides a receipt image or PDF. Extracts merchant, total, tax, and line items into structured JSON. +> Reusable receipt-extraction snippet — extracts merchant, total, tax, and line items into structured JSON. Inline in any prompt that processes receipt images or PDFs. **Bad:** > This skill is a powerful tool that helps you handle receipts in many different formats using OCR. -> *(no trigger, marketing voice, implementation leak)* +> *(no concrete output, marketing voice, implementation leak)* --- ## `tags` -Tags are how Skills get grouped in the UI and how callers narrow `list_skills` output **client-side** (`GET /v2/skills` does not accept a `tags` filter — paginate, then filter in memory). Good tagging makes a workspace navigable; bad tagging makes Skills invisible. +Tags group Skills in the Studio and let callers narrow `list_skills` output **client-side** (`GET /v2/skills` does not accept a `tags` filter — paginate, then filter in memory). Good tagging makes a workspace navigable; bad tagging makes Skills invisible. **Rules:** - **At least one tag.** Untagged Skills are easy to lose in long lists. @@ -62,7 +63,7 @@ Tags are how Skills get grouped in the UI and how callers narrow `list_skills` o - **Two axes of tagging are usually enough:** - **Functional** — what the Skill *does*: `extraction`, `summarization`, `classification`, `formatting`, `tone`, `policy` - **Domain** — where it applies: `finance`, `cs` (customer support), `legal`, `internal` -- **Avoid agent-specific tags.** A tag like `used-by-checkout-agent` becomes wrong the moment a second agent adopts the Skill — use `agent.skills[]` for that wiring instead. +- **Avoid consumer-specific tags.** A tag like `used-by-checkout-agent` becomes wrong the moment a second consumer adopts the Skill — use the reference scan in [governance-guide.md](governance-guide.md#finding-the-consumers-of-a-skill) to find consumers on demand. - **Lowercase, kebab-case** for consistency. 
**Recommended tag count:** 1–4 tags per Skill. More than 5 tags usually means the Skill is doing too many things. @@ -71,9 +72,9 @@ Tags are how Skills get grouped in the UI and how callers narrow `list_skills` o ## `project_id` (project scoping) -Every Skill is either **project-scoped** (`project_id` set to a project's id) or **workspace-wide** (`project_id` omitted). Workspace-wide Skills are visible to every agent across the workspace. +Every Skill is either **project-scoped** (`project_id` set to a project's id) or **workspace-wide** (`project_id` omitted). Workspace-wide Skills are visible to every consumer across the workspace. -**Default to project-scoped.** Workspace-wide Skills are shared infrastructure — every workspace member can see them, every agent can pull them in, and a bad edit affects everyone. +**Default to project-scoped.** Workspace-wide Skills are shared infrastructure — every workspace member can see them, every prompt can reference them, and a bad edit affects everyone. **When project-scoped is right:** - The Skill encodes project-specific business logic (e.g., a refund policy that only applies to the EU project) @@ -82,14 +83,14 @@ Every Skill is either **project-scoped** (`project_id` set to a project's id) or **When workspace-wide is right:** - The Skill is genuinely reusable across teams and projects (e.g., `redact-pii`, `format-currency`) -- The Skill has stabilized — at least one minor version, used by ≥2 agents, no recent breaking changes +- The Skill has stabilized — no recent breaking changes, used by ≥2 consumers - Ownership is clear (named owner in the description or `owner:` tag) **How to choose:** 1. Start project-scoped (set `project_id`). -2. After the Skill has been stable for ≥2 weeks and used by ≥2 agents in the same project, ask: "would another project benefit from this?" -3. If yes, **create a copy** with `project_id` omitted (workspace-wide). 
Don't move — agents in the original project still reference the project-scoped `skill_id`. Sunset the original after agents are re-wired. +2. After the Skill has been stable for ≥2 weeks and used by ≥2 consumers in the same project, ask: "would another project benefit from this?" +3. If yes, **create a copy** with `project_id` omitted (workspace-wide). Don't move — existing references still point at the project-scoped `display_name`. Sunset the original after consumers are re-pointed. > **Resolving project keys → ids:** if the user gives you a project key/name, run `search_directories` to convert it to the `project_id` value the API expects. @@ -97,19 +98,31 @@ Every Skill is either **project-scoped** (`project_id` set to a project's id) or ## `path` -`path` is the finder-style location of the Skill inside its project (e.g., `Default/Skills`, `cs/policies`, `finance/extraction`). It controls where the Skill appears in the UI's folder tree. +`path` is the finder-style location of the Skill inside its project (e.g., `Default/Skills`, `cs/policies`, `finance/extraction`). It controls where the Skill appears in the Studio's folder tree. **Rules:** - **Default to the project's standard Skill folder** (often `Default/Skills`) unless the team has an explicit folder convention. -- **Mirror existing folders.** Paginate `list_skills` and reuse paths already in the target project — divergent paths fragment the UI. +- **Mirror existing folders.** Paginate `list_skills` and reuse paths already in the target project — divergent paths fragment the Studio. - **Use slashes, not backslashes**, and keep segment names short and descriptive. - **Group by purpose, not by owner.** Folder-by-team becomes wrong the moment a Skill moves teams; folder-by-purpose ages better. --- +## `enabled` + +`enabled` is a boolean that defaults to `true` on create. 
When `false`, the Skill is preserved in the workspace but `{{snippet.}}` references stop resolving (verify the exact render behavior in your workspace — empty, pass-through, or skip). + +**When to seed `enabled: false`:** +- You're staging the Skill for review before any consumer points at it. +- You're setting up parallel versions for a controlled cutover. + +In practice you almost always create with `enabled: true` (the default) and use `enabled` later as the soft-retirement lever (see [governance-guide.md](governance-guide.md#retire)). + +--- + ## `instructions` (the Skill body) -`instructions` is the actual content the agent reads (and what `{{skill.}}` inlines into prompts). Keep it: +`instructions` is the actual content that gets inlined wherever the Skill is referenced. Keep it: - **Focused on one capability.** If you find yourself writing "and also…", split into two Skills. - **Specific.** Include 1–2 input/output examples. - **Free of hard constraints expressed as prose.** Don't write "NEVER do X" or "you MUST refuse Y" — those are soft hints, not enforcement. See [known-caveats.md](known-caveats.md#anti-pattern-never-prose-constraints-in-instructions). diff --git a/skills/manage-skills/resources/governance-guide.md b/skills/manage-skills/resources/governance-guide.md index 6234187..dae65d3 100644 --- a/skills/manage-skills/resources/governance-guide.md +++ b/skills/manage-skills/resources/governance-guide.md @@ -1,73 +1,48 @@ -# Governance Guide: Wiring Skills to Agents, Ownership, Lifecycle +# Governance Guide: Consumption, Ownership, Lifecycle -How Skills get attached to agents, who owns them, and how they retire. +How Skills get consumed, who owns them, and how they retire. --- -## Two ways a Skill gets used +## How a Skill gets consumed -A platform Skill can reach the model in two ways. Pick (or combine) deliberately: +A platform Skill reaches the model in exactly one way: as a **template placeholder inside a prompt or agent instruction**. 
Anywhere a prompt template is rendered (deployments, agent system prompts, other Skills), the placeholder -1. **`agent.skills[]` — runtime selection.** The Skill is attached to an agent. The model sees its `description` (and a stub) and decides at runtime whether to apply it. Good for capability-shaped Skills the agent might or might not need on a given turn. -2. **`{{skill.}}` — static template inlining.** Anywhere a prompt template is rendered (deployments, agent system prompts, snippets), `{{skill.}}` expands to the Skill's `instructions` at render time. Good for shared instruction blocks that should always be present in a particular prompt. - -The same Skill can be consumed both ways — they don't conflict. - ---- - -## Wiring Skills to Agents (`agent.skills[]`) - -Skills don't fire on their own (in the runtime-selection mode) — an agent has to reference them. The reference lives on the agent in the `skills[]` array. - -### Adding a Skill to an agent - -1. `get_agent(key=)` — capture the current `skills[]` list and inspect its entry shape. -2. Append the new Skill's reference to the list, **mirroring the existing entry shape** (entries may be plain `skill_id` strings or objects with `id`/`key` fields depending on the agent schema version — pattern-match what `get_agent` returned). -3. `update_agent(key=, skills=)`. -4. Verify with `get_agent` that the new entry is present. - -> **Schema note:** The exact shape of `skills[]` entries varies by agent schema version. Always read first, mirror the shape on write. Hard-coding `{ id: ... }` or a bare string can silently corrupt the agent config in workspaces using the other shape. - -### Removing a Skill from an agent +```text +{{snippet.}} +``` -Same pattern, but filter out the unwanted Skill before `update_agent`. **This is also the workaround for the orphan-reference behavior** when a Skill is deleted — see [known-caveats.md](known-caveats.md). +is replaced with the Skill's `instructions` at render time. 
The `snippet.` prefix is a backwards-compatibility holdover from when the entity was called Prompt Snippets — there is no `{{skill.<...>}}` equivalent today. -### Bulk wiring +You can chain references: a Skill's `instructions` may itself contain `{{snippet.}}` placeholders that the renderer expands recursively. -If a Skill needs to attach to many agents at once, list candidates first with `search_entities(type: "agent")`, ask the user to confirm the list, then iterate. Never blanket-attach without explicit confirmation — Skills change agent behavior in ways that are hard to roll back without trace analysis. +**Implications:** +- The `display_name` is load-bearing. Renaming a Skill changes the lookup key and silently breaks every existing reference. See [known-caveats.md](known-caveats.md). +- There is no per-agent attachment list. The relationship between an agent and a Skill is implicit — it lives in the agent's `instructions` text, not in a structured `skills[]` array on the agent. (The `AgentCard.skills` field is unrelated AI-generated capability metadata. See [known-caveats.md](known-caveats.md).) +- Disabling a Skill (`enabled: false`) takes effect at render time; existing references stop pulling in the `instructions`. Verify the exact behavior in your workspace — it may render to empty, pass-through, or skip silently. --- -## Inlining Skills in prompts (`{{skill.}}`) +## Finding the consumers of a Skill -For a Skill that should always be present in a particular prompt — a brand voice block, a refund policy snippet, a formatting rule — reference it by key inside the prompt template: +There is no `list_consumers(skill_id)` API. To find every place a Skill is referenced, you have to text-search the rendered surface: ```text -You are a customer-support assistant. - -{{skill.brand-voice}} - -{{skill.refund-policy-eu}} - -User: {{message}} +1. Enumerate candidates with search_entities (deployments, prompts, agents, other Skills). +2. 
For each candidate, fetch its full body with the appropriate get_* tool. +3. Substring-match {{snippet.}} (case-sensitive) in the body. +4. Collect the matches. ``` -At render time, the placeholder is replaced with the Skill's `instructions`. Updating the Skill updates every prompt that inlines it — useful for shared infrastructure, dangerous if you forget which prompts depend on it. - -**Audit pattern:** when editing a workspace-wide Skill's `instructions`, search prompts/deployments for `{{skill.}}` to find the blast radius before saving. - ---- - -## How agents select Skills at runtime - -When a Skill is wired via `agent.skills[]`, the model picks Skills based on the Skill **description** — *not* the name or tags. This is why authoring guidance pushes "Use when…" descriptions: they're the retrieval surface. +**When to do the scan:** +- Before `delete_skill` (mandatory). +- Before renaming `display_name` (mandatory). +- When the user asks "where is this Skill used?" +- When auditing workspace-wide Skills for ownership / sunset candidates. -Implications: -- A great `instructions` body with a vague description will rarely fire. -- Two Skills with similar descriptions cause the model to pick non-deterministically. -- Wiring 20+ Skills to a single agent dilutes the model's selection accuracy — keep `skills[]` lean. - -**Rule of thumb:** ≤8 Skills per agent. If you need more, the agent is probably doing too many things — split it. +**Cost considerations:** +- The scan is O(N entities × body size). Cache results within the session. +- For large workspaces, prefer a synced repo grep if the team has one — much cheaper than fanning out HTTP requests. --- @@ -78,51 +53,53 @@ There is no first-class "owner" field on a Skill today. Establish ownership conv - **Tag** — add an `owner:` tag (e.g., `owner:cs-team`) to workspace-wide Skills. - **Description** — for project-scoped Skills, ownership is implicit in the project. 
For workspace-wide, mention the owning team in the description's trailing context if it matters for incident response. -Audit unowned workspace-wide Skills periodically (paginate `list_skills`, filter `project_id is None` client-side, then look for missing `owner:` tags) — anything without one is a candidate for review. +Audit unowned workspace-wide Skills periodically: paginate `list_skills`, filter `project_id is None` client-side, then look for missing `owner:` tags. --- ## Lifecycle: Create → Iterate → Stabilize → Retire ### Create -Always start project-scoped (set `project_id`). Describe the trigger precisely. Wire to one agent first and verify in traces. +Always start project-scoped (set `project_id`). Wire one consumer first (a single deployment or agent instruction) and verify the rendered output before broadening. ### Iterate -- Iterate on `instructions` and `description`, not `display_name`. Name changes break references in prompts and in any user docs. +- Iterate on `instructions` and `description`, not `display_name`. **Renaming `display_name` breaks every `{{snippet.}}` reference.** See [known-caveats.md](known-caveats.md) for the rename workflow. - Sanity-check `instructions` rewrites (clarity, structure, no prose-negation anti-patterns) — see `optimize-prompt` for prose heuristics, but apply judgment: Skill `instructions` are usually shorter and more capability-scoped than a system prompt. -- After each meaningful change, run `run-experiment` against the agent that uses the Skill to confirm the change improves (or at least doesn't regress) behavior. +- After each meaningful change, run `run-experiment` against an agent or deployment that consumes the Skill to confirm the change improves (or at least doesn't regress) behavior. 
### Stabilize A Skill is stable when: - It hasn't had an `instructions` change in ≥2 weeks -- It's referenced by ≥2 agents (or 1 production agent), or inlined in ≥2 prompts +- It's referenced by ≥2 prompts/agents - No open incidents tag the Skill as a contributor -At that point, consider promoting to workspace-wide if it's broadly reusable. See [authoring-guide.md](authoring-guide.md#project-scoping). +At that point, consider promoting to workspace-wide if it's broadly reusable. See [authoring-guide.md](authoring-guide.md#project_id-project-scoping). ### Retire Retire a Skill when: -- The agent(s) using it are decommissioned, OR +- The prompts/agents using it are decommissioned, OR - A replacement Skill covers the same capability better **Retirement workflow:** -1. Identify all referencing agents (`search_entities(type: "agent")` + per-agent `get_agent` fanout) AND all prompts inlining `{{skill.}}`. -2. For each consumer, decide: replace (swap in the new Skill) or remove (no replacement needed). -3. Wire replacements before deleting the old Skill, not after — atomicity matters. -4. Run `delete_skill`. -5. Run the orphan-cleanup pass on every referencing agent (see SKILL.md Phase 5). +1. Run the reference scan (above) to identify every consumer of the Skill. +2. Decide per consumer: replace (point them at the new Skill name) or remove (drop the placeholder). +3. **Disable first, delete later.** Set `enabled: false` and wait at least one full traffic cycle (a day, a week — depends on how the prompts run). If nothing breaks, proceed to delete. If something breaks, re-enable, investigate, fix the missed reference. +4. **Wire replacements before deleting**, not after — atomicity matters. +5. Run `delete_skill`. 6. Note retirement in the workspace changelog if your team keeps one. 
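Step 1 of the retirement workflow depends on the reference scan, so a minimal Python sketch of the matching step may help reviewers. It assumes the candidate bodies have already been fetched (for example via `get_deployment` / `get_agent` / `get_skill`) and that the placeholder has the literal form `{{snippet.<display_name>}}`, as the f-string in the known-caveats pseudocode suggests. The function names and the `bodies` mapping are hypothetical and are not part of the plugin or the orq.ai API.

```python
from typing import Dict, List


def placeholder_for(display_name: str) -> str:
    """Build the literal placeholder text, e.g. '{{snippet.refund-policy}}'."""
    return "{{snippet." + display_name + "}}"


def find_references(display_name: str, bodies: Dict[str, str]) -> List[str]:
    """Return the ids of entities whose body contains the placeholder.

    `bodies` maps a hypothetical entity id to its already-fetched text
    (prompt template, agent instructions, or another Skill's instructions).
    The match is an exact, case-sensitive substring test.
    """
    needle = placeholder_for(display_name)
    return sorted(entity_id for entity_id, body in bodies.items() if needle in body)


if __name__ == "__main__":
    sample = {
        "deployment:support-bot": "You are helpful.\n{{snippet.refund-policy}}",
        "agent:billing": "Follow {{snippet.refund-policy-eu}} strictly.",
        "skill:tone": "Be concise. {{snippet.Refund-Policy}}",
    }
    # Only the exact, case-sensitive placeholder matches.
    print(find_references("refund-policy", sample))
```

Matching on the full placeholder, closing braces included, keeps `refund-policy` from matching `refund-policy-eu`. The sketch does not handle whitespace variants inside the braces; whether the renderer accepts those is not stated in the patch.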
+The platform records a semantic-version *activity log entry* on each create/update (visible in the Skill's history view), but there is no `version` field on the Skill object — don't invent a versioning workflow that pretends otherwise. + --- ## Audit checklist Periodic Skills audit (suggested quarterly): -- [ ] Any workspace-wide Skill with no `owner:` tag? — assign or move to project-scoped -- [ ] Any Skill not referenced by any agent and not inlined in any prompt? — candidate for deletion (or future intent — confirm with owner) -- [ ] Any agent with a `skill_id` in `skills[]` that no longer resolves? — orphan from a past delete, prune via `update_agent` -- [ ] Any agent with >8 entries in `skills[]`? — agent overload, consider splitting -- [ ] Any two Skills with near-duplicate descriptions? — selection ambiguity, consolidate -- [ ] Any Skill `instructions` containing `NEVER`, `MUST NOT`, or "you must refuse"? — prose-negation anti-pattern, replace with MCP tool gate (see [known-caveats.md](known-caveats.md#anti-pattern-never-prose-constraints-in-instructions)) +- [ ] Any workspace-wide Skill with no `owner:` tag? — assign or move to project-scoped. +- [ ] Any Skill with no `{{snippet.}}` references in scanned entities? — candidate for `enabled: false`, then deletion. +- [ ] Any Skill with `enabled: false` for >30 days and no recent toggles? — candidate for deletion. +- [ ] Any prompt/agent instruction with a `{{snippet.}}` placeholder whose target Skill no longer exists? — orphan reference; either restore the Skill, point the placeholder at a replacement, or remove the placeholder. +- [ ] Any two Skills with near-duplicate `instructions`? — consolidate; rename references. +- [ ] Any Skill `instructions` containing `NEVER`, `MUST NOT`, or "you must refuse"? — prose-negation anti-pattern, replace with MCP tool gate (see [known-caveats.md](known-caveats.md#anti-pattern-never-prose-constraints-in-instructions)). 
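The ownership check in the audit (workspace-wide Skills with no `owner:` tag) can be sketched as a client-side filter over an already-paginated `list_skills` result. This is a hedged illustration: it assumes each Skill arrives as a dict with `display_name`, `project_id`, and a list of string `tags`, which the patch implies but does not specify. `unowned_workspace_skills` is a hypothetical helper name.

```python
from typing import Any, Dict, Iterable, List

OWNER_PREFIX = "owner:"


def unowned_workspace_skills(skills: Iterable[Dict[str, Any]]) -> List[str]:
    """Return display names of workspace-wide Skills lacking an owner tag.

    A Skill is treated as workspace-wide when `project_id` is absent or None,
    and as owned when any tag starts with 'owner:' followed by a non-empty value.
    """
    unowned = []
    for skill in skills:
        is_workspace_wide = skill.get("project_id") is None
        tags = skill.get("tags") or []
        has_owner = any(
            tag.startswith(OWNER_PREFIX) and len(tag) > len(OWNER_PREFIX)
            for tag in tags
        )
        if is_workspace_wide and not has_owner:
            unowned.append(skill["display_name"])
    return unowned
```

Project-scoped Skills are skipped on purpose, since the guide treats their ownership as implicit in the project.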
diff --git a/skills/manage-skills/resources/known-caveats.md b/skills/manage-skills/resources/known-caveats.md index a2ba9ac..f87cb0b 100644 --- a/skills/manage-skills/resources/known-caveats.md +++ b/skills/manage-skills/resources/known-caveats.md @@ -1,84 +1,81 @@ # Known Caveats and Anti-Patterns -Active platform behaviors and authoring anti-patterns to handle until they're addressed upstream. +Active platform behaviors and authoring anti-patterns to handle while working with Skills. --- -## Orphaned `agent.skills[]` references after delete +## `delete_skill` does not scrub `{{snippet.}}` references -**Status:** Manual cleanup required +**Status:** Manual reference scan required ### Symptom -After calling `delete_skill(skill_id=X)` (or `DELETE /v2/skills/{X}`), agents that referenced the deleted Skill still have its `skill_id` in their `agent.skills[]` array. The platform does not auto-prune. +`delete_skill` removes the Skill entity from the workspace. It does **not** rewrite or null out `{{snippet.}}` placeholders that were referencing the deleted Skill from elsewhere — other Skills' `instructions`, deployment prompt templates, agent instructions, etc. -At runtime, those dangling ids: -- Are silently ignored in some agent versions (best case) -- Cause "skill not found" errors in agent runs (worst case) +After the delete, any leftover `{{snippet.}}` placeholder will silently render to empty / pass-through (the exact behavior depends on the workspace's template engine and excluded-prefix configuration). The result is a prompt that looks correct but is missing a chunk of intended content. There is no error, no log, no UI banner — just a silently degraded prompt. -Either way, the agent config drifts out of sync with reality and the orphan accumulates until manually cleaned. 
- -### Workaround (mandatory) +### Workaround -Always pair `delete_skill` with an orphan-cleanup pass: +**Always run a reference scan before `delete_skill`**, and prefer `enabled: false` (soft disable) as a first step: ```text -skill_id = -referencing_agents = -# Compute via search_entities + per-agent get_agent fanout if search_entities -# does not return skills[] in its summary payload (verify in the workspace). - -for agent in referencing_agents: - current = get_agent(key=agent.key) - pruned = [ - entry for entry in current.skills - if extract_skill_id(entry) != skill_id # mirror whatever shape get_agent returned - ] - update_agent(key=agent.key, skills=pruned) - # verify: re-get and confirm skill_id is gone +# 1. Enumerate candidate consumers +candidates = search_entities() # prompts, deployments, agents, other Skills + +# 2. For each candidate, fetch its full body and look for the placeholder +references = [] +for entity in candidates: + body = fetch_full_body(entity) # get_deployment / get_agent / get_skill etc. + if f"{{{{snippet.{skill.display_name}}}}}" in body: # case-sensitive substring + references.append(entity) + +# 3. Show references to the user; default to soft-disable when any are found. ``` Key points: -- **Identify the references *before* deletion.** Once the Skill is gone, you can't always resolve its `skill_id` back to its `display_name`; record the agents while the Skill still exists. -- **Mirror the agent's `skills[]` entry shape.** Entries may be plain id strings or objects with `id`/`key` fields depending on the agent schema version. Always pattern-match what `get_agent` returned and write back the same shape. -- **Verify every `update_agent`.** A failed prune leaves a permanent orphan. -- **Don't blanket-update all agents** — only those that actually had the reference. Touching unrelated agents inflates the audit log and can race with other authors' edits. 
+- **Match `display_name` exactly.** The placeholder is case-sensitive; substring-matching `display_name` casually can produce false positives if names overlap. +- **`search_entities` is not exhaustive.** It surfaces what the orq workspace indexes; downstream consumers (external apps that pull prompts via the API and inline them themselves) are invisible to it. If the team has a synced repo of prompts, grep there too. +- **Soft-disable first.** Setting `enabled: false` is reversible; `delete_skill` is not. Disabling preserves the Skill so a missed reference can be diagnosed by enabling it again. ### When this gets fixed -When `delete_skill` returns a response that includes the list of agents it pruned (or the docs explicitly state auto-prune is now in place), the workaround can be removed. Until then, treat the workaround as part of the contract of `delete_skill`. +When the platform either (a) returns a list of identified references on `delete_skill`, or (b) refuses delete while references exist, the workaround can be relaxed to "trust the API." Until then, the reference scan is part of the contract of `delete_skill`. --- -## Empty `version` on migrated Skills +## Renaming `display_name` silently breaks `{{snippet.}}` references -**Status:** Handle defensively +**Status:** Same root cause as delete; same workaround ### Symptom -Skills created through the Snippet→Skill migration may have an empty `version` field (empty string or `null`) instead of an integer. - -Programmatic readers that assume non-empty / numeric `version` will crash, mis-sort, or skip these Skills entirely. +`update_skill` accepts a new `display_name`. The Skill is renamed in place. Every prompt or agent instruction that referenced the old name via `{{snippet.}}` continues to render, but now resolves to nothing — the same silent-empty failure mode as a deleted Skill. ### Workaround -Treat `version` as **optional / valid-when-empty**, not an error. 
+Treat a rename as if it were a delete + create: -```text -version = skill.get("version") or None -# display as "(unset)" or "—" in UI -# do not crash on string ops; do not assume integer semver -``` +1. Run the same reference scan as the delete workflow. +2. Show the user the references and ask whether to: + - Cancel the rename, OR + - Proceed with the rename AND fan out updates to every reference in the same session, OR + - Proceed with the rename AND accept the silent breakage (rare; only OK when the scan was exhaustive and empty). -- **When reading:** coerce empty → `None` (or your sentinel). -- **When displaying:** show `(unset)` rather than blank — surfaces the migration footprint so users know which Skills came through the migration. -- **When updating:** never send `version` in `update_skill` — it is stamped server-side. A successful update typically populates `version` going forward. -- **When filtering / sorting:** never assume `version` is a comparable integer. Treat `None` consistently (last, first, or excluded — pick one and stick to it). -- **Audit pattern:** paginate `list_skills` and filter where `version is None` to surface the migration backlog so it can be backfilled by a workspace owner. +--- -### When this gets fixed +## A2A `AgentCard.skills` is not a list of Skill references + +**Status:** Naming overlap — not a bug + +### Symptom + +When inspecting an agent via `get_agent`, the response includes a `skills[]` array. This is **not** a list of platform Skill ids. It's the AI-generated A2A `AgentCardSkill[]` array — capability descriptors generated from the agent's role/description/instructions for the A2A AgentCard. + +### Why it matters -When the docs say Snippet-migrated Skills are backfilled with stamped `version` values, the defensive coercion can be removed. Until then, keep it on. +- Don't try to "wire" a platform Skill to an agent by appending its id to `agent.skills[]`. 
That field is regenerated from the agent manifest and your edit will be lost (or silently ignored). +- Don't try to "find agents that reference a Skill" by scanning `agent.skills[]` for the Skill's id. The field doesn't carry that information. +- The actual relationship is **text references** to `{{snippet.}}` inside `agent.instructions`. To find consumers, run the reference scan above. --- diff --git a/tests/skills.md b/tests/skills.md index ab9b17b..32c9d62 100644 --- a/tests/skills.md +++ b/tests/skills.md @@ -147,37 +147,47 @@ Requires `setup.md` to have run first (seed data for `run-experiment` test). - Ask: "Show me the Skills in my workspace" - Verify: calls `list_skills` (or REST `GET /v2/skills` fallback) and **paginates to completion** (cursor-based — `limit`, `starting_after`, `ending_before`) - Verify: any user-requested filter (project, tags, name substring) is applied **client-side** after pagination — does NOT pass `project_id`/`tags`/`q` to `list_skills` (the endpoint does not accept them) -- Verify: presents `display_name`, project scope, `tags`, `path`, and `version` per Skill -- Verify: does NOT crash on Skills with empty `version` (Snippet→Skill migration leftover) — surfaces as `(unset)` +- Verify: presents `display_name`, project scope, `tags`, `path`, and `enabled` state per Skill +- Verify: does NOT claim a `version` field on the Skill (none exists in the schema) +- Verify: does NOT compute reference counts eagerly — defers them as on-demand work ### Scenario 2: Create skill (authoring guidance) - Ask: "Create a Skill called `extract-receipt-fields`" - Verify Phase 3: asks for `description`, `tags`, `project_id` (default project-scoped, not workspace-wide), and `path` -- Verify: rejects or flags descriptions that don't start with "Use when…" or describe a trigger - Verify: warns if the proposed `instructions` contain `+NEVER+` / "you MUST refuse" prose constraints and recommends an MCP tool gate instead -- Verify: checks name uniqueness via 
`POST /v2/skills:checkDisplayNameAvailability` when available, with a paginated `list_skills` scan only as a fallback -- Verify: `create_skill` payload uses `display_name` and `instructions` (not `name` / `body` / `doc`) +- Verify: does NOT call a fictional `:checkDisplayNameAvailability` endpoint — instead, calls `create_skill` and handles `AlreadyExists` if the name is taken +- Verify: `create_skill` payload uses `display_name` and `instructions` (not `name` / `body` / `doc`); includes `enabled` only if user requested non-default +- Verify: echoes back the consumption pattern after create — `{{snippet.}}`, NOT `{{skill.<...>}}` -### Scenario 3: Delete skill — orphan handling +### Scenario 3: Delete skill — reference scan -- Provide context: a Skill that's referenced by 2 agents +- Provide context: a Skill referenced by 2 prompts via `{{snippet.}}` - Ask: "Delete this Skill" -- Verify: identifies referencing agents BEFORE deletion (via `search_entities(type: "agent")` plus per-agent `get_agent` fanout if needed) -- Verify: warns user about the orphan-reference behavior (referencing agents are not auto-pruned by `delete_skill`) -- Verify: gets explicit consent for delete, then a SECOND explicit consent for the orphan-cleanup pass -- Verify: never auto-prunes `agent.skills[]` without consent -- Verify: after consent, calls `get_agent` + `update_agent` per agent, mirroring the existing entry shape (string `skill_id` vs object), and verifies each prune -- Verify: final report lists what was deleted and what was pruned (or skipped) +- Verify: runs a reference scan BEFORE deletion (`search_entities` then per-entity body fetch with `get_deployment` / `get_agent` / `get_skill`, substring-matching `{{snippet.}}` case-sensitively) +- Verify: surfaces the references found and offers `enabled: false` (soft disable) as the default first step +- Verify: does NOT call `update_agent` to "prune" `agent.skills[]` — that field is unrelated A2A AgentCard metadata +- Verify: never 
auto-deletes; always requires explicit consent after the user has seen the reference list +- Verify: final report lists what was deleted (or disabled) and any references the user should manually update -### Scenario 4: Update skill (no blind overwrite) +### Scenario 4: Update skill (no blind overwrite, rename warning) - Ask: "Update the description of the `refund-policy` Skill" - Verify: calls `get_skill(skill_id=...)` first, shows the user the current state - Verify: only patches the changed field — does not echo back unchanged `tags`/`instructions` -- Verify: does NOT pass `version` in `update_skill` (it's stamped server-side) +- Verify: does NOT pass `version` in `update_skill` (no such field on the schema) - Verify: confirms the diff with the user before `update_skill` -- Verify Phase 4: when rewriting `instructions`, applies clarity heuristics from `optimize-prompt` (does not blindly delegate — Skill `instructions` are typically shorter than a full system prompt) +- Then ask: "Rename `refund-policy` to `refund-policy-eu`" +- Verify: warns that renaming `display_name` silently breaks every `{{snippet.refund-policy}}` reference and runs the reference scan before sending the rename +- Verify: when rewriting `instructions`, applies clarity heuristics from `optimize-prompt` rather than blindly delegating + +### Scenario 5: Failure-mode handling + +- Ask: "Create a Skill called `refund-policy`" (in a workspace that already has one) +- Verify: handles `AlreadyExists` gracefully — surfaces the conflicting Skill and offers either a renamed create or `update_skill` +- Ask: "Disable the `refund-policy` Skill" +- Verify: routes to Phase 4 with `enabled: false`, NOT to Phase 5 (delete) +- Verify: explains that disable is reversible and references stop resolving until re-enabled --- From cc3e82cf620477955e9dae36a93d5cbb4aadc769 Mon Sep 17 00:00:00 2001 From: Karina Barbara Kalicka-Molin Date: Fri, 15 May 2026 15:28:51 +0200 Subject: [PATCH 4/5] fix(manage-skills cmd): add 
missing name: frontmatter field --- commands/manage-skills.md | 1 + 1 file changed, 1 insertion(+) diff --git a/commands/manage-skills.md b/commands/manage-skills.md index 4bd26c0..277dc2a 100644 --- a/commands/manage-skills.md +++ b/commands/manage-skills.md @@ -1,4 +1,5 @@ --- +name: manage-skills description: Manage orq.ai Skills — list, get, create, update, disable, or delete Skills (the platform entity, formerly Snippets) and find the prompts/agents that reference them argument-hint: [list|get|create|update|disable|delete] [name-or-id] allowed-tools: AskUserQuestion, mcp__orq-workspace__list_skills, mcp__orq-workspace__get_skill, mcp__orq-workspace__create_skill, mcp__orq-workspace__update_skill, mcp__orq-workspace__delete_skill, mcp__orq-workspace__search_entities, mcp__orq-workspace__get_deployment, mcp__orq-workspace__get_agent From e1b9f40878abd892232a8caab1e44aba25dc2753 Mon Sep 17 00:00:00 2001 From: Karina Barbara Kalicka-Molin Date: Fri, 15 May 2026 15:56:28 +0200 Subject: [PATCH 5/5] docs(manage-skills): add production-readiness notice + renderer-wiring caveat Verified end-to-end against orquesta-web that two soft claims in the prior commit needed harder hedges: 1. The /v2/skills CRUD API and the *_skill MCP tools live only on the feature branch origin/thedevtoni/snippets-to-skills (a677849318, 19c26518f4, eedf87c1ca, c13f227567). They are NOT on origin/main yet. A workspace pinned to a release without that branch merged will not have list_skills / create_skill / etc. Added a preflight check at the top of Prerequisites that probes list_skills once and falls back to managing the entity under its legacy name (Prompt Snippet via /v2/prompts/snippets) when the new tools are missing. 2. The {{snippet.}} renderer reads from the PROMPT_SNIPPETS_KV Redis cache (libs/go/response-executor/snippets.go, libs/platform/prompts/prompts-manager/src/lib/prompts-manager.ts). 
That cache is populated only by the legacy
apps/workspaces-api/src/handlers/prompts-snippets/* handlers. The new
apps/platform-api/skills/connect_routes.go writes to MongoDB, records a
semantic-version activity, and publishes a NATS entity event -- but
nothing in the codebase consumes that event to update the snippet KV.
The only NATS consumer
(apps/responses-api/consumers/entity_event_consumer.go) handles
agent.deleted only. So a Skill created via the new API may not be
reachable via {{snippet.}} until a bridge lands.

Added a "Renderer wiring lag" caveat in known-caveats.md plus a
test-render verification step, and softened the enabled-field row in
the SKILL.md field reference to flag the same gap.

Both findings come from tracing the resolver code; if the wiring lands
later (NATS subscriber that mirrors skill.* events into the KV, or the
resolver moved to read MongoDB directly), the caveats can be removed.
---
 skills/manage-skills/SKILL.md                 | 22 ++++++++++++----
 .../manage-skills/resources/known-caveats.md  | 26 +++++++++++++++++++
 2 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/skills/manage-skills/SKILL.md b/skills/manage-skills/SKILL.md
index 23a3a93..3d4442a 100644
--- a/skills/manage-skills/SKILL.md
+++ b/skills/manage-skills/SKILL.md
@@ -12,7 +12,16 @@ allowed-tools: Bash, Read, Write, Edit, Grep, Glob, WebFetch, Task, AskUserQuest
 
 # Manage Skills
 
-You are an **orq.ai Skills lifecycle specialist**. Your job is the full CRUD workflow for the **Skills entity on the orq.ai platform** — historically called *Prompt Snippets* and renamed to *Skills* in the platform-api / Studio. Skills are modular, reusable instruction blocks that get inlined into prompts and agent instructions via the `{{snippet.}}` template placeholder. (The placeholder kept the legacy `snippet.` prefix for backwards compatibility.)
+You are an **orq.ai Skills lifecycle specialist**. Your job is the full CRUD workflow for the **Skills entity on the orq.ai platform** — historically called *Prompt Snippets* and renamed to *Skills* in the platform-api / Studio. Skills are modular, reusable instruction blocks intended to be inlined into prompts and agent instructions via the `{{snippet.}}` template placeholder. (The placeholder kept the legacy `snippet.` prefix for backwards compatibility.)
+
+## Production-readiness notice (verify before relying on this skill)
+
+The Skills entity (`/v2/skills` REST + `*_skill` MCP tools) is delivered on a backend feature branch and may not be available in every workspace yet. **Run the preflight check** in [Prerequisites](#prerequisites) before any phase. If the MCP `*_skill` tools or the REST endpoints are missing, fall back to managing Prompt Snippets via the legacy `/v2/prompts/snippets` endpoints — the *entity is the same*, just under the older name.
+
+Two known wiring gaps to surface to the user when relevant:
+
+1. **Snippet→Skill migration is one-way and asynchronous.** Existing Prompt Snippets are migrated to Skills via a backend cronjob; Skills created via the new API are *not* back-propagated to the snippet representation.
+2. **Renderer wiring may lag.** The `{{snippet.}}` template resolver reads from a Redis cache that has historically been populated by the legacy snippet handlers. Whether Skills created through the new API land in that cache depends on whether the entity-event subscriber that bridges them is live in the user's workspace. **Always verify in a test prompt that a newly created Skill actually renders before promoting it to production.**
 
 ## Disambiguation: which "Skill" are we talking about?
 
@@ -111,7 +120,7 @@ tagged_skills = [s for s in all_skills if "policy" in s.tags]
 | `path` | create / update / read | Finder-style location, e.g. `Default/Skills` or `cs/policies`. Defaults to project's default skill folder. |
 | `project_id` | create / update / read | Optional — omit for workspace-wide. |
 | `instructions` | create / update / read | The actual Skill body — modular markdown that gets inlined wherever the Skill is referenced. |
-| `enabled` | create / update / read | Boolean. When `false`, `{{snippet.}}` references resolve to empty/skipped (verify behavior in your workspace). Useful as a soft-disable before delete. |
+| `enabled` | create / update / read | Boolean (default `true`). Whether `{{snippet.}}` references for a disabled Skill render to empty/pass-through depends on workspace renderer wiring (verified at the time of writing: the resolver reads from the legacy snippet KV cache, which has no notion of `enabled`; behavior may change). Treat `enabled: false` as a soft-disable signal in the API and audit log; verify the actual render effect before relying on it. |
 | `skill_id` | read / update / delete | Server-generated id. **The list/get response surfaces it as `id`** but the update/delete inputs take it as `skill_id`. Same value. |
 | `workspace_id` | read only | Audit. |
 | `created_at`, `updated_at`, `created_by_id`, `updated_by_id` | read only | Audit metadata. |
@@ -128,9 +137,12 @@ tagged_skills = [s for s in all_skills if "policy" in s.tags]
 
 ## Prerequisites
 
-- The orq.ai MCP server is connected (run the `quickstart` skill / `/orq:quickstart` to verify in Claude Code, or the equivalent onboarding flow in your assistant)
-- `ORQ_API_KEY` is set
-- The user knows which **project** the Skill belongs to (run `search_directories` if not)
+- The orq.ai MCP server is connected (run the `quickstart` skill / `/orq:quickstart` to verify in Claude Code, or the equivalent onboarding flow in your assistant).
+- `ORQ_API_KEY` is set.
+- The user knows which **project** the Skill belongs to (run `search_directories` if not).
+- **Preflight: confirm the Skills API is available.** Try `list_skills` once at session start. If the tool is unknown OR returns "method not found" against `/v2/skills`, the workspace's backend doesn't expose the new entity yet. Two options:
+  1. Tell the user and fall back to managing the entity under its legacy name (Prompt Snippet via `/v2/prompts/snippets`).
+  2. Ask the user whether to proceed anyway against any partial endpoints they have.
 
 ---
 
diff --git a/skills/manage-skills/resources/known-caveats.md b/skills/manage-skills/resources/known-caveats.md
index f87cb0b..a80932b 100644
--- a/skills/manage-skills/resources/known-caveats.md
+++ b/skills/manage-skills/resources/known-caveats.md
@@ -4,6 +4,32 @@ Active platform behaviors and authoring anti-patterns to handle while working wi
 
 ---
 
+## Renderer wiring lag — verify in a test prompt before relying on a new Skill
+
+**Status:** Verify per workspace
+
+### Symptom
+
+The `{{snippet.}}` template placeholder is resolved by a Redis-backed snippet cache (`PROMPT_SNIPPETS_KV`). Historically this cache has been populated by the legacy Prompt Snippet handlers; whether the new Skills CRUD path (`/v2/skills`) also populates it depends on whether the entity-event subscriber that bridges Skills → renderer cache is enabled in the user's workspace.
+
+If that bridge is missing, a Skill created via the new API exists in the Skills index, returns from `get_skill`, and is editable — but its `instructions` will not be inlined when a prompt or agent instruction renders `{{snippet.}}`.
+
+### Workaround
+
+After creating or substantively editing a Skill, run a single test render before broadcasting the Skill to other consumers:
+
+1. Create a one-off prompt/deployment/agent that contains only `{{snippet.}}` (and optionally a delimiter).
+2. Invoke it.
+3. Confirm the rendered output contains the Skill's `instructions`.
+
+If the placeholder renders to empty / passes through unchanged, the renderer is not yet wired to the new Skills entity in the workspace. Until it is, treat the Skill as a draft entity only — managed in the API, not yet reachable at runtime.
+
+### When this gets resolved
+
+When a backend change-stream consumer or NATS subscriber lands that mirrors `skill.created` / `skill.updated` / `skill.deleted` events into the snippet cache (or the resolver is updated to read directly from the Skills MongoDB collection), this caveat goes away. Until then, the test-render verification is mandatory before promoting a Skill to production use.
+
+---
+
 ## `delete_skill` does not scrub `{{snippet.}}` references
 
 **Status:** Manual reference scan required