Add system-prompt-optimizer and synthetic-dataset-v2 skills #1

Merged
Baukebrenninkmeijer merged 2 commits into main from aminaakhmedova/res-355-implement-mcp-skills-in-orqkit-for-mcp-usage
Mar 13, 2026
@currentlycodinng
Collaborator

Summary

  • RES-398: New system-prompt-optimizer skill — automated prompt optimization using orq.ai's PO1 (analyzer) and PO2 (rewriter) deployments with Quick Optimize and Advanced Optimize workflows
  • RES-399: New generate-synthetic-dataset-v2 skill — deployment-based synthetic data generation with two modes: create from scratch via description, and expand existing datasets with few-shot examples
  • Auto-generated artifacts updated (AGENTS.md, README.md)

Test plan

  • Verify ./scripts/publish.sh --check passes (no stale artifacts)
  • Review system-prompt-optimizer/SKILL.md for PO1/PO2 schema accuracy against deployed configurations
  • Review generate-synthetic-dataset-v2/SKILL.md for deployment I/O alignment with RES-33
  • Verify both skills are discoverable via MCP tool listing
  • Test Quick Optimize workflow end-to-end on a sample prompt
  • Test Mode 1 (create from scratch) dataset generation

🤖 Generated with Claude Code

Implements RES-398 (automated prompt optimization via PO1/PO2 deployments)
and RES-399 (deployment-based synthetic data generation with create and
expand modes) as MCP-compatible agent skills.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@linear

linear Bot commented Mar 10, 2026

RES-355 Implement MCP skills in orqkit for mcp usage

As reference, look at https://github.com/huggingface/skills

Ticket to transform into skills:

Comment thread skills/system-prompt-optimizer/SKILL.md Outdated
### 3. Preserve Intent
The optimizer should improve how the prompt is expressed, not change what it does. Always verify the optimized prompt preserves the original intent, persona, and constraints.

### 4. Validate with Experiments
Collaborator

This can't be done with the current MCP. Should we omit it?

Comment thread skills/system-prompt-optimizer/SKILL.md Outdated

When presenting PO1/PO2 suggestions to the user, reference which guideline each suggestion targets to help them understand the reasoning.

### Model Configuration
Collaborator

this feels unnecessary to include

Comment thread skills/system-prompt-optimizer/SKILL.md Outdated

# System Prompt Optimizer

Automated prompt optimization using orq.ai's PO1 (analyzer) and PO2 (rewriter) deployments — get AI-powered analysis and rewriting of system prompts without manual trace analysis.
Collaborator

As mentioned, we should do this all within the coding agent, and not use deployments externally for this. Simplifies the solution and imo makes more sense.

Comment thread skills/system-prompt-optimizer/SKILL.md Outdated
- `run-experiment` — validate optimized prompts with A/B experiments
- `manage-deployment` — configure deployments with the optimized prompt

## When to use
Collaborator

not sure we need this. It's invoked by the user themselves.

Comment thread skills/system-prompt-optimizer/SKILL.md Outdated
allowed-tools: Bash, Read, Write, Edit, Grep, Glob, WebFetch, Task, AskUserQuestion, mcp__linear-server__*, orq*
---

# System Prompt Optimizer
Collaborator

PO1 is essentially an analyzer, and PO2 optimizes a prompt given some instructions or suggestions. They are meant to be used one after another.
In the context of this skill, let's just make it a 2-step process. Step 1 can be skipped if the user has already given specific instructions on how they want the prompt changed.
Examples
/prompt-optimizer - start with analysis, applies the findings
/prompt-optimizer make this way more assertive - analysis not needed, instructions already given. Start directly with step 2.
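
The routing described in this comment could be sketched roughly as follows. This is only an illustration: the `plan_steps` helper, the step names `analyze` and `rewrite`, and the rule that any text after the command counts as user instructions are assumptions, not part of the skill as written.

```python
from typing import List


def plan_steps(invocation: str, command: str = "/prompt-optimizer") -> List[str]:
    """Return the steps to run for a slash-command invocation.

    Hypothetical helper: any text after the command is treated as explicit
    user instructions, which makes the analysis step unnecessary.
    """
    text = invocation.strip()
    if not text.startswith(command):
        raise ValueError(f"expected an invocation starting with {command!r}")
    instructions = text[len(command):].strip()
    if instructions:
        # Instructions already given: start directly with step 2.
        return ["rewrite"]
    # No instructions: analyze first, then apply the findings.
    return ["analyze", "rewrite"]


print(plan_steps("/prompt-optimizer"))
print(plan_steps("/prompt-optimizer make this way more assertive"))
```

Keeping the decision in one small function makes the "skip step 1" rule easy to review and change, for instance if the skill later needs a stricter test for what counts as a specific instruction.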

---
name: generate-synthetic-dataset-v2
description: Deployment-based synthetic data generation — create datasets from descriptions or expand existing datasets with few-shot examples
allowed-tools: Bash, Read, Write, Edit, Grep, Glob, WebFetch, Task, AskUserQuestion, mcp__linear-server__*, orq*
Collaborator

this needs access to linear?
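
A check like the one this comment implies could be sketched as below. It is a minimal illustration that assumes the `allowed-tools` frontmatter value is a plain comma-separated list; `parse_allowed_tools` and the flagging rule are hypothetical and not part of the repository's tooling.

```python
from typing import List


def parse_allowed_tools(line: str) -> List[str]:
    """Split an `allowed-tools:` frontmatter line into tool patterns."""
    key, _, value = line.partition(":")
    if key.strip() != "allowed-tools":
        raise ValueError("not an allowed-tools line")
    return [tool.strip() for tool in value.split(",") if tool.strip()]


line = (
    "allowed-tools: Bash, Read, Write, Edit, Grep, Glob, WebFetch, Task, "
    "AskUserQuestion, mcp__linear-server__*, orq*"
)
tools = parse_allowed_tools(line)

# Hypothetical review rule: flag tool patterns unrelated to the skill itself.
flagged = [tool for tool in tools if tool.startswith("mcp__linear-server__")]
print(flagged)
```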


### orq MCP Tools

Use the orq MCP server (`https://my.orq.ai/v2/mcp`) as the primary interface. For operations not yet available via MCP, use the HTTP API as fallback.
Collaborator

perhaps also have it use the api if the data is really big. mcp is not ideal for that.
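
One way to express this size-based fallback is sketched below. The 256 KiB cutoff, the `choose_transport` name, and the row fields are assumptions for illustration; neither the skill nor the comment specifies a concrete limit.

```python
import json
from typing import Any, Dict, List

# Assumed cutoff for illustration; the source gives no concrete number.
MAX_MCP_PAYLOAD_BYTES = 256 * 1024


def choose_transport(
    rows: List[Dict[str, Any]], limit: int = MAX_MCP_PAYLOAD_BYTES
) -> str:
    """Pick "mcp" for small payloads and "http" for large ones."""
    size = len(json.dumps(rows).encode("utf-8"))
    return "mcp" if size <= limit else "http"


small = [{"input": "hi", "expected_output": "hello"}]
large = [{"input": "x" * 1000} for _ in range(1000)]
print(choose_transport(small))
print(choose_transport(large))
```

Measuring the serialized payload rather than the row count keeps the rule meaningful for datasets with a few very long rows as well as many short ones.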

Comment thread agents/AGENTS.md Outdated
- evaluate-rag -> "skills/evaluate-rag/SKILL.md"
- feedback-loop -> "skills/feedback-loop/SKILL.md"
- generate-synthetic-dataset -> "skills/generate-synthetic-dataset/SKILL.md"
- generate-synthetic-dataset-v2 -> "skills/generate-synthetic-dataset-v2/SKILL.md"
Collaborator

As mentioned, let's see if we can deprecate v1 so we only have one version. If you think there are things in v1 that are missing from v2, please add them.

…thetic dataset skills

- Rename system-prompt-optimizer → prompt-optimizer with agent-native
  analysis/rewriting (no external deployments)
- Merge generate-synthetic-dataset v1+v2 into single skill with 3 modes
  (structured, quick, expand)
- Remove deployment dependencies per reviewer feedback
- Add HTTP API bulk fallback for large datasets
- Update AGENTS.md, README.md, and optimize-prompt companion refs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@arianpasquali
Collaborator

@Baukebrenninkmeijer I made the corrections based on your feedback and consolidated everything into one prompt-optimizer skill and one dataset generation skill.

I will leave the other structural changes for another PR.

Baukebrenninkmeijer merged commit 7881ceb into main on Mar 13, 2026
1 check failed
arianpasquali added a commit that referenced this pull request Apr 2, 2026
- Remove invalid `metadata` field from memory creation examples (issue #1)
- Fix memory document field name from `content` to `text` (issue #2)
- Add required `path` field to tool creation example (issue #3)
- Fix KB search param from `limit` to `top_k` (issue #4)
- Correct MCP URL to `https://my.orq.ai/v2/mcp` in run-experiment and
  generate-synthetic-dataset (issue #5)
- Replace non-existent `setup-observability` with `analyze-trace-failures`
  in compare-agents companion skills (issue #6)
- Update stale TOC entry in knowledge-base-management (issue #7)
- Add missing `path` and `type` to KB creation in run-experiment (issue #8)
- Add explicit `-X POST` to KB search curl commands (issue #9)
- Fix "two things" wording when listing three items (issue #10)
- Standardize agent model format to object style (issue #11)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 participants