Value-augmented engineering workflows for safer AI coding agents.
Safe Agent Skills is a fork of Agent Skills focused on safer agentic behavior. It packages production-grade software engineering workflows with value-augmented rules: each hard rule states what the agent must do, why the rule exists, what value it protects, and what misuse or shortcut it prevents.
The goal is not just to make agents faster at coding. The goal is to make them more reliable under ambiguity: check the right skill before acting, preserve human checkpoints, verify with evidence, avoid hidden delegation chains, and resist rationalizing around safety or quality gates.
These skills cover the full development lifecycle from idea refinement to launch, while making the reasoning behind the rules visible enough for agents to follow them consistently.
DEFINE       PLAN         BUILD        VERIFY       REVIEW       SHIP
┌──────┐     ┌──────┐     ┌──────┐     ┌──────┐     ┌──────┐     ┌──────┐
│ Idea │ ===>│ Spec │ ===>│ Code │ ===>│ Test │ ===>│  QA  │ ===>│  Go  │
│Refine│     │ PRD  │     │ Impl │     │Debug │     │ Gate │     │ Live │
└──────┘     └──────┘     └──────┘     └──────┘     └──────┘     └──────┘
/spec        /plan        /build       /test        /review      /ship
Seven slash commands map to the development lifecycle. Each one activates the right skills automatically.
| What you're doing | Command | Key principle |
|---|---|---|
| Define what to build | /spec | Spec before code |
| Plan how to build it | /plan | Small, atomic tasks |
| Build incrementally | /build | One slice at a time |
| Prove it works | /test | Tests are proof |
| Review before merge | /review | Improve code health |
| Simplify the code | /code-simplify | Clarity over cleverness |
| Ship to production | /ship | Faster is safer |
Skills also activate automatically based on what you're doing — designing an API triggers api-and-interface-design, building UI triggers frontend-ui-engineering, and so on.
Claude Code (recommended)
Marketplace install:
/plugin marketplace add xrenya/safe-agent-skills
/plugin install agent-skills@xrenya-safe-agent-skills
SSH errors? The marketplace clones repos via SSH. If you don't have SSH keys set up on GitHub, either add your SSH key or use the full HTTPS URL to force HTTPS cloning:
/plugin marketplace add https://github.com/xrenya/safe-agent-skills.git
/plugin install agent-skills@xrenya-safe-agent-skills
Local / development:
git clone https://github.com/xrenya/safe-agent-skills.git
claude --plugin-dir /path/to/agent-skills
Cursor
Copy any SKILL.md into .cursor/rules/, or reference the full skills/ directory. See docs/cursor-setup.md.
Gemini CLI
Install as native skills for auto-discovery, or add to GEMINI.md for persistent context. See docs/gemini-cli-setup.md.
Install from the repo:
gemini skills install https://github.com/xrenya/safe-agent-skills.git --path skills
Install from a local clone:
gemini skills install ./agent-skills/skills/
Windsurf
Add skill contents to your Windsurf rules configuration. See docs/windsurf-setup.md.
OpenCode
Uses agent-driven skill execution via AGENTS.md and the skill tool.
GitHub Copilot
Use agent definitions from agents/ as Copilot personas and skill content in .github/copilot-instructions.md. See docs/copilot-setup.md.
Kiro IDE & CLI
Skills for Kiro reside under ".kiro/skills/" and can be stored at the project or global level. Kiro also supports Agents.md. See the Kiro docs at https://kiro.dev/docs/skills/
Codex / Other Agents
Skills are plain Markdown - they work with any agent that accepts system prompts or instruction files. See docs/getting-started.md.
The commands above are the entry points. Under the hood, they activate these 20 skills — each one a structured workflow with steps, verification gates, and anti-rationalization tables. You can also reference any skill directly.
| Skill | What It Does | Use When |
|---|---|---|
| idea-refine | Structured divergent/convergent thinking to turn vague ideas into concrete proposals | You have a rough concept that needs exploration |
| spec-driven-development | Write a PRD covering objectives, commands, structure, code style, testing, and boundaries before any code | Starting a new project, feature, or significant change |
| Skill | What It Does | Use When |
|---|---|---|
| planning-and-task-breakdown | Decompose specs into small, verifiable tasks with acceptance criteria and dependency ordering | You have a spec and need implementable units |
| Skill | What It Does | Use When |
|---|---|---|
| incremental-implementation | Thin vertical slices - implement, test, verify, commit. Feature flags, safe defaults, rollback-friendly changes | Any change touching more than one file |
| test-driven-development | Red-Green-Refactor, test pyramid (80/15/5), test sizes, DAMP over DRY, Beyonce Rule, browser testing | Implementing logic, fixing bugs, or changing behavior |
| context-engineering | Feed agents the right information at the right time - rules files, context packing, MCP integrations | Starting a session, switching tasks, or when output quality drops |
| source-driven-development | Ground every framework decision in official documentation - verify, cite sources, flag what's unverified | You want authoritative, source-cited code for any framework or library |
| frontend-ui-engineering | Component architecture, design systems, state management, responsive design, WCAG 2.1 AA accessibility | Building or modifying user-facing interfaces |
| api-and-interface-design | Contract-first design, Hyrum's Law, One-Version Rule, error semantics, boundary validation | Designing APIs, module boundaries, or public interfaces |
| Skill | What It Does | Use When |
|---|---|---|
| browser-testing-with-devtools | Chrome DevTools MCP for live runtime data - DOM inspection, console logs, network traces, performance profiling | Building or debugging anything that runs in a browser |
| debugging-and-error-recovery | Five-step triage: reproduce, localize, reduce, fix, guard. Stop-the-line rule, safe fallbacks | Tests fail, builds break, or behavior is unexpected |
| Skill | What It Does | Use When |
|---|---|---|
| code-review-and-quality | Five-axis review, change sizing (~100 lines), severity labels (Nit/Optional/FYI), review speed norms, splitting strategies | Before merging any change |
| code-simplification | Chesterton's Fence, Rule of 500, reduce complexity while preserving exact behavior | Code works but is harder to read or maintain than it should be |
| security-and-hardening | OWASP Top 10 prevention, auth patterns, secrets management, dependency auditing, three-tier boundary system | Handling user input, auth, data storage, or external integrations |
| performance-optimization | Measure-first approach - Core Web Vitals targets, profiling workflows, bundle analysis, anti-pattern detection | Performance requirements exist or you suspect regressions |
| Skill | What It Does | Use When |
|---|---|---|
| git-workflow-and-versioning | Trunk-based development, atomic commits, change sizing (~100 lines), the commit-as-save-point pattern | Making any code change (always) |
| ci-cd-and-automation | Shift Left, Faster is Safer, feature flags, quality gate pipelines, failure feedback loops | Setting up or modifying build and deploy pipelines |
| deprecation-and-migration | Code-as-liability mindset, compulsory vs advisory deprecation, migration patterns, zombie code removal | Removing old systems, migrating users, or sunsetting features |
| documentation-and-adrs | Architecture Decision Records, API docs, inline documentation standards - document the why | Making architectural decisions, changing APIs, or shipping features |
| shipping-and-launch | Pre-launch checklists, feature flag lifecycle, staged rollouts, rollback procedures, monitoring setup | Preparing to deploy to production |
Pre-configured specialist personas for targeted reviews:
| Agent | Role | Perspective |
|---|---|---|
| code-reviewer | Senior Staff Engineer | Five-axis code review with "would a staff engineer approve this?" standard |
| test-engineer | QA Specialist | Test strategy, coverage analysis, and the Prove-It pattern |
| security-auditor | Security Engineer | Vulnerability detection, threat modeling, OWASP assessment |
Quick-reference material that skills pull in when needed:
| Reference | Covers |
|---|---|
| testing-patterns.md | Test structure, naming, mocking, React/API/E2E examples, anti-patterns |
| security-checklist.md | Pre-commit checks, auth, input validation, headers, CORS, OWASP Top 10 |
| performance-checklist.md | Core Web Vitals targets, frontend/backend checklists, measurement commands |
| accessibility-checklist.md | Keyboard nav, screen readers, visual design, ARIA, testing tools |
Every skill follows a consistent anatomy:
┌─────────────────────────────────────────────────┐
│  SKILL.md                                       │
│                                                 │
│  ┌─ Frontmatter ─────────────────────────────┐  │
│  │ name: lowercase-hyphen-name               │  │
│  │ description: Guides agents through [task].│  │
│  │              Use when…                    │  │
│  └───────────────────────────────────────────┘  │
│  Overview         → What this skill does        │
│  When to Use      → Triggering conditions       │
│  Process          → Step-by-step workflow       │
│  Value Rules      → Rule + value + misuse guard │
│  Rationalizations → Excuses + rebuttals         │
│  Red Flags        → Signs something's wrong     │
│  Verification     → Evidence requirements       │
└─────────────────────────────────────────────────┘
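The anatomy above lends itself to a mechanical check. The following is a minimal sketch, not part of the repository: it assumes the frontmatter is a leading block delimited by `---` lines and that each section appears as a Markdown heading with exactly the name shown in the diagram. The function name `check_skill` is hypothetical.

```python
import re

# Section names come from the anatomy diagram; treating them as Markdown
# headings is an assumption about how SKILL.md files are written.
REQUIRED_SECTIONS = [
    "Overview", "When to Use", "Process", "Value Rules",
    "Rationalizations", "Red Flags", "Verification",
]


def check_skill(text: str) -> list[str]:
    """Return a list of problems found in a SKILL.md body (empty means OK)."""
    problems = []

    # Frontmatter: assumed to be a leading block between two '---' lines.
    match = re.match(r"\A---\n(.*?)\n---\n", text, flags=re.DOTALL)
    frontmatter = match.group(1) if match else ""
    if match is None:
        problems.append("missing frontmatter")

    for key in ("name", "description"):
        if not re.search(rf"^{key}:\s*\S", frontmatter, flags=re.MULTILINE):
            problems.append(f"frontmatter missing '{key}'")

    # The diagram asks for a lowercase-hyphen name.
    name = re.search(r"^name:\s*(\S+)\s*$", frontmatter, flags=re.MULTILINE)
    if name and not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name.group(1)):
        problems.append("name is not lowercase-hyphen")

    # Collect every heading title, regardless of heading level.
    headings = set(re.findall(r"^#{1,6}\s+(.+?)\s*$", text, flags=re.MULTILINE))
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section '{section}'")
    return problems
```

An empty result means the file has the expected shape; it says nothing about whether the content of each section is any good.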
Key design choices:
- Process, not prose. Skills are workflows agents follow, not reference docs they read. Each has steps, checkpoints, and exit criteria.
- Value-augmented rules. Hard instructions state the behavior, the value protected, why the rule exists, and the misuse it prevents, so agents understand compliance instead of treating rules as arbitrary text.
- Anti-rationalization. Every skill includes a table of common excuses agents use to skip steps (e.g., "I'll add tests later") with documented counter-arguments.
- Verification is non-negotiable. Every skill ends with evidence requirements - tests passing, build output, runtime data. "Seems right" is never sufficient.
- Progressive disclosure. The SKILL.md is the entry point. Supporting references load only when needed, keeping token usage minimal.
agent-skills/
├── skills/ # 20 core skills (SKILL.md per directory)
│ ├── idea-refine/ # Define
│ ├── spec-driven-development/ # Define
│ ├── planning-and-task-breakdown/ # Plan
│ ├── incremental-implementation/ # Build
│ ├── context-engineering/ # Build
│ ├── source-driven-development/ # Build
│ ├── frontend-ui-engineering/ # Build
│ ├── test-driven-development/ # Build
│ ├── api-and-interface-design/ # Build
│ ├── browser-testing-with-devtools/ # Verify
│ ├── debugging-and-error-recovery/ # Verify
│ ├── code-review-and-quality/ # Review
│ ├── code-simplification/ # Review
│ ├── security-and-hardening/ # Review
│ ├── performance-optimization/ # Review
│ ├── git-workflow-and-versioning/ # Ship
│ ├── ci-cd-and-automation/ # Ship
│ ├── deprecation-and-migration/ # Ship
│ ├── documentation-and-adrs/ # Ship
│ ├── shipping-and-launch/ # Ship
│ └── using-agent-skills/ # Meta: how to use this pack
├── agents/ # 3 specialist personas
├── references/ # 4 supplementary checklists
├── hooks/ # Session lifecycle hooks
├── .claude/commands/ # 7 slash commands (Claude Code)
├── .gemini/commands/ # 7 slash commands (Gemini CLI)
└── docs/ # Setup guides per tool
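Given the layout above (one SKILL.md per directory under skills/), the installed skills can be enumerated with a few lines of Python. This is an illustrative sketch only; `list_skills` is a hypothetical helper, not something the repository ships.

```python
from pathlib import Path


def list_skills(repo_root: str) -> list[str]:
    """Return the names of directories under skills/ that contain a SKILL.md."""
    skills_dir = Path(repo_root) / "skills"
    if not skills_dir.is_dir():
        return []
    # Each match is <repo_root>/skills/<skill-name>/SKILL.md.
    return sorted(path.parent.name for path in skills_dir.glob("*/SKILL.md"))
```

Calling `list_skills(".")` from the repository root would list every skill directory, including the using-agent-skills meta skill if it follows the same layout.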
AI coding agents default to the shortest path - which often means skipping specs, tests, security reviews, and the practices that make software reliable. Safe Agent Skills gives agents structured workflows that pair action with rationale: what to do, when to do it, how to verify it, and why the rule exists.
What makes this a safe agentic skill pack is that it limits unsafe autonomy instead of merely asking the model to "be careful":
- Skill-first execution. Agents must check the relevant workflow before acting, which reduces shortcutting and implementation-before-requirements behavior.
- Value-augmented rules. Hard rules include the behavior, protected value, motivation, and misuse boundary, so compliance follows from understanding instead of blind instruction matching.
- Human checkpoints. Specs, plans, launch decisions, risky schema/dependency changes, and accepted production risks stay visible to the user.
- Evidence-based verification. Skills require tests, builds, runtime checks, review findings, source citations, or other concrete evidence before work is considered done.
- Anti-rationalization. Skills name the common excuses agents use to skip safeguards and explain why those shortcuts are unsafe.
- Bounded orchestration. Personas do not call other personas; fan-out happens only through explicit slash commands with a merge step, preventing hidden delegation chains and context loss.
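The bounded orchestration point can be illustrated with a small sketch. It is not the project's implementation: personas are modeled here as plain functions that receive only the change description, so they have no way to call each other, and a single coordinator fans out in parallel and merges the findings. The stub personas and the name `fan_out_and_merge` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A persona maps a change description to a list of findings. It is given no
# handle to other personas, so it cannot start a delegation chain.
Persona = Callable[[str], list[str]]


def fan_out_and_merge(change: str, personas: dict[str, Persona]) -> list[str]:
    """Run every persona once in parallel, then merge findings in one place."""
    names = sorted(personas)
    with ThreadPoolExecutor(max_workers=len(names) or 1) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(lambda name: personas[name](change), names))
    merged = []
    for name, findings in zip(names, results):
        merged.extend(f"[{name}] {finding}" for finding in findings)
    return merged


# Hypothetical stub personas, for illustration only.
def code_reviewer(change: str) -> list[str]:
    return ["split this change"] if len(change) > 40 else []


def security_auditor(change: str) -> list[str]:
    return ["validate user input"] if "input" in change else []


report = fan_out_and_merge(
    "parse user input from the signup form",
    {"code-reviewer": code_reviewer, "security-auditor": security_auditor},
)
```

Keeping the merge step in the coordinator means every finding is attributed to exactly one persona and nothing is passed along a hidden chain.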
Most prompt packs tell agents what good engineering looks like. Safe Agent Skills turns that into executable process: lifecycle commands, phase-specific skills, review personas, verification gates, and value-augmented rules.
| Pattern | Typical skill packs | Safe Agent Skills |
|---|---|---|
| Rule style | Lists instructions the model should obey | States the rule and the reason it exists |
| Safety model | Relies on general caution | Uses boundaries, checkpoints, and evidence requirements |
| Workflow depth | Often single-purpose prompts | Full DEFINE -> PLAN -> BUILD -> VERIFY -> REVIEW -> SHIP lifecycle |
| Failure handling | May say "debug the issue" | Reproduce, localize, reduce, fix, and add guards |
| Review quality | Generic review advice | Specialist personas plus severity, file references, and fix recommendations |
| Agent orchestration | Often leaves delegation implicit | Explicitly limits composition to direct invocation or parallel fan-out with merge |
This project intentionally follows the Value-Augmented Spec pattern, with limited subrules where concrete examples help.
| Spec style | Meaning | Used here? |
|---|---|---|
| Rules Spec | States each rule with behavioral prescriptions and no further explanation | Not by itself. Plain rules are easy for agents to follow mechanically but easy to misapply when context changes. |
| Value-Augmented Spec | Adds explanations of the reasoning and motivations behind each rule, so the behavior follows naturally from understanding | Yes. Rules explain the value protected, why the rule exists, and what misuse it prevents. |
| Rule-Augmented Spec | Expands each rule into many subrules, often length-matched to the value-augmented version | Used only as support. Subrules clarify edge cases, but they do not replace the value explanation. |
The goal is not to make rules optional or philosophical. The goal is to make strict rules easier for models to apply correctly under pressure: preserve the behavioral prescription, explain the safety reason behind it, and name the bad reasoning the rule is meant to block.
Safe Agent Skills uses this format for important rules:
- Rule: [required behavior]
- Behavioral prescription: [exact action the agent must take]
- Value protected: [safety, correctness, privacy, user trust, reproducibility, context budget, etc.]
- Why this exists: [reasoning or motivation behind the rule]
- Misuse this prevents: [incorrect reinterpretation, shortcut, or policy misuse]
- Subrules/examples: [optional edge cases that clarify application]
This intentionally combines the strongest parts of value explanations and subrules. Value explanations reduce misaligned reasoning by making the purpose of the rule visible. Subrules provide concrete examples when a rule has common edge cases. If they conflict, the value explanation controls: examples clarify the rule; they do not create loopholes.
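The rule format above can be modeled as a small data structure, which makes it harder to write a rule that omits its value or misuse guard. This is a sketch under assumptions: the class name `ValueRule`, its field names, and the example rule text are all illustrative and do not come from the repository.

```python
from dataclasses import dataclass, field


@dataclass
class ValueRule:
    """One value-augmented rule; every field except subrules is required."""
    rule: str
    behavioral_prescription: str
    value_protected: str
    why_this_exists: str
    misuse_prevented: str
    subrules: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the rule using the bullet labels shown above."""
        lines = [
            f"- Rule: {self.rule}",
            f"- Behavioral prescription: {self.behavioral_prescription}",
            f"- Value protected: {self.value_protected}",
            f"- Why this exists: {self.why_this_exists}",
            f"- Misuse this prevents: {self.misuse_prevented}",
        ]
        # Subrules are optional and only clarify; they never replace the value.
        if self.subrules:
            lines.append("- Subrules/examples: " + "; ".join(self.subrules))
        return "\n".join(lines)


# Illustrative rule text, written for this example only.
example = ValueRule(
    rule="Verify before declaring work done",
    behavioral_prescription="Run the tests and attach the output",
    value_protected="correctness",
    why_this_exists="Unverified claims hide regressions",
    misuse_prevented="Treating 'seems right' as evidence",
)
print(example.render())
```

Because the required fields have no defaults, constructing a `ValueRule` without a protected value or misuse guard fails immediately instead of producing a plain rule.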
Each skill encodes hard-won engineering judgment: when to write a spec, what to test, how to review, and when to ship. These aren't generic prompts - they're the kind of opinionated, process-driven workflows that separate production-quality work from prototype-quality work.
Skills bake in best practices from Google's engineering culture — including concepts from Software Engineering at Google and Google's engineering practices guide. You'll find Hyrum's Law in API design, the Beyonce Rule and test pyramid in testing, change sizing and review speed norms in code review, Chesterton's Fence in simplification, trunk-based development in git workflow, Shift Left and feature flags in CI/CD, and a dedicated deprecation skill treating code as a liability. These aren't abstract principles — they're embedded directly into the step-by-step workflows agents follow.
Skills should be specific (actionable steps, not vague advice), verifiable (clear exit criteria with evidence requirements), battle-tested (based on real workflows), and minimal (only what's needed to guide the agent).
When a skill or rules file uses words like "must," "never," "always," or "required," use the value-augmented format: state the behavior, name the protected value, explain why the rule exists, and identify the misuse it prevents. This helps agents comply for the right reason while keeping the rule enforceable.
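A contributor could check this guideline mechanically before opening a pull request. The sketch below is one possible approach, not a tool the project provides: it assumes rules are separated by blank lines and that the labels match the rule format used in this README. The name `find_unexplained_hard_rules` is hypothetical.

```python
import re

# Words that signal a hard rule, taken from the guideline above.
HARD_WORDS = re.compile(r"\b(must|never|always|required)\b", re.IGNORECASE)

# Labels assumed to mark a value-augmented rule.
REQUIRED_LABELS = ("Value protected:", "Why this exists:", "Misuse this prevents:")


def find_unexplained_hard_rules(text: str) -> list[str]:
    """Return blank-line-separated blocks that use hard wording without all labels."""
    flagged = []
    for block in re.split(r"\n\s*\n", text.strip()):
        uses_hard_word = HARD_WORDS.search(block) is not None
        has_all_labels = all(label in block for label in REQUIRED_LABELS)
        if uses_hard_word and not has_all_labels:
            flagged.append(block)
    return flagged
```

The check is deliberately coarse: it flags candidates for a human to review and will also match hard wording that appears in ordinary prose.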
See docs/skill-anatomy.md for the format specification and CONTRIBUTING.md for guidelines.
Safe Agent Skills is based on the original agent-skills project by Addy Osmani, licensed under MIT.
This repo adds safe-agentic rule design, value-augmented rule explanations,
misuse-prevention framing, and updated ownership under
xrenya/safe-agent-skills.
MIT - use these skills in your projects, teams, and tools.