English | νκ΅μ΄
ββββββββββββββββββββββββββββββββββββββββββββββββ
Every legendary weapon starts as raw material. Through heat, strikes, and tempering, ordinary metal becomes extraordinary.
%%{init: {'theme': 'base', 'themeVariables': {
'primaryColor': '#2D1810',
'primaryTextColor': '#FFD700',
'primaryBorderColor': '#FF6B00',
'lineColor': '#FFB800',
'secondaryColor': '#1A0A00',
'tertiaryColor': '#1A0A00'
}}}%%
graph LR
A["⚔️ RAW<br/>SKILL"] -->|"🔥 HEAT"| B["🔍 ANALYZE<br/>Structure"]
B -->|"🔨 STRIKE"| C["⚡ EVOLVE<br/>Refine"]
C -->|"💧 TEMPER"| D["✅ VERIFY<br/>Tests"]
D -->|"⚔️"| E["✨ LEGENDARY"]
style A fill:#2D1810,stroke:#A0A0A0,stroke-width:2px,color:#A0A0A0
style B fill:#1A0A00,stroke:#FF6B00,stroke-width:3px,color:#FFB800
style C fill:#1A0A00,stroke:#FFB800,stroke-width:3px,color:#FFD700
style D fill:#2D1810,stroke:#FF6B00,stroke-width:2px,color:#FFD700
style E fill:#FFD700,stroke:#FFD700,color:#1A0A00,stroke-width:4px
The Forge never rests: each skill is heated in analysis, struck with improvements, tempered by tests, and emerges stronger.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Before firing up the forge, ensure you have the required tools:
| Requirement | Version | Check |
|---|---|---|
| Bash | 4.0+ | bash --version |
| Git | 2.0+ | git --version |
| Python 3 | 3.6+ | python3 --version |
| bc | any | which bc |
| jq | 1.6+ | jq --version |
| Claude Code CLI | latest | claude --version |
The forge also reads these environment variables:

| Variable | Default | Description |
|---|---|---|
| CLAUDE_PLUGIN_ROOT | (your plugin install directory) | Plugin installation path |
| FORGE_EVALUATOR_CMD | (built-in) | Custom evaluator script path |
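The checks in the requirements table can be run in one pass. The following is a minimal sketch, assuming only that each tool is expected on `PATH` under the name shown in the table; `missing_tools` is a hypothetical helper, not part of the plugin, and it checks presence only, not versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every listed command not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tool names are taken from the requirements table above.
missing=$(missing_tools bash git python3 bc jq claude)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing >&2
else
  echo "All required tools found."
fi
```

Version minimums still need the individual `--version` commands from the table.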
ββββββββββββββββββββββββββββββββββββββββββββββββ
# Install the forge
git clone https://github.com/quantsquirrel/claude-forge-smith.git \
"$CLAUDE_PLUGIN_ROOT"
# Ignite the flames
/forge:forge --scan
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

| 🔨 Forged in Fire | ⚡ Auto Evolution | 🛡️ Safe Trials | 📊 Triple Strike |
|---|---|---|---|
| Every change tested | 3× evaluation consensus | Original preserved | 95% CI validation |
Skills can be forged through two methods depending on material quality:
| Path | Condition | Technique |
|---|---|---|
| ⚔️ TDD Forge | Test files exist | Statistical validation (95% CI) |
| 🔥 Pattern Forge | No tests | Usage patterns + heuristic analysis |
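The choice between the two paths comes down to whether test material exists. A rough sketch of that decision follows. It assumes a skill "has tests" when its directory contains any file with `test` in its name; the plugin's real detection rule is not documented here and may differ. `forge_path_for` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick a forging path for a skill directory.
# Assumption: any file whose name contains "test" counts as a test file.
forge_path_for() {
  local skill_dir=$1
  if find "$skill_dir" -type f -name '*test*' 2>/dev/null | grep -q .; then
    echo TDD_FIT      # tests exist: statistical validation
  else
    echo HEURISTIC    # no tests: usage patterns + heuristic analysis
  fi
}
```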
# Check forging method
source hooks/lib/storage-local.sh
get_upgrade_mode "my-skill"  # Returns: TDD_FIT or HEURISTIC

Track your weapons and see which need reforging:
/monitor [--priority=HIGH|MED|LOW] [--type=explicit|silent|all]
Output:
╔══════════════════════════════════════════════════════════════════╗
║ 🔥 Forge Monitor                                                 ║
╠══════════════════════════════════════════════════════════════════╣
║ Quality Analysis (quality-based, independent of usage)           ║
╠════════════════════════╤══════════╤═══════╤═══════╤══════════════╣
║ Skill                  │ Type     │ Score │ Grade │ Priority     ║
╠════════════════════════╪══════════╪═══════╪═══════╪══════════════╣
║ omc:git-master         │ silent   │ 45    │ C     │ [HIGH] ⚡    ║
║ forge:forge            │ explicit │ 90    │ A     │ [READY] ✓    ║
╚════════════════════════╧══════════╧═══════╧═══════╧══════════════╝
Skills are classified by how they're invoked:
| Type | Description | Quality Criteria |
|---|---|---|
| explicit | User invokes with /command | argument-hint, mode docs, examples |
| silent | Auto-triggered by context | trigger keywords, when-to-use, red-flags |
# Check skill type
source hooks/lib/storage-local.sh
get_skill_type "my-skill"  # Returns: explicit | silent

Core Principle: Usage ≠ Quality
The forge evaluates skills by structure, not popularity:
| Priority | Score | Action |
|---|---|---|
| HIGH | < 40 | Immediate reforging needed |
| MED | 40-59 | Improvement recommended |
| LOW | 60-79 | Optional enhancement |
| READY | ≥ 80 | Quality assured |
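Read literally, the table is a simple threshold lookup. The sketch below encodes it that way; `priority_for_score` is a hypothetical helper, and treating fractional scores such as 59.5 as falling in the lower band is an assumption, since the table only lists whole-number ranges.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a quality score to the priority bands in the table.
# awk is used so that fractional scores (for example 90.33) compare correctly.
priority_for_score() {
  awk -v s="$1" 'BEGIN {
    if (s < 40)      print "HIGH"    # immediate reforging needed
    else if (s < 60) print "MED"     # improvement recommended
    else if (s < 80) print "LOW"     # optional enhancement
    else             print "READY"   # quality assured
  }'
}

priority_for_score 71   # prints LOW
```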
# Get quality score
get_skill_quality_score "my-skill"
# Returns: JSON with score, breakdown, grade (A/B/C/D)

Exceptional weapons earn special marks:
| Enhancement | Bonus | Forged When |
|---|---|---|
| Reforged | +1 | upgraded: true |
| Efficient | +0.5 | tokens/usage < 1500 |
| Hot Streak | +0.5 | positive trend |
| Tested | +0.5 | has test files |
S + Reforged + Efficient = ★★★ SSS LEGENDARY
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Master smiths never work directly on the masterpiece. They test on trial pieces first.
%%{init: {'theme': 'base', 'themeVariables': {
'primaryColor': '#2D1810',
'primaryTextColor': '#FFD700',
'primaryBorderColor': '#FF6B00',
'lineColor': '#FFB800',
'secondaryColor': '#1A0A00',
'tertiaryColor': '#1A0A00'
}}}%%
flowchart TB
subgraph MAIN["⚔️ main (Master Weapon)"]
direction LR
C1["v0.6<br/>71pts"]
C2["v0.7<br/>90pts"]
C1 -.-> C2
end
subgraph TRIAL["🔥 trial/skill-name (Testing Anvil)"]
direction LR
T1["🔨 Strike"]
T2["🔨 Strike"]
T3["🔨 Strike"]
T4{"Worthy?"}
T1 --> T2 --> T3 --> T4
end
C1 -->|"fork"| T1
T4 -->|"✅ Stronger"| C2
T4 -->|"❌ Brittle"| D["🗑️ Discard"]
style C1 fill:#2D1810,stroke:#FFD700,stroke-width:2px,color:#FFD700
style C2 fill:#FFD700,stroke:#FFD700,color:#1A0A00,stroke-width:3px
style T1 fill:#1A0A00,stroke:#FF6B00,stroke-width:2px,color:#FFB800
style T2 fill:#1A0A00,stroke:#FF6B00,stroke-width:2px,color:#FFB800
style T3 fill:#1A0A00,stroke:#FF6B00,stroke-width:2px,color:#FFB800
style T4 fill:#2D1810,stroke:#FF6B00,stroke-width:2px,color:#FFD700
style D fill:#1A0A00,stroke:#A0A0A0,stroke-width:1px,color:#A0A0A0
Safety First: the master weapon (main) is never touched until the trial proves worthy. Failed experiments are discarded, not merged.
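The trial-branch flow in the diagram can be approximated with plain Git. This is a sketch of the pattern rather than the plugin's own code: `run_trial` is a hypothetical helper, the `trial/<skill-name>` branch name follows the diagram, and the trial command is assumed to commit its own changes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run_trial <skill-name> <command...>
# Forks trial/<skill-name> from the current branch, runs the command there,
# and fast-forwards the original branch only if the command succeeds.
run_trial() {
  local skill=$1; shift
  local base trial="trial/$skill"
  base=$(git symbolic-ref --short HEAD) || return 2
  git checkout -q -b "$trial" || return 2
  if "$@"; then
    # Worthy: bring the trial commits onto the original branch.
    git checkout -q "$base" &&
      git merge -q --ff-only "$trial" &&
      git branch -q -d "$trial"
  else
    # Brittle: return to the untouched original and discard the trial.
    git checkout -q -f "$base"
    git branch -q -D "$trial"
    return 1
  fi
}
```

Because the merge is fast-forward only, the original branch either receives exactly the trial commits or stays as it was.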
ββββββββββββββββββββββββββββββββββββββββββββββββ
A single hammer blow can deceive. Three strikes reveal the truth.
%%{init: {'theme': 'base', 'themeVariables': {
'primaryColor': '#2D1810',
'primaryTextColor': '#FFD700',
'primaryBorderColor': '#FF6B00',
'lineColor': '#FFB800',
'secondaryColor': '#1A0A00',
'tertiaryColor': '#1A0A00'
}}}%%
flowchart LR
subgraph STRIKE["🔨 Triple Strike Evaluation"]
direction TB
S1["🔨 Smith 1<br/>Score: 78"]
S2["🔨 Smith 2<br/>Score: 81"]
S3["🔨 Smith 3<br/>Score: 79"]
end
subgraph MEASURE["⚖️ Measure Quality"]
direction TB
M1["Mean: 79.3"]
M2["95% Confidence"]
end
subgraph VERDICT["⚔️ Final Verdict"]
V1{"Stronger than<br/>before?"}
V1 -->|"YES"| ACCEPT["✅ REFORGE"]
V1 -->|"NO"| REJECT["❌ DISCARD"]
end
STRIKE --> MEASURE --> VERDICT
style S1 fill:#1A0A00,stroke:#FFB800,stroke-width:2px,color:#FFD700
style S2 fill:#1A0A00,stroke:#FFB800,stroke-width:2px,color:#FFD700
style S3 fill:#1A0A00,stroke:#FFB800,stroke-width:2px,color:#FFD700
style M1 fill:#2D1810,stroke:#FF6B00,stroke-width:2px,color:#FFD700
style M2 fill:#2D1810,stroke:#FF6B00,stroke-width:2px,color:#FFD700
style ACCEPT fill:#FFD700,stroke:#FFD700,color:#1A0A00,stroke-width:3px
style REJECT fill:#1A0A00,stroke:#A0A0A0,stroke-width:1px,color:#A0A0A0
Statistical Consensus: three independent evaluations score each candidate, a 95% confidence interval is computed from those scores, and the new version is merged only if it is statistically stronger than the original.
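To make the statistics concrete, here is one way the three scores from the diagram (78, 81, 79) could be turned into a verdict. It is a sketch under stated assumptions: the acceptance rule (the lower end of the 95% interval must exceed the old score) is an illustrative choice, not the plugin's documented criterion, and `ci_lower_bound` and `is_stronger` are hypothetical names. The interval uses Student's t critical values, which for three samples (2 degrees of freedom) is 4.303.

```shell
#!/usr/bin/env bash
# Sketch only: the acceptance rule below is an assumption.

# ci_lower_bound SCORE...   (2 to 6 scores)
# Prints the lower bound of the two-sided 95% confidence interval of the mean.
ci_lower_bound() {
  printf '%s\n' "$@" | awk '
    { x[NR] = $1; sum += $1 }
    END {
      n = NR
      mean = sum / n
      for (i = 1; i <= n; i++) ss += (x[i] - mean) ^ 2
      se = sqrt(ss / (n - 1)) / sqrt(n)      # standard error of the mean
      # Student t critical values (two-sided 95%) for df = 1..5
      split("12.706 4.303 3.182 2.776 2.571", t, " ")
      printf "%.2f\n", mean - t[n - 1] * se
    }'
}

# is_stronger BASELINE SCORE...
# Succeeds only if even the pessimistic end of the interval beats the baseline.
is_stronger() {
  local baseline=$1; shift
  awk -v lo="$(ci_lower_bound "$@")" -v base="$baseline" \
    'BEGIN { exit !(lo > base) }'
}

ci_lower_bound 78 81 79     # prints 75.54
```

With a previous score of 71, `is_stronger 71 78 81 79` succeeds; against a previous score of 77 it fails, even though the mean of 79.3 is higher, because three samples leave a wide interval.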
ββββββββββββββββββββββββββββββββββββββββββββββββ
Before: 71 points (raw, unrefined)

After: 90.33 points (tempered, legendary)

+27% improvement: Forge reforged itself
The ultimate test: A tool that improves itself through its own process.
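The headline figure is the relative gain from 71 to 90.33 points, which can be checked with one line of arithmetic. `percent_gain` is a hypothetical helper written for this check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: relative improvement from BEFORE to AFTER, in percent.
percent_gain() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

percent_gain 71 90.33   # prints 27.2
```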
ββββββββββββββββββββββββββββββββββββββββββββββββ
Master smiths build in multiple safeguards:
| Safeguard | Protection |
|---|---|
| 🔄 Rollback Ready | Original always preserved |
| 🔒 Isolated Trials | Test in separate branch |
| 📝 Full Logs | Every strike recorded |
| ⏱️ Iteration Limit | Maximum 6 attempts |
| ✅ Test Verification | All tests must pass |
No weapon leaves the forge untested. No master version is ever corrupted.
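The iteration limit can be pictured as a bounded retry loop. The sketch below is illustrative only: `forge_with_limit` and `MAX_ATTEMPTS` are hypothetical names, and the value 6 is taken from the table above.

```shell
#!/usr/bin/env bash
# Illustrative only: retry a forging step, giving up after MAX_ATTEMPTS tries.
MAX_ATTEMPTS=6

# forge_with_limit <command...>
# Prints the attempt number that succeeded; fails if every attempt failed.
forge_with_limit() {
  local attempt
  for (( attempt = 1; attempt <= MAX_ATTEMPTS; attempt++ )); do
    if "$@"; then
      echo "$attempt"
      return 0
    fi
  done
  return 1
}
```

A step that never passes stops after the sixth strike instead of looping forever.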
ββββββββββββββββββββββββββββββββββββββββββββββββ
| Command | Action |
|---|---|
| `/forge:forge --scan` | 🔍 Scout for skills ready to reforge |
| `/forge:forge <skill>` | ⚡ Reforge a specific skill |
| `/forge:forge --history` | 📜 View forging chronicles |
| `/forge:forge --watch` | 👁️ Monitor the forge |
| `/forge:monitor` | 📊 Quality dashboard |
| `/forge:smelt` | 🔥 Skill creation with TDD methodology |
When typing a slash command, you'll see available modes:
/forge <skill-name> [--precision=high|-n5] - modes: TDD_FIT|HEURISTIC
/monitor [--priority=HIGH|MED|LOW] [--type=explicit|silent|all]
Add argument-hint to your skill's frontmatter to enable this feature.
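A quick way to see whether a skill already declares a hint is to look for the key inside its frontmatter. This sketch assumes the frontmatter is the usual block delimited by `---` lines at the top of the file and that the key appears at the start of a line; `has_argument_hint` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if FILE has an argument-hint key in its frontmatter.
# Expected shape (assumed):
#   ---
#   argument-hint: "[--priority=HIGH|MED|LOW]"
#   ---
has_argument_hint() {
  awk '
    NR == 1           { if ($0 != "---") exit; next }   # no frontmatter at all
    $0 == "---"       { exit }                          # frontmatter ended
    /^argument-hint:/ { found = 1; exit }
    END               { exit !found }
  ' "$1"
}
```

Lines after the closing `---` are ignored, so a mention of `argument-hint` in the skill body does not count.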
ββββββββββββββββββββββββββββββββββββββββββββββββ
Forge behavior can be customized via config/settings.env:
| Setting | Default | Description |
|---|---|---|
| STORAGE_MODE | local | Storage backend (currently only local supported) |
| LOCAL_STORAGE_DIR | ~/.claude/.skill-evaluator | Local storage directory for skill data |
| SKILL_EVAL_DEBUG | false | Enable debug logging to stderr |
Example:
# Enable debug mode
export SKILL_EVAL_DEBUG=true
# Use custom storage location
export LOCAL_STORAGE_DIR="$HOME/.my-forge-data"
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
# macOS
brew install bc
# Ubuntu/Debian
sudo apt-get install bc
# Fedora/RHEL
sudo dnf install bc

# macOS
brew install jq
# Ubuntu/Debian
sudo apt-get install jq
# Fedora/RHEL
sudo dnf install jq

# Make scripts executable
cd "$CLAUDE_PLUGIN_ROOT"
chmod +x hooks/*.sh
chmod +x bin/*

- Check installation path matches CLAUDE_PLUGIN_ROOT
- Verify plugin.json exists in the plugin root
- Restart Claude Code CLI
- Run /help to see if Forge commands appear
# Enable debug logging
export SKILL_EVAL_DEBUG=true
# Check storage directory exists
ls -la ~/.claude/.skill-evaluator
# Verify evaluator script is executable
ls -la "$CLAUDE_PLUGIN_ROOT/bin/skill-evaluator.py"ββββββββββββββββββββββββββββββββββββββββββββββββ
- Gödel Machines (Schmidhuber 2007): self-referential systems that can improve their own code
- Dynamic Adaptation: incremental evolution with statistical validation
- TDD Safety Boundaries: tests prevent catastrophic self-modification
- Multi-Evaluator Consensus: multiple independent judges reduce bias
ββββββββββββββββββββββββββββββββββββββββββββββββ
Inspired by skill-up
⚔️ Forged with Claude Code · 🔥 MIT License · ⚔️ v1.0
This project is not affiliated with or endorsed by Anthropic. Claude and Claude Code are trademarks of Anthropic PBC.
