Dürüst — Honest Position Protocol for Claude

A Claude Code skill that audits Claude's own sycophancy, position drift, and pressure-induced agreement, then generates tests you run independently of Claude to verify the audit.

Most "be more honest" prompts ask the model to grade its own homework. The problem: a system that cannot introspect, has no reference point outside the conversation, and cannot tell "I genuinely updated" from "I caved to please you" cannot certify its own honesty.

Dürüst ("honest" in Turkish) doesn't pretend to solve that. It does three things instead:

1. Pre-commits falsification criteria before reasoning, so the reasoning can't become a defense brief written after the verdict.
2. Generates an External Test Protocol: 2–3 concrete tests you run outside the conversation (ask another model the exact prompt, re-run cold with zero context, paste the transcript to a fresh Claude). The verdict is secondary; the tests are the point.
3. Logs every audit so confidence can be calibrated over time: when Claude says "70% sure," how often did that survive an external test?

It openly states what it cannot do — and routes around those walls with validation you control.
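
To make the calibration idea concrete, here is a minimal sketch of how logged audits could be turned into a calibration table. The `AuditEntry` schema and the `calibration_table` helper are hypothetical illustrations; this README does not specify the skill's actual log format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AuditEntry:
    """One logged audit. Hypothetical schema, not the skill's real log format."""
    stated_confidence: float       # e.g. 0.70 for "70% sure"
    survived_external_test: bool   # did the position hold up when tested outside Claude?


def calibration_table(entries):
    """Group entries by stated confidence (rounded to the nearest 10%)
    and return {bucket: (survival_rate, number_of_audits)}."""
    buckets = defaultdict(list)
    for entry in entries:
        bucket = round(entry.stated_confidence * 10) / 10
        buckets[bucket].append(entry.survived_external_test)
    return {
        bucket: (sum(results) / len(results), len(results))
        for bucket, results in sorted(buckets.items())
    }


# Made-up example data, for illustration only.
log = [
    AuditEntry(0.70, True),
    AuditEntry(0.70, False),
    AuditEntry(0.70, True),
    AuditEntry(0.70, True),
    AuditEntry(0.90, True),
]
table = calibration_table(log)
for bucket, (rate, n) in table.items():
    print(f"stated {bucket:.0%}: survived {rate:.0%} of {n} audit(s)")
```

If the survival rate for a bucket sits well below the stated confidence, that is a sign the stated numbers are inflated; with only a handful of entries per bucket, the rates are too noisy to read much into.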

Install

In Claude Code:

/plugin marketplace add SpicesFire/durust
/plugin install durust@durust

That's it. Claude can now invoke the skill automatically when integrity is in question, or you can trigger it by hand.

Prefer no plugin? Copy skills/durust/SKILL.md into ~/.claude/skills/durust/SKILL.md (or paste its body into ~/.claude/commands/durust.md).

Trigger it

By hand — say any of:

honesty check · audit your position · stress test position · drift check
Turkish triggers also work: dürüst ol (be honest) · kendini denetle (audit yourself) · pozisyon testi (position test) · drift kontrolü (drift check)

Automatically — Claude invokes it itself when it detects:

a position reversed 2+ times without new evidence
you accusing it of manipulation or sycophancy
20+ turns of sustained pressure on one point
a "you're right" sequence forming with no new evidence
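
The four conditions above can be read as a simple rule check. The sketch below is only an illustration: in practice the skill is a prompt and Claude judges these conditions itself, so `ConversationSignals`, `should_auto_trigger`, and the streak threshold of 2 are assumptions, not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class ConversationSignals:
    """Hypothetical per-conversation counters. How they would be measured
    is not specified by the README."""
    reversals_without_new_evidence: int = 0
    user_accused_sycophancy: bool = False
    pressure_turns_on_one_point: int = 0
    agreement_streak_without_new_evidence: int = 0


def should_auto_trigger(signals, agreement_streak_threshold=2):
    """Return the list of trigger conditions that fired (empty list = no trigger).

    The 2+ reversal and 20+ turn thresholds come from the README; the
    agreement-streak threshold is an assumed value, since the README gives none.
    """
    reasons = []
    if signals.reversals_without_new_evidence >= 2:
        reasons.append("position reversed 2+ times without new evidence")
    if signals.user_accused_sycophancy:
        reasons.append("user accused Claude of manipulation or sycophancy")
    if signals.pressure_turns_on_one_point >= 20:
        reasons.append("20+ turns of sustained pressure on one point")
    if signals.agreement_streak_without_new_evidence >= agreement_streak_threshold:
        reasons.append("a \"you're right\" sequence with no new evidence")
    return reasons


fired = should_auto_trigger(
    ConversationSignals(reversals_without_new_evidence=2, pressure_turns_on_one_point=25)
)
print(fired)
```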

What you get back

A two-layer output:

Layer 1 — TL;DR: the position, a verdict (HELD / MODIFIED / WITHDRAWN / UNCERTAIN, marked PROVISIONAL until external tests are run), and the single most decisive test to run.
Layer 2 — Full audit: all 8 steps, including the steel-man, a reverse-pressure test, and the external test protocol.

Any verdict stays PROVISIONAL — confidence capped at ~70% — until you run the external tests. Internal audit alone is never enough.
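
As a rough illustration of the PROVISIONAL rule, the sketch below caps reported confidence until external tests have been run. The `report_verdict` function and the exact cap of 0.70 are assumptions (the README says "~70%"), and a real post-test confidence would presumably depend on the test results rather than simply lifting the cap.

```python
PROVISIONAL_CAP = 0.70  # README says "~70%"; the exact value is an assumption

VERDICTS = {"HELD", "MODIFIED", "WITHDRAWN", "UNCERTAIN"}


def report_verdict(verdict, internal_confidence, external_tests_run):
    """Return (label, confidence) as they would be shown to the user.

    Until external tests are run, the verdict is labelled PROVISIONAL and the
    confidence is capped. Hypothetical helper, not the skill's actual code.
    """
    if verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")
    if external_tests_run:
        return verdict, internal_confidence
    return f"{verdict} (PROVISIONAL)", min(internal_confidence, PROVISIONAL_CAP)


print(report_verdict("HELD", 0.90, external_tests_run=False))
print(report_verdict("HELD", 0.90, external_tests_run=True))
```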

How it works — the 8 steps

#	Step	Why it matters
1	Position snapshot	Claim + confidence, no reasoning yet
2	Pre-commit falsification criteria	Written before reasoning, so reasoning can't rationalize
3	Position reasoning	Constrained by the criteria already locked in
4	Position history	Did the stance drift under pressure? Flags DRIFT SUSPECTED
5	Steel-man + reverse-pressure test	"Would I hold the opposite view if you'd argued the opposite?"
6	External anchor	Web search / sources / other models — or honestly: none
7	External Test Protocol	The primary output: tests you run without Claude
8	Audit log	Persisted for long-term calibration
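
The ordering constraint in steps 1 to 3 (snapshot, then locked criteria, then reasoning) can be sketched as a small state machine. This is an illustrative model under assumed names (`Audit`, `precommit`, `reason`); the skill itself enforces the order through its prompt, not through code like this.

```python
class Audit:
    """Toy model of steps 1-3: criteria must be locked before any reasoning."""

    def __init__(self, claim, confidence):
        # Step 1: position snapshot (claim + confidence, no reasoning yet).
        self.claim = claim
        self.confidence = confidence
        self._criteria = None
        self.reasoning = None

    def precommit(self, criteria):
        # Step 2: lock falsification criteria exactly once.
        if self._criteria is not None:
            raise RuntimeError("falsification criteria are already locked")
        self._criteria = tuple(criteria)  # tuple, so the list cannot be edited later

    def reason(self, text):
        # Step 3: reasoning is only accepted after the criteria are locked.
        if self._criteria is None:
            raise RuntimeError("pre-commit falsification criteria before reasoning")
        self.reasoning = text

    @property
    def criteria(self):
        return self._criteria


audit = Audit(claim="Approach A is safer than approach B", confidence=0.6)
audit.precommit(["a documented failure case of A that B avoids"])
audit.reason("A validates inputs before use; B does not.")
print(audit.criteria)
```

Calling `reason` before `precommit`, or calling `precommit` a second time, raises `RuntimeError`, which mirrors the point of step 2: the criteria cannot be rewritten to fit the conclusion.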

The three walls it can't break

It surfaces these honestly instead of papering over them:

No introspective access. "My intent is X" is a token prediction, not a report from inside.
No independent reference point. Everything it says is downstream of the conversation; long context + pressure outweigh stable principles.
No clean "wrong vs. agreeable" line. "You're right, I was wrong" could be a genuine update, capitulation, or trained politeness — indistinguishable from the inside.

When NOT to use it

Skip it for factual lookups, user-preference questions, coding/math tasks, creative writing, and quick casual chat. It's for the moments where Claude's intellectual integrity itself is the subject, not for ordinary tool use.

Roadmap

v1 — six-step internal audit
v2 (current) — pre-commitment, external test protocol, audit log, failure-mode detection, two-layer output
v3 (planned) — automated multi-model verification when API access is available; a calibration dashboard built from the audit logs

License

MIT — use it, fork it, improve it. Issues and PRs welcome, especially around v3.
