claude-language-tutor

A four-skill orchestrator system that turns Claude into a structured language tutor with persistent state, error tracking, and a chapter gate protocol. Currently supports Dutch DUO inburgering at A2 and B1, with NT2 staatsexamen (CvTE, B2/C1) planned. The architecture generalizes to any structured language exam.

Status: used daily for Dutch A2 DUO since May 2026. Architecture stable. The polyglot generalist skill has been contributed but not yet validated against a second learner running the full flow. See ROADMAP.md for what's planned.

Best for: adult learners with a real exam deadline, a primary textbook, and limited weekly bandwidth. Not for: casual language exploration, gamified daily-streak learners, or anyone who wants Claude to surprise them with the session focus rather than picking based on tracked state.

What this is

Most AI language tutors are stateless: each conversation starts cold, the same errors get re-explained, and there's no way to enforce mastery before advancing. This system fixes that.

It runs four-phase sessions (cold recall, error focus, new content, cumulative exercise), tracks errors persistently across sessions with confidence scoring, and enforces a five-check gate before letting the learner advance between chapters. A session log captures what worked and what didn't pedagogically, so the system itself improves over time, not just the learner. The repo also integrates a polyglot language-tutor skill that handles general language tasks outside the structured exam-prep flow, so the same project can support both deep exam drilling and broad conversational practice. Sessions run 30 to 45 minutes and end with a fresh progress.md you swap into the project for next time.

The core method is substitution drilling. Fix a sentence frame in the target language, swap one variable, repeat. The drill never asks the learner to translate from English, because translation creates an English-first mental path that causes the "freeze when speaking" problem on exam day.

What a session looks like

Here is an actual mid-session moment. The learner is in Phase 2 (foutenfocus), drilling hebt vs heeft, a tracked error at confidence 3:

Claude:

Drill on hebben conjugation. Replace the subject in Ik heb een broer with each pronoun below and adjust the verb form. Take all four cold, then feedback.

Vier achter elkaar: jij, hij, wij, jullie.

Learner:

Jij hebt een broer. Hij hebt een broer. Wij hebben een broer. Jullie hebben een broer.

Claude:

Three of four. Hij hebt should be Hij heeft. Third-person singular shifts the stem, which is the irregular bit that keeps catching this one. The other three landed cleanly.

Drie van vier. Hij heeft, niet Hij hebt. De rest klopte. We doen nog een ronde met andere onderwerpen:

zij

mijn vriend

de buren

The drill continues until the learner hits four-of-four cleanly and the confidence on this error ticks up. No translation step. No warm-up question. The session moves to the next pattern when the current one is automatic.
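The exchange above can be sketched in a few lines of Python. This is only an illustration of the substitution idea, not how the dutch-drill-generator skill is implemented (the skills are prompt files, not code); the `HEBBEN` table and the `expected_sentence` and `score_round` helpers are hypothetical names introduced here.

```python
# Minimal sketch of one substitution round: fixed frame, one swapped variable.
# All names here are illustrative assumptions, not part of the repo.

# Present-tense forms of "hebben" for the pronouns used in the example round.
HEBBEN = {
    "ik": "heb",
    "jij": "hebt",
    "hij": "heeft",
    "wij": "hebben",
    "jullie": "hebben",
}

def expected_sentence(subject: str, rest: str = "een broer") -> str:
    """Fill the frame '<Subject> <form of hebben> <rest>.' for one subject."""
    return f"{subject.capitalize()} {HEBBEN[subject]} {rest}."

def score_round(subjects, answers):
    """Compare learner answers to the expected sentences.

    Returns (number_correct, [(given, expected), ...] for each miss).
    """
    misses = []
    for subject, given in zip(subjects, answers):
        expected = expected_sentence(subject)
        if given.strip() != expected:
            misses.append((given.strip(), expected))
    return len(subjects) - len(misses), misses

subjects = ["jij", "hij", "wij", "jullie"]
answers = [
    "Jij hebt een broer.",
    "Hij hebt een broer.",
    "Wij hebben een broer.",
    "Jullie hebben een broer.",
]
correct, misses = score_round(subjects, answers)
print(f"{correct} of {len(subjects)}")   # 3 of 4
for given, expected in misses:
    print(f"{given} -> {expected}")      # Hij hebt een broer. -> Hij heeft een broer.
```

Note that the learner never sees an English prompt: the only input is the target-language frame plus the new subject.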

How a session works

A typical mid-program session, roughly 30 to 45 minutes:

Cold recall (5 to 7 min). Claude prompts 3 to 5 sentences pulling from the active chapter and recent material. No reference tables. No hints. Errors get noted privately for Phase 2.
Foutenfocus (5 to 10 min). Tracked errors below confidence 4 get re-drilled via substitution. The drill stops early if the learner hits 4 of 5 correct without hesitation.
New content (10 to 15 min). Next outstanding section gets introduced and drilled. 8 to 12 substitution reps.
Cumulative exercise (5 to 8 min). Short dialogue, paragraph completion, or guided composition combining the new section with at least one prior chapter.
Session-end ritual. The tracker updates progress.md, flags candidate memory additions, suggests skill edits if patterns have stabilized. Outputs a downloadable progress.md the learner swaps into the project for the next session.

Sessions are sized for the sustainable end of "intense enough to retain," without crossing into burnout. Burnout signals are caught mid-session and trigger an automatic pull-back: cancel new content and run only foutenfocus and light recall.
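As a rough sketch, the phase budgets and the pull-back rule can be written down as data plus one function. The time ranges are copied from the list above; the `Phase` class, the `plan_session` function, and the rule that any signal other than "green" triggers the pull-back are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phase:
    name: str
    min_minutes: int
    max_minutes: int

# Time budgets taken from the phase list; the session-end ritual is untimed there.
FULL_SESSION = [
    Phase("cold recall", 5, 7),
    Phase("foutenfocus", 5, 10),
    Phase("new content", 10, 15),
    Phase("cumulative exercise", 5, 8),
]

def plan_session(burnout_signal: str) -> list[Phase]:
    """Return the phases to run.

    Assumption: any signal other than "green" triggers the pull-back,
    which keeps only recall and foutenfocus.
    """
    if burnout_signal == "green":
        return list(FULL_SESSION)
    return [p for p in FULL_SESSION if p.name in ("cold recall", "foutenfocus")]

def minutes_range(phases) -> tuple[int, int]:
    """Sum the lower and upper time budgets of the given phases."""
    return (sum(p.min_minutes for p in phases),
            sum(p.max_minutes for p in phases))

print(minutes_range(plan_session("green")))  # (25, 40)
print([p.name for p in plan_session("red")])
```

The four timed phases add up to 25 to 40 minutes, which leaves a few minutes for the session-end ritual inside the 30 to 45 minute window.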

Why it exists

Built for the DUO inburgeringsexamen (the Dutch civic integration exam, administered by DUO, the government education service), required for naturalization. Existing apps (Duolingo, Babbel, Pingo) cover vocabulary and casual fluency. They don't drill exam-specific patterns, track persistent errors, or enforce mastery gates before advancing. They optimize for daily engagement, not for passing a structured exam under timed conditions.

This system is opinionated in the opposite direction. It assumes the learner has a real deadline, a real textbook, and limited weekly bandwidth. It picks the session focus instead of offering choices. It gates progress instead of celebrating streaks.

Levels and exam systems supported

The dutch-* skills are calibrated for DUO inburgering. Two levels are first-class:

A2. The default and active naturalization requirement as of 2026. All skill content, drill examples, and chapter references assume A2 unless overridden.
B1. Supported via a level switch in progress.md (Target level: B1). Same skill files, same four-phase session, same chapter-gate protocol. Vocabulary and frame complexity scope up: multi-clause sentences, perfect and conditional tenses, broader topical vocabulary, more idiomatic register. Suited for learners on the same DUO inburgering exam at the higher level, or A2 learners stretching ahead of the announced B1 transition.

NT2 Staatsexamen (Nederlands als Tweede Taal, Dutch-as-second-language, administered by CvTE rather than DUO) is a different exam body with different components (no KNM or ONA, recorded Spreken, two programs at B2 and C1) and is not yet supported by the dutch-* skill set. NT2 learners route to the polyglot language-tutor skill, which handles CEFR-aligned exam prep at B2 and C1 generally; see skills/language-tutor/references/exam-prep.md for format treatment. A dedicated dutch-staatsexamen-* skill set is on the roadmap and would parallel the dutch-* set's structure (orchestrator, drill generator, gate, tracker) calibrated to the CvTE format.

To switch levels or systems, edit Target level and Exam system in progress.md's Current state. Defaults are A2 and inburgering, so existing learners with legacy progress.md files are unaffected if those fields are absent.

Architecture

Four skills in an orchestrator-plus-specialists pattern, with a single state file as ground truth.

graph TD
    subgraph "Dutch exam-prep specialist"
        A[dutch-tutor-session<br/><i>Session orchestrator</i>]
        B[dutch-drill-generator<br/><i>Substitution drills</i>]
        C[dutch-chapter-gate<br/><i>Chapter audit</i>]
        D[dutch-progress-tracker<br/><i>Session-end ritual</i>]
        A -->|Phase 2 and 3| B
        A -->|Phase 4 chapter close| C
        A -->|Session end| D
    end

    subgraph "Polyglot generalist"
        L[language-tutor<br/><i>Any language, any task</i>]
        L -.-> R[references/<br/>11 reference files]
    end

    E[(progress.md<br/><i>State file</i>)]
    E -.read at start.-> A
    D -.write at end.-> E

    R1{{User intent<br/>routing}}
    R1 -->|Structured exam-prep| A
    R1 -->|General language task| L

Orchestrator: dutch-tutor-session. Reads state, picks session focus, runs the four phases. Hands off to specialists when their domain comes up. Never produces drill content directly, never decides chapter pass or fail, never updates state.

Specialist: dutch-drill-generator. Produces substitution drills. Maps tracked errors to drill types. Outputs structured drill prompts with explicit error triggers. Catalog of drill types covers conjugation, plurals, articles, inversion, time expressions, possessives, question formation, and negation.

Specialist: dutch-chapter-gate. Runs the five-check audit before advancing chapters: all sections taught, all sections drilled, tracked errors at confidence 4 or above, cold recall clean, cumulative exercise completed. Surfaces all gaps, doesn't stop at the first failure.
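The key property of the gate is that it reports every gap rather than stopping at the first failure. A minimal sketch of that behavior, assuming a hypothetical `ChapterState` record (the real state lives in progress.md, and the real audit is performed by the skill prompt, not by code):

```python
from dataclasses import dataclass, field

GATE_CONFIDENCE = 4  # threshold stated in the text: confidence 4 or above

@dataclass
class ChapterState:
    """Hypothetical in-memory view of one chapter's status."""
    sections: list[str]
    taught: set[str] = field(default_factory=set)
    drilled: set[str] = field(default_factory=set)
    error_confidence: dict[str, int] = field(default_factory=dict)
    cold_recall_clean: bool = False
    cumulative_done: bool = False

def audit(ch: ChapterState) -> list[str]:
    """Run all five checks and return every gap; an empty list means pass."""
    gaps = []
    untaught = [s for s in ch.sections if s not in ch.taught]
    if untaught:
        gaps.append(f"sections not taught: {', '.join(untaught)}")
    undrilled = [s for s in ch.sections if s not in ch.drilled]
    if undrilled:
        gaps.append(f"sections not drilled: {', '.join(undrilled)}")
    low = sorted(e for e, c in ch.error_confidence.items() if c < GATE_CONFIDENCE)
    if low:
        gaps.append(f"tracked errors below confidence {GATE_CONFIDENCE}: {', '.join(low)}")
    if not ch.cold_recall_clean:
        gaps.append("cold recall not clean")
    if not ch.cumulative_done:
        gaps.append("cumulative exercise not completed")
    return gaps

chapter = ChapterState(
    sections=["1.1", "1.2"],
    taught={"1.1", "1.2"},
    drilled={"1.1"},
    error_confidence={"hebt/heeft": 3},
    cold_recall_clean=True,
)
for gap in audit(chapter):
    print(gap)
```

Collecting gaps into a list instead of returning on the first failed check is what lets the learner see the full remaining workload in one pass.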

Specialist: dutch-progress-tracker. Runs the session-end ritual. Updates progress.md, flags candidate memory additions for approval, proposes skill edits when patterns stabilize across multiple sessions.

State: progress.md. Single source of truth. Tracks current chapter, tracked errors with confidence scores, pace metrics, schedule delta, session log, exam component readiness. Read at session start. Rewritten at session end. Personal: gitignored, never committed.

Repo layout

.
├── README.md
├── ROADMAP.md
├── project-prompt.md
├── progress.example.md
├── docs/
│   └── images/
│       ├── claude-ai-projects-landing.png
│       └── claude-ai-create-project-dialog.png
├── skills/
│   ├── dutch-tutor-session.md
│   ├── dutch-drill-generator.md
│   ├── dutch-chapter-gate.md
│   ├── dutch-progress-tracker.md
│   └── language-tutor/
│       ├── SKILL.md
│       └── references/  (11 progressive-disclosure reference files)
└── LICENSE

Design principles

Substitution drilling, not translation. Fix a sentence frame in the target language. Swap one variable per rep. 5 to 12 reps depending on whether it's an error re-drill or new content. Why: translation creates an English-first mental path that produces the "freeze when speaking" problem under exam conditions. Substitution drills bypass translation, building direct association between context and target-language output.

Cold recall first. No warm-up. Sessions open with exam-simulating prompts immediately. Why: tutor warm-ups inflate confidence and produce false readiness. The exam doesn't warm you up; the system shouldn't either.

Confidence before accuracy. Acknowledge what's correct before flagging errors. Never penalize hesitation. Confirm correct answers explicitly because learners frequently second-guess. Build the feel of capability first; precision follows.

Gate before advancing. No chapter passes until all five checks complete. Why: stacking new content on shaky fundamentals is the most common reason adult learners stall around the A2-B1 boundary (CEFR levels). The gate is non-negotiable except by explicit learner override, which is logged.

Pushback as quality control. If the learner flags a correction as wrong, the tutor re-examines before confirming or revising. Learner pushback is signal, not challenge. This protects against the LLM-overconfidence failure mode where Claude misreads a correct answer as incorrect.

Worked / didn't work logs. Every session log captures pedagogical learning, not just content learning. The skills themselves get refined when patterns stabilize across multiple sessions. The system improves with use.

Getting started

Prerequisites

A claude.ai account with Projects (Pro or higher; Projects is a paid-tier feature). If you're new to claude.ai Projects, Anthropic's help center has the setup basics under "Projects."
A textbook or reference materials appropriate to your target level
Any official exam materials you want Claude to know about

project-prompt.md configures tutor mode rules, learner context placeholders, exam scope reminders, and routing between the specialist and polyglot skills. It gets pasted into the project-level instructions field (currently labeled "What are you trying to achieve?") during setup.

What's in progress.example.md

The state schema you copy into your own progress.md before first session. First section shown below; full file at progress.example.md.

# Dutch DUO progress

> This is the empty schema. Copy this file to `progress.md` (which is gitignored) and fill in your specifics. The `dutch-progress-tracker` skill will maintain it from then on.

## Current state

- Target level: A2  (A2 | B1)
- Exam system: inburgering  (inburgering = DUO; staatsexamen = CvTE NT2, not yet supported by these skills)
- Phase: 1 (book mastery, [start date] to [Phase 1 end date])
- Active chapter: 1
- Last tutor session: [date of first session]
- Pace: on track
- Burnout signal: green
- Open external inquiry: (none)

The full file also has sections for chapter status, tracked errors with confidence scoring, pace metrics, schedule delta, component readiness, and a rolling session log. The dutch-progress-tracker skill rewrites this file at the end of each session.
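In practice Claude reads progress.md directly, so no parser is needed. Still, the "absent fields fall back to A2 and inburgering" rule is easy to show in code. The sketch below assumes the `- Key: value  (comment)` line shape from the excerpt above; `parse_current_state` and `DEFAULTS` are hypothetical names, not part of the repo.

```python
import re

# Defaults described in the README: legacy files without these fields
# are treated as A2 / inburgering.
DEFAULTS = {"Target level": "A2", "Exam system": "inburgering"}

# "- Key: value", optionally followed by two or more spaces and a "(comment)".
LINE = re.compile(r"^- (?P<key>[^:]+):\s*(?P<value>.*?)(?:\s{2,}\(.*\))?\s*$")

def parse_current_state(markdown: str) -> dict[str, str]:
    """Read the '## Current state' bullet list into a dict, applying defaults."""
    state = dict(DEFAULTS)
    in_section = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            # Only lines under the "Current state" heading are parsed.
            in_section = line.strip() == "## Current state"
            continue
        if in_section:
            match = LINE.match(line)
            if match:
                state[match.group("key").strip()] = match.group("value")
    return state

legacy_file = """# Dutch DUO progress

## Current state

- Active chapter: 3
- Pace: on track
"""
state = parse_current_state(legacy_file)
print(state["Target level"], state["Exam system"], state["Active chapter"])
```

A file that does set `- Target level: B1  (A2 | B1)` overrides the default, and the trailing parenthetical hint is dropped from the value.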

Choose your track

| Track | Level / exam | Status |
| --- | --- | --- |
| A | Dutch A2 DUO inburgering | Validated paved path |
| B | Dutch B1 DUO inburgering | Plumbed; expect to adapt in early sessions |
| C | NT2 Staatsexamen (B2 or C1, CvTE) | Polyglot skill fallback; dedicated set planned |

Track A: Dutch A2 DUO inburgering

The validated path. Skills, textbook references, and the progress.md schema were all built around this case.

Create a new claude.ai project.
Paste the contents of project-prompt.md into the project-level instructions field (currently labeled "What are you trying to achieve?"). Replace bracketed placeholders (start date, deadline, route, etc.); leave the A2 / DUO references unchanged.

The instructions field is in the dialog that opens when you create the project (screenshot: docs/images/claude-ai-create-project-dialog.png).
Upload your textbook PDF (Nederlands in gang, Coutinho 3rd ed.) and any KNM source (Welkom in Nederland 4th ed., the official civic-knowledge textbook) to the project files.
Upload the four dutch-*.md files from skills/ (dutch-tutor-session.md, dutch-drill-generator.md, dutch-chapter-gate.md, dutch-progress-tracker.md) and the skills/language-tutor/ folder to the project files.
Copy progress.example.md to progress.md locally, fill in dates and targets, leave Target level: A2 and Exam system: inburgering as defaults, and upload it.
Start a new chat. Open with "Hallo" or similar. The orchestrator activates, reads progress.md, and picks the first session's focus. Cold recall will be light on session one (nothing to recall yet); most of the time goes to Chapter 1 introduction and the first substitution drill.

Track B: Dutch B1 DUO inburgering

Same skill set as Track A, set to B1. The structure generalizes, but a few pieces aren't fully calibrated yet: apart from the B1 frame examples, the drill examples in dutch-drill-generator.md are A2-flavored, and dutch-chapter-gate.md hardcodes the Nederlands in gang section structure. Expect to tell Claude which textbook you're using in the first session and watch for misreferences.

Create a new claude.ai project (separate from any A2 project you have).
Paste the contents of project-prompt.md into the project-level instructions field (currently labeled "What are you trying to achieve?"). Update bracketed placeholders, change "A2" to "B1" in the project title and exam-scope paragraph, and replace Nederlands in gang references with your B1 textbook (common options: Nederlands in actie, De finale).
Upload your B1 textbook PDF to the project files.
Upload the four dutch-*.md files from skills/ and the skills/language-tutor/ folder.
Copy progress.example.md to progress.md, set Target level: B1 in Current state, replace the chapter status table with your textbook's structure, fill in dates and targets, and upload.
Start a new chat in Dutch. The orchestrator reads Target level: B1 and the level-scoping block in dutch-drill-generator.md raises frame complexity (multi-clause, perfect/conditional tense, broader vocabulary). New B1-specific error patterns populate as sessions reveal them.

Track C: NT2 Staatsexamen (CvTE, B2 or C1)

NT2 doesn't use the dutch-* skills. It uses the polyglot language-tutor skill, which handles CEFR-aligned exam prep at B2 / C1 generally. No four-phase orchestration, no chapter gate, no persistent progress.md. The polyglot skill calibrates each session and offers a handoff block at the end you can paste into the next session for continuity. A dedicated dutch-staatsexamen-* skill set parallel to the dutch-* set is on the roadmap.

Create a new claude.ai project. Do not reuse a DUO project; the routing rules in project-prompt.md default to the dutch-* skills and would misroute NT2 sessions.

Paste a short orientation into the project-level instructions field (currently labeled "What are you trying to achieve?"). No need for the full project-prompt.md here:

This project supports preparation for the NT2 Staatsexamen, administered by CvTE.
Target program: [Programma I (B2) | Programma II (C1)].
Components: Lezen, Luisteren, Schrijven, Spreken (recorded, not live).
Native language: [your native language].
Current level estimate: [B1 | B2 | C1].
Goals and timeline: [your specifics].

Use the language-tutor skill. Calibrate first, then teach. Be honest about NT2-specific format uncertainty.

Upload your NT2 textbook(s) and any official CvTE practice materials to the project files.
Upload the skills/language-tutor/ folder (including the references/ subfolder). Do not upload the dutch-* files; they would misroute on Dutch input.
Start a new chat. The polyglot skill's opening protocol calibrates level and goals, then routes to the appropriate mode (conversation, grammar, writing feedback, exam prep). For exam-focused sessions, ask for "exam-prep mode for NT2 Programma [I / II]" to load exam-prep.md deliberately. Save the handoff block from the end of each session and paste it into your next session's opening message to preserve continuity.

Two systems, one repo

This repo integrates two language-tutoring systems with different design philosophies, working in parallel.

The Dutch exam-prep specialist (four skills, orchestrator pattern) is opinionated, stateful, and narrow. It assumes a specific learner working through a specific textbook toward a specific exam. It picks session focus instead of asking, enforces gates instead of advancing on request, and tracks errors across sessions with confidence scoring. The four skills only know how to do one thing well: run a structured exam-prep program.

The polyglot language-tutor (one mega-skill, 11 reference files, progressive-disclosure pattern) is adaptive, stateless, and broad. It handles any language at any level for any task: conversation, grammar explanation, vocabulary, writing feedback, pronunciation, lesson planning, exam-prep guidance. It calibrates to the learner before teaching and routes fluidly between modes within a session.

The two were built independently by different learners with different needs. The DUO-specialist system grew out of needing structured daily practice toward a hard exam deadline. The polyglot system grew out of wanting a generalist tutor that could move between languages and tasks without losing pedagogical quality. They turned out to compose well: the specialist handles the structured exam-prep flow, the generalist handles everything else.

Where they overlap on philosophy (don't over-correct, treat learner pushback seriously, never confabulate grammar rules), the polyglot skill's reference files are the deeper treatment. The Dutch-specific skills cross-reference them where relevant.

Routing between the two is governed by rules in project-prompt.md, with a default-to-exam-prep priority for ambiguous cases. Both skill descriptions also reference each other to mitigate trigger collision at the orchestrator level.
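The routing itself is done by Claude following the rules in project-prompt.md, not by code. Purely to make the decision order explicit, here is a sketch under stated assumptions: the intent labels ("exam_prep", "general", or `None` for ambiguous) and the `route` function are hypothetical, and the `exam_system` argument mirrors the Exam system field in progress.md.

```python
from typing import Optional

SPECIALIST = "dutch-tutor-session"  # orchestrator of the dutch-* skills
GENERALIST = "language-tutor"       # polyglot skill

def route(intent: Optional[str], exam_system: str = "inburgering") -> str:
    """Pick the entry skill for a request.

    intent is "exam_prep", "general", or None when the request is ambiguous
    (these labels are illustrative assumptions).
    """
    if exam_system == "staatsexamen":
        # NT2 is not supported by the dutch-* skills, so it goes to the generalist.
        return GENERALIST
    if intent == "general":
        return GENERALIST
    # Explicit exam prep, plus the default-to-exam-prep rule for ambiguous cases.
    return SPECIALIST

print(route(None))                       # dutch-tutor-session
print(route("general"))                  # language-tutor
print(route("exam_prep", "staatsexamen"))  # language-tutor
```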

Adapting to other languages

Caveat: this is the intended design. The Dutch DUO flow is the only configuration currently running daily; the steps below describe what the system is built to support for other exams, not what's been independently verified by a learner on a different exam.

The architecture is language-agnostic. To adapt for, say, German Goethe-Zertifikat or French DELF:

Replace textbook references in the skill files. Search for "Nederlands in gang" and replace with your textbook's name and section structure.
Replace exam references. Search for "DUO" and "inburgering"; replace with your exam authority and exam name.
Update the drill type catalog in dutch-drill-generator.md. The drill types (subject-verb conjugation, plural formation, articles, inversion, etc.) generalize across most European languages, but the frame examples and expected responses need translation.
Update the common errors catalog in dutch-drill-generator.md with errors typical to learners of your target language. Start empty if you'd rather; the system populates it through sessions.
Rename files for clarity (e.g., german-tutor-session.md). The orchestrator hand-offs reference filenames directly, so update those references too.
Update the project prompt at the top level: target language, exam authority, components, materials.
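The search steps above can be partly automated. The sketch below only lists where the Dutch-specific terms occur so you can review each one by hand; it does not replace anything. The `find_references` helper is a hypothetical name, and the `skills/` path follows the repo layout shown earlier.

```python
from pathlib import Path

# Terms the adaptation steps say to search for.
SEARCH_TERMS = ["Nederlands in gang", "DUO", "inburgering"]

def find_references(skills_dir, terms=SEARCH_TERMS):
    """Return (path, line_number, term, line) for each occurrence in *.md files."""
    hits = []
    for path in sorted(Path(skills_dir).rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            for term in terms:
                if term in line:
                    hits.append((str(path), number, term, line.strip()))
    return hits
```

Calling `find_references("skills")` from the repo root returns one tuple per match. Reviewing matches manually matters here because a bare "DUO" can appear in contexts (such as explanations of the exam body) that need rewriting rather than a one-for-one swap.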

The four-skill orchestrator architecture, the four-phase session, the substitution methodology, the gate protocol, and the state schema all transfer unchanged.

Collaboration workflow

Caveat: the workflow below is how two learners on the same Dutch exam are currently sharing improvements. It has not been stress-tested with a larger group or with learners on different exams.

For learners who want to share-and-fork with another learner on the same exam:

One learner maintains the main repo. The other forks or accepts a collaborator invite.
Each maintains their own personal progress.md (gitignored).
Skill improvements flow back as pull requests to main. Both learners pull updates.
Pedagogical insights captured in the Worked and Didn't work log fields can be aggregated across both learners' sessions when proposing skill edits.

This is how the system was built: one learner running it daily, the same architecture being adopted by a second learner on the same exam path. Convergence on improvements happens at the skill level; personal state stays local.

Extending the system

The skill sets in this repo are designed to accept additional skills alongside them.

To add a new skill, drop the SKILL.md (or skill directory) into skills/, upload it to the claude.ai project alongside the existing files, and Claude will route to it based on the skill's description frontmatter. Skills coexist. The orchestrator does not need to know about every skill in the project for them to activate.

Useful directions for extension:

Language-specific skill sets for other exams (Goethe-Zertifikat, DELE, JLPT, TOPIK) sitting alongside the Dutch ones, sharing the state-plus-gates pattern.
Component-specific skills for individual exam parts (Lezen drills, KNM flashcards, Spreken practice with timed prompts).
An oefenexamen integration skill that scores DUO practice exams under timed conditions.
Spaced-repetition vocabulary skills sitting alongside the substitution-drilling core.

The four Dutch skills assume a single learner working through one structured textbook toward one exam. The architecture (orchestrator, specialists, state file, session ritual) generalizes to other learners, languages, and exams without modification. Skills that follow that pattern slot in cleanly. Skills that don't can still coexist as long as their triggers don't collide with the existing skills' triggers; if collisions appear, the project prompt's routing rules are the place to add disambiguation.

If a skill addition stabilizes in your fork and would serve other learners, send a PR back.

What's next

Bigger directions on the roadmap include programmatic exercise generation (a planned downstream BMAD-framework app), a Claude Code variant with file-watcher ingestion of textbook PDFs, exam-day timed mode, and a spaced-repetition layer alongside the substitution-drilling core. See ROADMAP.md for the phased plan and what's tracked behind it.

License

MIT. See LICENSE.

Acknowledgments

Built for one learner's run at the Dutch A2 DUO inburgeringsexamen, shared with a friend whose partner is on the same path. The language-tutor skill was contributed by Kriselda Neri, extending the repo from a single-exam specialist into a dual-system architecture.

Methodological influences: Pimsleur's substitution method, cold-recall protocols from language teaching research, the BMAD framework for the application layer that may follow, and Krashen's comprehensible-input theory (i+1) for the polyglot skill's calibration approach.
