diff --git a/CHANGELOG.md b/CHANGELOG.md
index 2549542..8d911f7 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,12 @@ All notable changes to this project are documented here. The format is based on
 
 ## [Unreleased]
 
+### Changed
+- Rule metadata (id, severity, title, message template, description, rationale) is now loaded from
+  `rules/pfmea_control_plan_rules.yaml` instead of being hardcoded; the detection logic stays in
+  Python. Validation behaviour is unchanged (same finding types, severities, score and verdict),
+  verified by behaviour-parity tests. ([#1](https://github.com/migmcc/quality-docs-validator/issues/1))
+
 ## [0.2.0] - 2026-06-21
 
 ### Added
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index 1a2f88a..01b5459 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -70,12 +70,23 @@ Score starts at 100 and is reduced per finding: **critical −15**, **warning 
 conservative so warnings cannot dominate the verdict (false-positive protection, [DECISIONS.md](DECISIONS.md) D3).
 Full detail and the per-type rationale live in [FINDINGS.md](FINDINGS.md).
 
-## Rules: code vs. YAML
-For the MVP the six checks are implemented in `modules/pfmea_control_plan.py`. The
-`rules/pfmea_control_plan_rules.yaml` file is the single source of truth for each rule's **id and
-severity**, and a consistency test asserts the code and YAML never drift. Driving the check *logic*
-from YAML (a small rule-interpretation layer) is deferred to a later iteration — it was kept out of
-the hardening pass to avoid a rearchitecture.
+## Rules: metadata in YAML, evaluation in Python
+`rules/pfmea_control_plan_rules.yaml` is the **single source of truth for rule metadata** — each
+rule's `id`, `severity`, `title`, `message_template`, `description` and `rationale`. The loader
+(`rules.load_rule_specs()` / `parse_rule_specs()`) reads and validates it, failing clearly on a
+missing id, missing required field, invalid severity, duplicate id or an empty ruleset.
+
+The checker in `modules/pfmea_control_plan.py` reads that metadata — it builds each `Finding` with
+the severity and the formatted `message_template` from the YAML rather than hardcoding them. The
+deliberate split is:
+
+- **YAML → rule metadata** (what a rule is, how severe it is, how it reads).
+- **Python → rule evaluation** (the per-finding-type detection logic stays in the module).
+
+This is intentionally *not* a generic rule engine: the bespoke evaluation logic remains in code.
+A consistency test plus behaviour-parity tests (seeded example, clean case, warnings case) check
+that the YAML and the code stay in sync and that finding types, severities, count, score, verdict
+and the structure of the Markdown/JSON output are unchanged from v0.2.
 
 ## Known limitations (MVP)
 - **`.xlsx` only**; one worksheet is read per file (selectable by name via `--pfmea-sheet` /
diff --git a/docs/FINDINGS.md b/docs/FINDINGS.md
index 396abb6..4b9a66b 100644
--- a/docs/FINDINGS.md
+++ b/docs/FINDINGS.md
@@ -6,9 +6,11 @@ the rationale for each, and how findings are turned into a score and verdict.
 > All findings are **potential** inconsistencies for a human to judge. The tool makes no regulatory
 > or normative conformance claim and does not replace technical review.
 
-The authoritative metadata (id + severity) also lives in
-[`src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml`](../src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml);
-a test keeps the YAML and the code in sync. The checks themselves are implemented in
+The authoritative rule **metadata** (id, severity, title, message template, description, rationale)
+lives in
+[`rules/pfmea_control_plan_rules.yaml`](../src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml);
+the checker reads it instead of hardcoding these values, and consistency + parity tests keep
+the YAML and the code in sync. The **detection logic** for each finding type is implemented in
 [`modules/pfmea_control_plan.py`](../src/quality_docs_validator/modules/pfmea_control_plan.py).
 
 ## Matching
diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md
index 885ffc4..dcd2bfa 100644
--- a/docs/ROADMAP.md
+++ b/docs/ROADMAP.md
@@ -22,10 +22,23 @@ merged to `main`; the `v0.2.0` tag/release is a separate step.
 Still out of scope for v0.2: CSV input, configurable column mapping, HTML output, UI, AI, and any
 new document pairs.
 
-## v0.3+ — Rule engine & more modules (each independent of the core)
-- **YAML-driven rules** ([#1](https://github.com/migmcc/quality-docs-validator/issues/1)) — make the
-  YAML the source of the rule *logic*, not just its documented metadata. Moved out of v0.2 because it
-  is a rule-engine refactor and must keep exact finding-type parity.
+## v0.3 — planned (YAML rules as source of truth)
+Tracked under the [v0.3 milestone](https://github.com/migmcc/quality-docs-validator/milestone/2).
+Deliberately small and low-risk — **no new features, no behaviour change**:
+
+- **YAML-driven rule *metadata*** ([#1](https://github.com/migmcc/quality-docs-validator/issues/1)) —
+  make `rules/pfmea_control_plan_rules.yaml` the single source of truth for each rule's **id,
+  severity, title/message template, description and rationale**, and have
+  `modules/pfmea_control_plan.py` read that metadata instead of hardcoding it. The **evaluation
+  logic stays in Python**; we are *not* building a generic rule engine.
+- **Parity tests** — prove the synthetic examples, a clean case and a warnings case produce the
+  exact same finding types, severities, count, score and verdict as v0.2 (Markdown + JSON unchanged).
+- **Rule documentation** generated/kept in sync from the YAML metadata.
+
+Out of scope for v0.3: new document pairs, CSV, configurable mapping, UI, AI, PyPI, new scoring,
+fuzzy matching, JSON-schema changes, and any change to the finding types.
+
+## v0.4+ — More modules (each independent of the core)
 - Process Flow ↔ PFMEA consistency.
 - Control Plan ↔ Work Instructions.
 - PPAP gap check.
diff --git a/src/quality_docs_validator/modules/pfmea_control_plan.py b/src/quality_docs_validator/modules/pfmea_control_plan.py
index 8c1876e..d1ce264 100644
--- a/src/quality_docs_validator/modules/pfmea_control_plan.py
+++ b/src/quality_docs_validator/modules/pfmea_control_plan.py
@@ -1,10 +1,14 @@
 """PFMEA <-> Control Plan consistency checker (MVP module).
 
-This is the single module shipped in v0.1.0. It parses both documents, matches rows by operation,
-applies the explicit checks below and returns a scored `ValidationResult`. Each check is intentionally
-simple and documented; the tool surfaces *potential* findings for a human to judge.
+This is the single module shipped in v0.1. It parses both documents, matches rows by operation,
+applies the explicit checks below and returns a scored `ValidationResult`. Each check is
+intentionally simple and documented; the tool surfaces *potential* findings for a human to judge.
 
-Finding types implemented:
+Rule **metadata** (severity + message template) is loaded from
+`rules/pfmea_control_plan_rules.yaml`; the **detection logic** for each finding type stays in this
+module (this is not a generic rule engine).
+
+Finding types:
 - UNMATCHED_PROCESS_STEP              (warning)  operation present in one document only
 - MISSING_CONTROL                     (critical) matched operation has no control method
 - SPECIAL_CHARACTERISTIC_NOT_CONTROLLED (critical) PFMEA special char not marked in Control Plan
@@ -21,13 +25,9 @@
 from ..core.matching import MatchResult, match_rows
 from ..models import ControlPlanRow, Finding, PFMEARow, ValidationResult
 from ..parsers.excel import parse_control_plan, parse_pfmea
+from ..rules import load_rule_specs
 
 HIGH_SEVERITY_THRESHOLD = 8
-
-# Phrases that indicate a subjective / low-reliability detection method. Kept deliberately
-# specific (phrases, not bare words like "operator" or "manual") to limit false positives:
-# e.g. "manual gauge" or "operator runs CMM" are NOT weak, but "manual inspection" is.
-# Because these checks are the most false-positive-prone, both rules they feed are WARNINGS (D3).
 WEAK_METHOD_KEYWORDS = (
     "visual",
     "by eye",
@@ -39,6 +39,20 @@
     "manual check",
 )
 
+# Rule metadata (severity + message template) is the single source of truth in the YAML.
+_RULES = load_rule_specs()
+
+
+def _make(rule_id: str, op: str, **context: object) -> Finding:
+    """Build a Finding using the YAML metadata for severity and the message template."""
+    spec = _RULES[rule_id]
+    return Finding(
+        finding_type=rule_id,
+        level=spec["severity"],
+        operation_id=op,
+        message=spec["message_template"].format(op=op, **context),
+    )
+
 
 def _is_weak_method(method: str | None) -> bool:
     if not method:
@@ -68,69 +82,21 @@ def _check_operation(
     weak = any(_is_weak_method(m) for m in control_methods)
 
     if not has_control:
-        findings.append(
-            Finding(
-                finding_type="MISSING_CONTROL",
-                level="critical",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} has PFMEA failure mode(s) but no control method "
-                    f"in the Control Plan."
-                ),
-            )
-        )
+        findings.append(_make("MISSING_CONTROL", op_label))
 
     if pf_special and not cp_special:
-        findings.append(
-            Finding(
-                finding_type="SPECIAL_CHARACTERISTIC_NOT_CONTROLLED",
-                level="critical",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} is flagged as a special characteristic in the PFMEA "
-                    f"but is not marked/controlled as special in the Control Plan."
-                ),
-            )
-        )
+        findings.append(_make("SPECIAL_CHARACTERISTIC_NOT_CONTROLLED", op_label))
 
     if has_control and not has_reaction and max_sev is not None and max_sev >= HIGH_SEVERITY_THRESHOLD:
-        findings.append(
-            Finding(
-                finding_type="MISSING_REACTION_PLAN",
-                level="critical",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} has a high-severity failure mode (S={max_sev}) "
-                    f"but the Control Plan control has no reaction plan."
-                ),
-            )
-        )
+        findings.append(_make("MISSING_REACTION_PLAN", op_label, severity=max_sev))
 
     if weak:
         findings.append(
-            Finding(
-                finding_type="WEAK_DETECTION_METHOD",
-                level="warning",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} relies on a weak detection method "
-                    f"({', '.join(control_methods)})."
-                ),
-            )
+            _make("WEAK_DETECTION_METHOD", op_label, methods=", ".join(control_methods))
         )
 
     if weak and max_sev is not None and max_sev >= HIGH_SEVERITY_THRESHOLD:
-        findings.append(
-            Finding(
-                finding_type="HIGH_SEVERITY_WEAK_CONTROL",
-                level="warning",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} has a high-severity failure mode (S={max_sev}) "
-                    f"paired with a weak control method."
-                ),
-            )
-        )
+        findings.append(_make("HIGH_SEVERITY_WEAK_CONTROL", op_label, severity=max_sev))
 
     return findings
 
@@ -142,22 +108,12 @@ def evaluate(match: MatchResult) -> list[Finding]:
     for key in match.pfmea_only_ops:
         op = match.display_op(key)
         findings.append(
-            Finding(
-                finding_type="UNMATCHED_PROCESS_STEP",
-                level="warning",
-                operation_id=op,
-                message=f"PFMEA operation {op} has no matching row in the Control Plan.",
-            )
+            _make("UNMATCHED_PROCESS_STEP", op, source="PFMEA", target="Control Plan")
         )
     for key in match.control_only_ops:
         op = match.display_op(key)
         findings.append(
-            Finding(
-                finding_type="UNMATCHED_PROCESS_STEP",
-                level="warning",
-                operation_id=op,
-                message=f"Control Plan operation {op} has no matching row in the PFMEA.",
-            )
+            _make("UNMATCHED_PROCESS_STEP", op, source="Control Plan", target="PFMEA")
         )
 
     for key in match.matched_ops:
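
Reviewer note on the `_make` helper in this file: the sketch below is a minimal, self-contained illustration of the template-filling step, not the module's actual code. `RULES` and `render` are hypothetical stand-ins for the loaded YAML metadata and for `_make`; the two templates are copied from the YAML in this diff. It suggests that `str.format` on the YAML template reproduces the v0.2 f-string wording, and that a template placeholder with no matching keyword argument surfaces as a `KeyError`.

```python
# Hypothetical stand-in for two entries of the loaded rule metadata.
RULES = {
    "UNMATCHED_PROCESS_STEP": {
        "severity": "warning",
        "message_template": "{source} operation {op} has no matching row in the {target}.",
    },
    "MISSING_REACTION_PLAN": {
        "severity": "critical",
        "message_template": (
            "Operation {op} has a high-severity failure mode (S={severity}) "
            "but the Control Plan control has no reaction plan."
        ),
    },
}


def render(rule_id: str, op: str, **context: object) -> tuple[str, str]:
    """Return (severity, message) for a rule, mirroring the shape of _make."""
    spec = RULES[rule_id]
    return spec["severity"], spec["message_template"].format(op=op, **context)


level, message = render("UNMATCHED_PROCESS_STEP", "40", source="PFMEA", target="Control Plan")
print(level, "|", message)

# A placeholder with no matching keyword argument raises KeyError.
try:
    render("MISSING_REACTION_PLAN", "20")
except KeyError as exc:
    print("missing placeholder:", exc)
```

One consequence of this design, under the same assumptions: a typo in a YAML placeholder name would only fail when the affected finding is first built, so a test that formats every template with its documented placeholders may be worth considering.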
diff --git a/src/quality_docs_validator/rules/__init__.py b/src/quality_docs_validator/rules/__init__.py
index 31c8b63..720e20d 100644
--- a/src/quality_docs_validator/rules/__init__.py
+++ b/src/quality_docs_validator/rules/__init__.py
@@ -1,10 +1,9 @@
 """Rule definitions (YAML) for the validation modules.
 
-Packaged so rule files ship with the wheel. In the MVP the checks are implemented in
-``modules/pfmea_control_plan.py``; this YAML documents each finding type (id, severity,
-description) and is the single source of truth for that metadata. A consistency test asserts the
-code and the YAML never drift. Making the module *consume* this YAML to drive logic is deferred
-to a later iteration (it requires a small rule-interpretation layer).
+Packaged so rule files ship with the wheel. The YAML is the **single source of truth for rule
+metadata** — id, severity, title, message_template, description and rationale. The checker in
+``modules/pfmea_control_plan.py`` reads this metadata instead of hardcoding it, while the
+per-finding-type detection logic stays in Python (this is intentionally not a generic rule engine).
 """
 
 from __future__ import annotations
@@ -15,18 +14,49 @@
 
 _RULES_FILE = "pfmea_control_plan_rules.yaml"
 
+VALID_SEVERITIES = {"critical", "warning"}
+REQUIRED_FIELDS = ("severity", "message_template", "description")
 
-def load_rule_specs() -> dict[str, dict]:
-    """Load the documented rules as ``{rule_id: {"severity": ..., "description": ...}}``."""
-    text = resources.files(__package__).joinpath(_RULES_FILE).read_text(encoding="utf-8")
-    data = yaml.safe_load(text) or {}
+
+class RuleSpecError(ValueError):
+    """Raised when the rule metadata file is malformed."""
+
+
+def parse_rule_specs(data: dict) -> dict[str, dict]:
+    """Validate parsed YAML and return ``{rule_id: {severity, title, message_template, ...}}``.
+
+    Raises RuleSpecError on a missing id or field, invalid severity, duplicate id, or empty ruleset.
+    """
     specs: dict[str, dict] = {}
-    for rule in data.get("rules", []):
-        rule_id = rule.get("id")
-        if rule_id:
-            specs[rule_id] = {
-                "severity": rule.get("severity"),
-                "description": (rule.get("description") or "").strip(),
-            }
+    for rule in (data or {}).get("rules", []):
+        rule_id = str(rule.get("id") or "").strip()
+        if not rule_id:
+            raise RuleSpecError("Rule with a missing or empty 'id'.")
+        if rule_id in specs:
+            raise RuleSpecError(f"Duplicate rule id: '{rule_id}'.")
+        for field in REQUIRED_FIELDS:
+            value = rule.get(field)
+            if value is None or (isinstance(value, str) and not value.strip()):
+                raise RuleSpecError(f"Rule '{rule_id}' is missing required field '{field}'.")
+        severity = rule["severity"]
+        if severity not in VALID_SEVERITIES:
+            raise RuleSpecError(
+                f"Rule '{rule_id}' has invalid severity '{severity}' "
+                f"(expected one of {sorted(VALID_SEVERITIES)})."
+            )
+        specs[rule_id] = {
+            "severity": severity,
+            "title": (rule.get("title") or "").strip(),
+            "message_template": " ".join(str(rule["message_template"]).split()),
+            "description": " ".join(str(rule["description"]).split()),
+            "rationale": " ".join(str(rule.get("rationale") or "").split()),
+        }
+    if not specs:
+        raise RuleSpecError("No rules found in the rule metadata file.")
     return specs
 
+
+def load_rule_specs() -> dict[str, dict]:
+    """Load and validate the rule metadata from the packaged YAML file."""
+    text = resources.files(__package__).joinpath(_RULES_FILE).read_text(encoding="utf-8")
+    return parse_rule_specs(yaml.safe_load(text))
diff --git a/src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml b/src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml
index 9dd8807..92cf580 100644
--- a/src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml
+++ b/src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml
@@ -1,45 +1,89 @@
 # PFMEA <-> Control Plan validation rules.
 #
-# These six finding types are IMPLEMENTED in src/quality_docs_validator/modules/pfmea_control_plan.py.
-# For now this file documents them (id, severity, description); a later iteration will make the
-# module consume this YAML directly so checks can be edited without touching Python.
+# This file is the SINGLE SOURCE OF TRUTH for each rule's metadata: id, severity, title,
+# message_template and description/rationale. The detection logic itself lives in
+# src/quality_docs_validator/modules/pfmea_control_plan.py, which reads this metadata
+# (severity + message_template) instead of hardcoding it. This is intentionally NOT a generic
+# rule engine — the per-finding-type checks stay in Python.
+#
+# message_template placeholders (filled by the checker, per rule):
+#   {op}        operation id (all rules)
+#   {source}    "PFMEA" / "Control Plan"        (UNMATCHED_PROCESS_STEP)
+#   {target}    counterpart document             (UNMATCHED_PROCESS_STEP)
+#   {severity}  max PFMEA severity for the op    (MISSING_REACTION_PLAN, HIGH_SEVERITY_WEAK_CONTROL)
+#   {methods}   weak control method text         (WEAK_DETECTION_METHOD)
 #
 # Severity classification keeps the riskiest, false-positive-prone checks as WARNINGS (DECISIONS.md D3).
 
-version: 0
+version: 1
 
 rules:
-  - id: MISSING_CONTROL
-    severity: critical
-    description: >
-      A PFMEA failure mode / cause has no corresponding control in the Control Plan
-      for the matched operation.
-
   - id: UNMATCHED_PROCESS_STEP
     severity: warning
+    title: Unmatched process step
+    message_template: "{source} operation {op} has no matching row in the {target}."
     description: >
       A process step / operation_id present in one document has no counterpart in the other
       (surfaced rather than silently dropped).
+    rationale: >
+      A step planned in one document but absent from the other is a coverage gap, but it is also a
+      common artefact of differing numbering, so it is a warning rather than a failure.
+
+  - id: MISSING_CONTROL
+    severity: critical
+    title: Missing control
+    message_template: >-
+      Operation {op} has PFMEA failure mode(s) but no control method in the Control Plan.
+    description: >
+      A PFMEA failure mode / cause has no corresponding control in the Control Plan
+      for the matched operation.
+    rationale: >
+      A failure mode with no control is exactly the kind of gap audits and field escapes punish.
 
   - id: SPECIAL_CHARACTERISTIC_NOT_CONTROLLED
     severity: critical
+    title: Special characteristic not controlled
+    message_template: >-
+      Operation {op} is flagged as a special characteristic in the PFMEA but is not
+      marked/controlled as special in the Control Plan.
     description: >
       A special characteristic identified in the PFMEA is not present / not controlled in the
       Control Plan's inspection plan.
+    rationale: >
+      Special characteristics carry mandatory control expectations; a mismatch is high-risk.
 
   - id: MISSING_REACTION_PLAN
     severity: critical
+    title: Missing reaction plan
+    message_template: >-
+      Operation {op} has a high-severity failure mode (S={severity}) but the Control Plan
+      control has no reaction plan.
     description: >
       A Control Plan control lacks a reaction plan where the matched PFMEA risk is high.
+    rationale: >
+      When risk is highest, the absence of a documented reaction plan is a serious gap.
 
   - id: WEAK_DETECTION_METHOD
     severity: warning   # warning by decision D3 (false-positive-prone)
+    title: Weak detection method
+    message_template: "Operation {op} relies on a weak detection method ({methods})."
     description: >
-      The Control Plan detection method appears weak relative to the PFMEA detection ranking.
+      The Control Plan detection method appears weak (e.g. visual / manual inspection) relative to
+      the PFMEA detection ranking.
+    rationale: >
+      Weak detection lets defects through, but the judgement is heuristic and template-sensitive,
+      so it ships as a warning.
 
   - id: HIGH_SEVERITY_WEAK_CONTROL
     severity: warning   # warning by decision D3 (false-positive-prone)
+    title: High severity paired with weak control
+    message_template: >-
+      Operation {op} has a high-severity failure mode (S={severity}) paired with a weak
+      control method.
     description: >
       A high-severity failure mode is paired with a control judged weak for that severity.
+    rationale: >
+      High severity + weak control is a priority to revisit; kept a warning for the same
+      false-positive reason.
 
 # Further finding types (to reach the 11 total) are defined as the engine matures.
diff --git a/tests/test_parity.py b/tests/test_parity.py
new file mode 100644
index 0000000..39b2cc0
--- /dev/null
+++ b/tests/test_parity.py
@@ -0,0 +1,95 @@
+"""Behaviour-parity tests for the YAML-metadata refactor (issue #1).
+
+These lock the v0.2 behaviour: finding types, severities, count, score, verdict and the JSON
+summary for three representative cases. They are written *before* the refactor and must keep
+passing after it, guaranteeing no regression. They assert structure (not exact message wording),
+so messages may be kept identical or improved.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+from quality_docs_validator.core.report import build_report_data, render_markdown
+from quality_docs_validator.modules.pfmea_control_plan import check_files
+
+# Deterministic finding order produced by `evaluate` for the seeded examples.
+EXPECTED_EXAMPLE_FINDINGS = [
+    ("UNMATCHED_PROCESS_STEP", "warning", "40"),
+    ("SPECIAL_CHARACTERISTIC_NOT_CONTROLLED", "critical", "20"),
+    ("MISSING_REACTION_PLAN", "critical", "20"),
+    ("WEAK_DETECTION_METHOD", "warning", "20"),
+    ("HIGH_SEVERITY_WEAK_CONTROL", "warning", "20"),
+    ("MISSING_CONTROL", "critical", "30"),
+]
+
+
+def test_parity_seeded_example(example_files) -> None:
+    pfmea, control_plan = example_files
+    result = check_files(pfmea, control_plan)
+    assert result.verdict == "FAIL"
+    assert result.score == 43
+    assert len(result.findings) == 6
+    assert result.critical_count == 3
+    assert result.warning_count == 3
+    actual = [(f.finding_type, f.level, f.operation_id) for f in result.findings]
+    assert actual == EXPECTED_EXAMPLE_FINDINGS
+    # Every message is non-empty and references its operation.
+    for f in result.findings:
+        assert f.message
+        assert f.operation_id in f.message
+
+    # JSON summary parity.
+    data = build_report_data(result, "pfmea.xlsx", "control-plan.xlsx")
+    assert data["verdict"] == "FAIL"
+    assert data["score"] == 43
+    assert data["summary"]["total"] == 6
+    assert data["summary"]["by_severity"] == {"critical": 3, "warning": 3}
+    assert sum(data["summary"]["by_type"].values()) == 6
+
+    # Markdown still carries the essential elements.
+    md = render_markdown(result, "pfmea.xlsx", "control-plan.xlsx")
+    assert "Validation Report" in md
+    assert "FAIL" in md
+    for finding_type, _level, _op in EXPECTED_EXAMPLE_FINDINGS:
+        assert finding_type in md
+    assert "not a substitute for human" in md.lower()
+
+
+def test_parity_clean_case(make_xlsx, tmp_path: Path) -> None:
+    pfmea = make_xlsx(
+        tmp_path / "pf.xlsx",
+        ["Operation ID", "Failure Mode", "Severity", "Special Characteristic"],
+        [["10", "Leak", 6, "No"]],
+    )
+    cp = make_xlsx(
+        tmp_path / "cp.xlsx",
+        ["Operation ID", "Control Method", "Reaction Plan", "Special Characteristic"],
+        [["10", "Pressure gauge test", "Stop and rework", "No"]],
+    )
+    result = check_files(pfmea, cp)
+    assert result.verdict == "PASS"
+    assert result.score == 100
+    assert result.findings == []
+
+
+def test_parity_warnings_case(make_xlsx, tmp_path: Path) -> None:
+    # Low-severity op with a weak (visual) control + a reaction plan -> a single warning.
+    pfmea = make_xlsx(
+        tmp_path / "pf.xlsx",
+        ["Operation ID", "Failure Mode", "Severity", "Special Characteristic"],
+        [["10", "Scratch", 5, "No"]],
+    )
+    cp = make_xlsx(
+        tmp_path / "cp.xlsx",
+        ["Operation ID", "Control Method", "Reaction Plan", "Special Characteristic"],
+        [["10", "Visual inspection", "Rework", "No"]],
+    )
+    result = check_files(pfmea, cp)
+    assert result.verdict == "PASS-WITH-WARNINGS"
+    assert result.critical_count == 0
+    assert result.warning_count == 1
+    assert len(result.findings) == 1
+    assert result.findings[0].finding_type == "WEAK_DETECTION_METHOD"
+    assert result.findings[0].level == "warning"
+    assert result.score == 96
diff --git a/tests/test_rules_consistency.py b/tests/test_rules_consistency.py
index 8ba1d11..9563842 100644
--- a/tests/test_rules_consistency.py
+++ b/tests/test_rules_consistency.py
@@ -1,9 +1,15 @@
-"""Ensure the documented rules (YAML) and the implemented checks (code) never drift."""
+"""Ensure the rule metadata (YAML) and the implemented checks (code) never drift."""
 
 from __future__ import annotations
 
-from quality_docs_validator.modules.pfmea_control_plan import check_files
-from quality_docs_validator.rules import load_rule_specs
+import pytest
+
+from quality_docs_validator.modules.pfmea_control_plan import _RULES, check_files
+from quality_docs_validator.rules import (
+    RuleSpecError,
+    load_rule_specs,
+    parse_rule_specs,
+)
 
 EXPECTED_RULE_IDS = {
     "UNMATCHED_PROCESS_STEP",
@@ -20,9 +26,51 @@ def test_yaml_documents_exactly_the_known_rules() -> None:
     assert set(specs) == EXPECTED_RULE_IDS
     for rule_id, spec in specs.items():
         assert spec["severity"] in {"critical", "warning"}, rule_id
+        assert spec["message_template"], rule_id
         assert spec["description"], rule_id
 
 
+def test_every_rule_id_is_used_by_the_checker() -> None:
+    # _RULES (the metadata the checker loads) must hold exactly the documented rule ids.
+    assert set(_RULES) == EXPECTED_RULE_IDS
+
+
+def test_loader_rejects_invalid_severity() -> None:
+    with pytest.raises(RuleSpecError, match="invalid severity"):
+        parse_rule_specs(
+            {"rules": [{"id": "X", "severity": "blocker", "message_template": "m", "description": "d"}]}
+        )
+
+
+def test_loader_rejects_duplicate_ids() -> None:
+    with pytest.raises(RuleSpecError, match="Duplicate rule id"):
+        parse_rule_specs(
+            {
+                "rules": [
+                    {"id": "X", "severity": "warning", "message_template": "m", "description": "d"},
+                    {"id": "X", "severity": "critical", "message_template": "m", "description": "d"},
+                ]
+            }
+        )
+
+
+def test_loader_rejects_missing_required_field() -> None:
+    with pytest.raises(RuleSpecError, match="missing required field 'severity'"):
+        parse_rule_specs({"rules": [{"id": "X", "message_template": "m", "description": "d"}]})
+
+
+def test_loader_rejects_missing_id() -> None:
+    with pytest.raises(RuleSpecError, match="missing or empty 'id'"):
+        parse_rule_specs(
+            {"rules": [{"severity": "warning", "message_template": "m", "description": "d"}]}
+        )
+
+
+def test_loader_rejects_empty_ruleset() -> None:
+    with pytest.raises(RuleSpecError, match="No rules"):
+        parse_rule_specs({"rules": []})
+
+
 def test_warning_rules_stay_warnings() -> None:
     # D3: these two false-positive-prone checks must remain warnings, not hard-fails.
     specs = load_rule_specs()