Merged
6 changes: 6 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -6,6 +6,12 @@ All notable changes to this project are documented here. The format is based on

 ## [Unreleased]

+### Changed
+- Rule metadata (id, severity, title, message template, description, rationale) is now loaded from
+  `rules/pfmea_control_plan_rules.yaml` instead of being hardcoded; the detection logic stays in
+  Python. Validation behaviour is unchanged (same finding types, severities, score and verdict),
+  verified by behaviour-parity tests. ([#1](https://github.com/migmcc/quality-docs-validator/issues/1))
+
 ## [0.2.0] - 2026-06-21

 ### Added
Expand Down
23 changes: 17 additions & 6 deletions docs/ARCHITECTURE.md
@@ -70,12 +70,23 @@ Score starts at 100 and is reduced per finding: **critical −15**, **warning
 conservative so warnings cannot dominate the verdict (false-positive protection, [DECISIONS.md](DECISIONS.md) D3).
 Full detail and the per-type rationale live in [FINDINGS.md](FINDINGS.md).

-## Rules: code vs. YAML
-For the MVP the six checks are implemented in `modules/pfmea_control_plan.py`. The
-`rules/pfmea_control_plan_rules.yaml` file is the single source of truth for each rule's **id and
-severity**, and a consistency test asserts the code and YAML never drift. Driving the check *logic*
-from YAML (a small rule-interpretation layer) is deferred to a later iteration — it was kept out of
-the hardening pass to avoid a rearchitecture.
+## Rules: metadata in YAML, evaluation in Python
+`rules/pfmea_control_plan_rules.yaml` is the **single source of truth for rule metadata** — each
+rule's `id`, `severity`, `title`, `message_template`, `description` and `rationale`. The loader
+(`rules.load_rule_specs()` / `parse_rule_specs()`) reads and validates it, failing clearly on a
+missing id, missing required field, invalid severity, duplicate id or an empty ruleset.
+
+The checker in `modules/pfmea_control_plan.py` reads that metadata — it builds each `Finding` with
+the severity and the formatted `message_template` from the YAML rather than hardcoding them. The
+deliberate split is:
+
+- **YAML → rule metadata** (what a rule is, how severe it is, how it reads).
+- **Python → rule evaluation** (the per-finding-type detection logic stays in the module).
+
+This is intentionally *not* a generic rule engine: the bespoke evaluation logic remains in code.
+A consistency test plus behaviour-parity tests (seeded example, clean case, warnings case) ensure
+the YAML and the code never drift and that finding types, severities, count, score, verdict, and
+the Markdown/JSON output are unchanged from v0.2.

 ## Known limitations (MVP)
 - **`.xlsx` only**; one worksheet is read per file (selectable by name via `--pfmea-sheet` /
Expand Down
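The consistency test mentioned in the ARCHITECTURE.md hunk above is not part of this diff. A minimal sketch of what such a drift check could look like, assuming the loaded metadata is a plain dict keyed by rule id and the checker's emitted finding types are available as a set. `EMITTED_RULE_IDS` and `find_drift` are hypothetical names used only for illustration; the six ids are the finding types listed in the module docstring.

```python
# Hypothetical stand-in for the finding types the Python checker can emit.
EMITTED_RULE_IDS = {
    "UNMATCHED_PROCESS_STEP",
    "MISSING_CONTROL",
    "SPECIAL_CHARACTERISTIC_NOT_CONTROLLED",
    "MISSING_REACTION_PLAN",
    "WEAK_DETECTION_METHOD",
    "HIGH_SEVERITY_WEAK_CONTROL",
}


def find_drift(rule_specs: dict[str, dict], emitted_ids: set[str]) -> dict[str, set[str]]:
    """Return the ids that exist on only one side; two empty sets mean no drift."""
    yaml_ids = set(rule_specs)
    return {
        "missing_in_yaml": emitted_ids - yaml_ids,  # code emits it, YAML does not describe it
        "unused_in_code": yaml_ids - emitted_ids,   # YAML describes it, code never emits it
    }


# Simulated metadata with one rule deliberately left out to show a detected drift.
rule_specs = {
    rule_id: {"severity": "warning"}
    for rule_id in EMITTED_RULE_IDS
    if rule_id != "MISSING_CONTROL"
}
drift = find_drift(rule_specs, EMITTED_RULE_IDS)
```

In a real test the two inputs would come from the loader and from the checker, and the assertion would be that both sets are empty.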
8 changes: 5 additions & 3 deletions docs/FINDINGS.md
@@ -6,9 +6,11 @@ the rationale for each, and how findings are turned into a score and verdict.
 > All findings are **potential** inconsistencies for a human to judge. The tool makes no regulatory
 > or normative conformance claim and does not replace technical review.

-The authoritative metadata (id + severity) also lives in
-[`src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml`](../src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml);
-a test keeps the YAML and the code in sync. The checks themselves are implemented in
+The authoritative rule **metadata** (id, severity, title, message template, description, rationale)
+is the
+[`rules/pfmea_control_plan_rules.yaml`](../src/quality_docs_validator/rules/pfmea_control_plan_rules.yaml)
+file; the checker reads it instead of hardcoding these values, and consistency + parity tests keep
+the YAML and the code in sync. The **detection logic** for each finding type is implemented in
 [`modules/pfmea_control_plan.py`](../src/quality_docs_validator/modules/pfmea_control_plan.py).

 ## Matching
Expand Down
21 changes: 17 additions & 4 deletions docs/ROADMAP.md
@@ -22,10 +22,23 @@ merged to `main`; the `v0.2.0` tag/release is a separate step.
 Still out of scope for v0.2: CSV input, configurable column mapping, HTML output, UI, AI, and any
 new document pairs.

-## v0.3+ — Rule engine & more modules (each independent of the core)
-- **YAML-driven rules** ([#1](https://github.com/migmcc/quality-docs-validator/issues/1)) — make the
-  YAML the source of the rule *logic*, not just its documented metadata. Moved out of v0.2 because it
-  is a rule-engine refactor and must keep exact finding-type parity.
+## v0.3 — planned (YAML rules as source of truth)
+Tracked under the [v0.3 milestone](https://github.com/migmcc/quality-docs-validator/milestone/2).
+Deliberately small and low-risk — **no new features, no behaviour change**:
+
+- **YAML-driven rule *metadata*** ([#1](https://github.com/migmcc/quality-docs-validator/issues/1)) —
+  make `rules/pfmea_control_plan_rules.yaml` the single source of truth for each rule's **id,
+  severity, title/message template, description and rationale**, and have
+  `modules/pfmea_control_plan.py` read that metadata instead of hardcoding it. The **evaluation
+  logic stays in Python**; we are *not* building a generic rule engine.
+- **Parity tests** — prove the synthetic examples, a clean case and a warnings case produce the
+  exact same finding types, severities, count, score and verdict as v0.2 (Markdown + JSON unchanged).
+- **Rule documentation** generated/kept in sync from the YAML metadata.
+
+Out of scope for v0.3: new document pairs, CSV, configurable mapping, UI, AI, PyPI, new scoring,
+fuzzy matching, JSON-schema changes, and any change to the finding types.
+
+## v0.4+ — More modules (each independent of the core)
 - Process Flow ↔ PFMEA consistency.
 - Control Plan ↔ Work Instructions.
 - PPAP gap check.
Expand Down
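The parity tests listed in the v0.3 plan above are not shown in this diff. A minimal sketch of the idea, assuming findings carry the four fields the checker sets (`finding_type`, `level`, `operation_id`, `message`). The `Finding` class here is a hypothetical stand-in, not the project's model, and the operation ids and messages are made up. Result order is ignored in this sketch, although the real Markdown/JSON comparison may be order-sensitive.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical stand-in with the four fields the checker populates."""

    finding_type: str
    level: str
    operation_id: str
    message: str


def summarize(findings: list[Finding]) -> Counter:
    """Order-independent summary: how many times each exact finding occurs."""
    return Counter(findings)


def is_parity(baseline: list[Finding], current: list[Finding]) -> bool:
    """True when both runs produced the same findings, including severity and wording."""
    return summarize(baseline) == summarize(current)


# Made-up baseline, standing in for a recorded v0.2 result.
baseline = [
    Finding("MISSING_CONTROL", "critical", "OP20", "example message A"),
    Finding("WEAK_DETECTION_METHOD", "warning", "OP30", "example message B"),
]
same_reordered = list(reversed(baseline))
severity_changed = [
    baseline[0],
    Finding("WEAK_DETECTION_METHOD", "critical", "OP30", "example message B"),
]
```

Comparing whole findings rather than only their types means a changed severity or a reworded message breaks parity, which matches the "no behaviour change" goal.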
104 changes: 30 additions & 74 deletions src/quality_docs_validator/modules/pfmea_control_plan.py
@@ -1,10 +1,14 @@
 """PFMEA <-> Control Plan consistency checker (MVP module).

-This is the single module shipped in v0.1.0. It parses both documents, matches rows by operation,
-applies the explicit checks below and returns a scored `ValidationResult`. Each check is intentionally
-simple and documented; the tool surfaces *potential* findings for a human to judge.
+This is the single module shipped in v0.1. It parses both documents, matches rows by operation,
+applies the explicit checks below and returns a scored `ValidationResult`. Each check is
+intentionally simple and documented; the tool surfaces *potential* findings for a human to judge.

-Finding types implemented:
+Rule **metadata** (severity + message template) is loaded from
+`rules/pfmea_control_plan_rules.yaml`; the **detection logic** for each finding type stays in this
+module (this is not a generic rule engine).
+
+Finding types:
 - UNMATCHED_PROCESS_STEP (warning) operation present in one document only
 - MISSING_CONTROL (critical) matched operation has no control method
 - SPECIAL_CHARACTERISTIC_NOT_CONTROLLED (critical) PFMEA special char not marked in Control Plan
@@ -21,13 +25,9 @@
 from ..core.matching import MatchResult, match_rows
 from ..models import ControlPlanRow, Finding, PFMEARow, ValidationResult
 from ..parsers.excel import parse_control_plan, parse_pfmea
+from ..rules import load_rule_specs

 HIGH_SEVERITY_THRESHOLD = 8
-
-# Phrases that indicate a subjective / low-reliability detection method. Kept deliberately
-# specific (phrases, not bare words like "operator" or "manual") to limit false positives:
-# e.g. "manual gauge" or "operator runs CMM" are NOT weak, but "manual inspection" is.
-# Because these checks are the most false-positive-prone, both rules they feed are WARNINGS (D3).
 WEAK_METHOD_KEYWORDS = (
     "visual",
     "by eye",
@@ -39,6 +39,20 @@
     "manual check",
 )

+# Rule metadata (severity + message template) is the single source of truth in the YAML.
+_RULES = load_rule_specs()
+
+
+def _make(rule_id: str, op: str, **context: object) -> Finding:
+    """Build a Finding using the YAML metadata for severity and the message template."""
+    spec = _RULES[rule_id]
+    return Finding(
+        finding_type=rule_id,
+        level=spec["severity"],
+        operation_id=op,
+        message=spec["message_template"].format(op=op, **context),
+    )
+

 def _is_weak_method(method: str | None) -> bool:
     if not method:
@@ -68,69 +82,21 @@ def _check_operation(
     weak = any(_is_weak_method(m) for m in control_methods)

     if not has_control:
-        findings.append(
-            Finding(
-                finding_type="MISSING_CONTROL",
-                level="critical",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} has PFMEA failure mode(s) but no control method "
-                    f"in the Control Plan."
-                ),
-            )
-        )
+        findings.append(_make("MISSING_CONTROL", op_label))

     if pf_special and not cp_special:
-        findings.append(
-            Finding(
-                finding_type="SPECIAL_CHARACTERISTIC_NOT_CONTROLLED",
-                level="critical",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} is flagged as a special characteristic in the PFMEA "
-                    f"but is not marked/controlled as special in the Control Plan."
-                ),
-            )
-        )
+        findings.append(_make("SPECIAL_CHARACTERISTIC_NOT_CONTROLLED", op_label))

     if has_control and not has_reaction and max_sev is not None and max_sev >= HIGH_SEVERITY_THRESHOLD:
-        findings.append(
-            Finding(
-                finding_type="MISSING_REACTION_PLAN",
-                level="critical",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} has a high-severity failure mode (S={max_sev}) "
-                    f"but the Control Plan control has no reaction plan."
-                ),
-            )
-        )
+        findings.append(_make("MISSING_REACTION_PLAN", op_label, severity=max_sev))

     if weak:
         findings.append(
-            Finding(
-                finding_type="WEAK_DETECTION_METHOD",
-                level="warning",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} relies on a weak detection method "
-                    f"({', '.join(control_methods)})."
-                ),
-            )
+            _make("WEAK_DETECTION_METHOD", op_label, methods=", ".join(control_methods))
         )

     if weak and max_sev is not None and max_sev >= HIGH_SEVERITY_THRESHOLD:
-        findings.append(
-            Finding(
-                finding_type="HIGH_SEVERITY_WEAK_CONTROL",
-                level="warning",
-                operation_id=op_label,
-                message=(
-                    f"Operation {op_label} has a high-severity failure mode (S={max_sev}) "
-                    f"paired with a weak control method."
-                ),
-            )
-        )
+        findings.append(_make("HIGH_SEVERITY_WEAK_CONTROL", op_label, severity=max_sev))

     return findings

@@ -142,22 +108,12 @@ def evaluate(match: MatchResult) -> list[Finding]:
     for key in match.pfmea_only_ops:
         op = match.display_op(key)
         findings.append(
-            Finding(
-                finding_type="UNMATCHED_PROCESS_STEP",
-                level="warning",
-                operation_id=op,
-                message=f"PFMEA operation {op} has no matching row in the Control Plan.",
-            )
+            _make("UNMATCHED_PROCESS_STEP", op, source="PFMEA", target="Control Plan")
         )
     for key in match.control_only_ops:
         op = match.display_op(key)
         findings.append(
-            Finding(
-                finding_type="UNMATCHED_PROCESS_STEP",
-                level="warning",
-                operation_id=op,
-                message=f"Control Plan operation {op} has no matching row in the PFMEA.",
-            )
+            _make("UNMATCHED_PROCESS_STEP", op, source="Control Plan", target="PFMEA")
         )

     for key in match.matched_ops:
Expand Down
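The new `_make` helper in this file relies on `str.format` to fill each rule's `message_template` from keyword context. A small sketch of how that behaves, assuming templates that use the placeholder names seen in the calls above (`op`, `source`, `target`, `severity`). The template wording below is reconstructed from the removed hardcoded messages and is an assumption; the actual YAML file is not part of this diff. `TEMPLATES` and `render` are hypothetical names.

```python
# Hypothetical templates; the real wording lives in the project's YAML file.
TEMPLATES = {
    "UNMATCHED_PROCESS_STEP": "{source} operation {op} has no matching row in the {target}.",
    "MISSING_REACTION_PLAN": (
        "Operation {op} has a high-severity failure mode (S={severity}) "
        "but the Control Plan control has no reaction plan."
    ),
}


def render(rule_id: str, op: str, **context: object) -> str:
    """Fill a template the same way _make does: op plus free-form keyword context."""
    return TEMPLATES[rule_id].format(op=op, **context)


msg = render("UNMATCHED_PROCESS_STEP", "OP10", source="PFMEA", target="Control Plan")

# Keyword arguments the template does not mention are silently ignored by str.format.
extra_ok = render(
    "UNMATCHED_PROCESS_STEP", "OP10", source="PFMEA", target="Control Plan", severity=9
)

# A placeholder with no matching keyword raises KeyError naming the missing field.
try:
    render("MISSING_REACTION_PLAN", "OP20")
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]
```

The practical consequence is that a placeholder renamed in the YAML without a matching change at the call site only fails when that finding is actually produced, so a test that renders every template with its expected context is a cheap safeguard.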
62 changes: 46 additions & 16 deletions src/quality_docs_validator/rules/__init__.py
@@ -1,10 +1,9 @@
 """Rule definitions (YAML) for the validation modules.

-Packaged so rule files ship with the wheel. In the MVP the checks are implemented in
-``modules/pfmea_control_plan.py``; this YAML documents each finding type (id, severity,
-description) and is the single source of truth for that metadata. A consistency test asserts the
-code and the YAML never drift. Making the module *consume* this YAML to drive logic is deferred
-to a later iteration (it requires a small rule-interpretation layer).
+Packaged so rule files ship with the wheel. The YAML is the **single source of truth for rule
+metadata** — id, severity, title, message_template, description and rationale. The checker in
+``modules/pfmea_control_plan.py`` reads this metadata instead of hardcoding it, while the
+per-finding-type detection logic stays in Python (this is intentionally not a generic rule engine).
 """

 from __future__ import annotations
@@ -15,18 +14,49 @@

 _RULES_FILE = "pfmea_control_plan_rules.yaml"

+VALID_SEVERITIES = {"critical", "warning"}
+REQUIRED_FIELDS = ("severity", "message_template", "description")

-def load_rule_specs() -> dict[str, dict]:
-    """Load the documented rules as ``{rule_id: {"severity": ..., "description": ...}}``."""
-    text = resources.files(__package__).joinpath(_RULES_FILE).read_text(encoding="utf-8")
-    data = yaml.safe_load(text) or {}

+class RuleSpecError(ValueError):
+    """Raised when the rule metadata file is malformed."""
+
+
+def parse_rule_specs(data: dict) -> dict[str, dict]:
+    """Validate parsed YAML and return ``{rule_id: {severity, title, message_template, ...}}``.
+
+    Raises RuleSpecError on a missing id, missing required field, invalid severity, or duplicate id.
+    """
     specs: dict[str, dict] = {}
-    for rule in data.get("rules", []):
-        rule_id = rule.get("id")
-        if rule_id:
-            specs[rule_id] = {
-                "severity": rule.get("severity"),
-                "description": (rule.get("description") or "").strip(),
-            }
+    for rule in (data or {}).get("rules", []):
+        rule_id = (rule.get("id") or "").strip()
+        if not rule_id:
+            raise RuleSpecError("Rule with a missing or empty 'id'.")
+        if rule_id in specs:
+            raise RuleSpecError(f"Duplicate rule id: '{rule_id}'.")
+        for field in REQUIRED_FIELDS:
+            value = rule.get(field)
+            if value is None or (isinstance(value, str) and not value.strip()):
+                raise RuleSpecError(f"Rule '{rule_id}' is missing required field '{field}'.")
+        severity = rule["severity"]
+        if severity not in VALID_SEVERITIES:
+            raise RuleSpecError(
+                f"Rule '{rule_id}' has invalid severity '{severity}' "
+                f"(expected one of {sorted(VALID_SEVERITIES)})."
+            )
+        specs[rule_id] = {
+            "severity": severity,
+            "title": (rule.get("title") or "").strip(),
+            "message_template": " ".join(str(rule["message_template"]).split()),
+            "description": " ".join(str(rule["description"]).split()),
+            "rationale": " ".join(str(rule.get("rationale") or "").split()),
+        }
+    if not specs:
+        raise RuleSpecError("No rules found in the rule metadata file.")
     return specs
+
+
+def load_rule_specs() -> dict[str, dict]:
+    """Load and validate the rule metadata from the packaged YAML file."""
+    text = resources.files(__package__).joinpath(_RULES_FILE).read_text(encoding="utf-8")
+    return parse_rule_specs(yaml.safe_load(text))
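Two details of the loader above are easy to miss, so here is a short sketch of both using only built-in behaviour. The template text is an assumption modelled on the removed hardcoded message, and `parsed` simulates what a YAML file containing a bare `rules:` line would parse to. The `or []` guard is a possible hardening, not something this PR does.

```python
# 1) Whitespace normalisation, as applied to message_template, description and rationale.
#    A multi-line YAML string keeps its line breaks and indentation until it is re-joined.
raw_template = (
    "Operation {op} has a high-severity failure mode (S={severity})\n"
    "    but the Control Plan control has no reaction plan.\n"
)
normalized = " ".join(raw_template.split())  # placeholders survive, whitespace collapses

# 2) dict.get only uses its default when the key is absent, not when the value is None.
parsed = {"rules": None}                 # simulated result for a YAML file with a bare `rules:` line
with_default = parsed.get("rules", [])   # None, so iterating over it would raise TypeError
guarded = parsed.get("rules") or []      # [], which would fall through to the empty-ruleset check
```

With the guard, a `rules:` key without entries would reach the "No rules found" `RuleSpecError` instead of a bare `TypeError`, which seems closer to the "failing clearly" intent stated in ARCHITECTURE.md.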