From 82712a6fe4d791f46813867f9251057f33a3b72a Mon Sep 17 00:00:00 2001 From: davidlabianca Date: Wed, 27 May 2026 21:40:50 +0000 Subject: [PATCH 1/9] This commit lands ADR-028 (prose-linter bracket-matching with Token contract raise) as Status: Draft, with ADR-017 D5 + D1 amendments MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Supersedes three accreted whitespace-adjacency-heuristic commits on the archived feature/353-c4-followons branch (a71ae47, 817e00d, 5e72a2a) and the stashed Path B derived-methods restructure attempt; both attempts left the emphasis-shape decision split between the tokenizer (greedy regex side effects) and the linter (regex/string-operation recovery). ADR-028 D1-D4 raise the Token contract to ADR grade; D5 specifies a depth-counter bracket-matching pass over the new shape field; D6 locks diagnostic-format preservation; D7 captures the two ADR-017 amendments and four one-way cross-references. Analytical input: working-plans/028-prose-linter-planning-inventory.md (21 locked decisions). D-Open-21 (underscore-italic intraword tightening) is in-scope per §8 but gated by Spike S3 (live-corpus diff probe) before maintainer flips Status: Draft -> Accepted. Co-authored-by: AI Assistant --- docs/adr/017-yaml-prose-authoring-subset.md | 4 + ...se-linter-bracket-matching-architecture.md | 212 ++++++++++++++++++ docs/adr/README.md | 1 + 3 files changed, 217 insertions(+) create mode 100644 docs/adr/028-prose-linter-bracket-matching-architecture.md diff --git a/docs/adr/017-yaml-prose-authoring-subset.md b/docs/adr/017-yaml-prose-authoring-subset.md index 5c06b987..12898ca3 100644 --- a/docs/adr/017-yaml-prose-authoring-subset.md +++ b/docs/adr/017-yaml-prose-authoring-subset.md @@ -38,6 +38,8 @@ Authors may use exactly these forms in any prose field. Everything else is rejec Bold and italic may compose (`**emphatically *not* this**` is valid). 
Sentinels are atomic identifier tokens; they do not nest into bold or italic. An author who wants the rendered title to appear bold relies on the renderer's stylesheet, not on wrapping `**` around `{{}}`. +The canonical mechanism the lint uses to enforce the "no nested same-family emphasis" rule and the "no emphasis-wrapped sentinel" rule is the depth-counter bracket-matching pass specified in [ADR-028](028-prose-linter-bracket-matching-architecture.md) D5. + The subset operates on the string contents of each prose paragraph. Paragraph and hard-break shape is carried by the YAML *array* structure ([ADR-011](011-persona-site-data-schema-contract.md) `definitions/prose`); prose strings are not list-bearing. ### D2. Disallowed by construction @@ -106,6 +108,8 @@ The two hooks have overlapping rejection sets (raw `` blocked by both, bare c If ADR-016 lands its hook before this one, the bare-camelCase and raw-`` checks live in `validate_prose_references.py` until ADR-017's hook ships and the shared tokenizer is extracted. The end state is the two-hook-shared-tokenizer split above. +The shared tokenizer's Token contract — the `Token` NamedTuple structure (including the emphasis-shape field), the `TokenKind` enumeration, the tokenizer's emission invariants, and the consumer surface — is formally specified in [ADR-028](028-prose-linter-bracket-matching-architecture.md) D1-D4. ADR-017 owns the grammar's authoring rules; ADR-028 owns the contract every consumer of the shared tokenizer reads. + ### D6. Redistribution contract surface Per [ADR-014](014-yaml-content-security-posture.md) P5, the framework guarantees "shape via schemas" to downstream consumers. ADR-017 is the canonical statement of "content within prose strings." After the conformance sweep closes, the contract becomes strictly: **YAML prose contains no URLs at all.** Every URL lives in a structured `externalReferences` entry (ADR-016) and is referenced from prose by sentinel. 
This is a stronger guarantee than "YAML prose contains some URLs you must sanitize" — a third-party redistributor parsing the YAML knows that any URL it ingests came through a typed, schema-validated structured field. diff --git a/docs/adr/028-prose-linter-bracket-matching-architecture.md b/docs/adr/028-prose-linter-bracket-matching-architecture.md new file mode 100644 index 00000000..e5c4e44f --- /dev/null +++ b/docs/adr/028-prose-linter-bracket-matching-architecture.md @@ -0,0 +1,212 @@ +# ADR-028: Prose-linter emphasis enforcement via bracket-matching depth pass over a Token-shape contract + +**Status:** Draft +**Date:** 2026-05-27 +**Authors:** Architect agent, with maintainer review + +--- + +## Context + +[ADR-017](017-yaml-prose-authoring-subset.md) D1 admits one nesting level for `**bold**`, `*italic*`, and `_italic_`; nesting another same-family emphasis inside is rejected. D5 commits to a single shared tokenizer at `scripts/hooks/precommit/_prose_tokens.py` consumed by both `validate-yaml-prose-subset` (ADR-017) and `validate_prose_references.py` ([ADR-016](016-reference-strategy.md) D6). The grammar lives in ADR-017; the enforcement mechanism for the "one nesting level" rule does not. + +The mechanism implemented on the archive branch split the emphasis-shape decision across two layers. The tokenizer's non-greedy bold and italic regexes produce delimiter-bounded spans whose interior whitespace edges carry the structural signal — `**foo **` retains a trailing space because the regex closed at what the author intended as an inner open. The linter then runs a second set of regex constants against each emphasis token's `value` to recover that signal. The shape decision is encoded as a side effect at one layer and decoded at the other, with the load-bearing knowledge ("whitespace-at-edge means greedy early-close") implicit in both. The two layers must stay in sync, and a reader of either alone cannot tell that the synchronization is load-bearing. 
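The early-close side effect described above can be reproduced with the bold pattern listed in D2. The following is a minimal sketch, not the tokenizer itself: it uses `re.finditer` rather than the tokenizer's per-character rule loop, and `bold_spans` is a hypothetical helper name introduced only for illustration.

```python
import re

# Bold pattern as listed in the D2 table: non-greedy interior, re.DOTALL.
BOLD_RE = re.compile(r"\*\*(.+?)\*\*", re.DOTALL)


def bold_spans(text: str) -> list[str]:
    """Return every span the non-greedy bold pattern matches, left to right."""
    return [m.group() for m in BOLD_RE.finditer(text)]


# Nested input: the first match closes on the delimiter the author meant as an
# inner open, so the span keeps a trailing space inside its delimiters.
nested_spans = bold_spans("**outer **inner** tail**")

# Sibling input: two well-formed spans with no whitespace at either edge.
sibling_spans = bold_spans("**hello** world **goodbye**")

print(nested_spans)
print(sibling_spans)
```

In the nested case the word `inner` falls between the two matched spans, which is why whitespace at the interior edges is the only structural signal left for a downstream reader.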
+ +This split is the architectural problem ADR-028 addresses. Attempts to restructure the linter without first raising the contract failed for three reasons that are properties of the design surface, not of any specific implementation. First, a token stream that carries no structural open/close events cannot support a stack-based or counter-based walk: any algorithm that treats every emphasis token as a "push" degenerates to faux-depth, because sibling complete-emphasis spans separated by plain text (`**hello** world **goodbye**`) become indistinguishable from one emphasis nested inside another. Second, the whitespace-at-edge heuristic can be relocated freely — from a linter-side regex constant to a derived method, an inline string check, a comment, anywhere — without ever being eliminated; the load-bearing knowledge that "trailing space inside the delimiter means greedy early-close" still has to live somewhere, and whichever layer hosts it carries the synchronization burden. Third, any shape information that is meaningful only for some token kinds produces an asymmetric API (`str | None` returns, value-passthrough fallbacks, branching on kind before reading shape), which forces every consumer to handle a "doesn't apply" case at every read site. All three failure modes resolve to the same root cause: the `Token` contract under-specifies what consumers may assume about shape, so each refactor has to re-establish invariants that should already exist on the token. Without raising the contract, the restructure surface stays load-bearing. + +ADR-028 specifies the contract this surface needs. The `Token` NamedTuple structure, the `TokenKind` enumeration, the invariants every token stream from `tokenize()` satisfies, and the consumer surface every reader accesses are documented here as ADR-grade architecture rather than implementation detail; emphasis-shape classification is part of the contract, carried on the token rather than reconstructed downstream. 
The full planning inventory — alternatives, consumer audit, fixture corpus, cross-ADR alignment matrix, shape-determination rules, and 21 locked decision points — was produced at `working-plans/028-prose-linter-planning-inventory.md` (local, untracked) and is the analytical input. + +Line references in this ADR are GitHub permalinks pinned against `upstream/main` at commit `7320136`. + +## Decision + +The tokenizer emits `Token` instances carrying a `shape` field classified at emission time; the prose-subset linter walks tokens once with a depth counter and reads `shape` directly. The decision has seven components. + +### D1. Token contract + +The `Token` NamedTuple at `scripts/hooks/precommit/_prose_tokens.py` is the durable contract between the tokenizer and every consumer. Fields, in declaration order: + +| Field | Type | Semantics | +|---|---|---| +| `kind` | `TokenKind` | Token classification per D2. | +| `value` | `str` | Exact substring from the input the token covers. For INVALID_* kinds, the offending substring. | +| `shape` | `Literal["complete", "open", "close", "neutral"]` | Emphasis classification per D3. `"neutral"` for every non-emphasis token. Default `"neutral"`. | + +`shape` is declared as the **third** field with a default of `"neutral"`. Two-positional construction (`Token(TokenKind.X, value)`) continues to compile and yields a token with `shape="neutral"`; emission sites that classify emphasis pass `shape=` as a keyword argument. + +Equality is structural NamedTuple equality. Two tokens with identical `kind` and `value` but different `shape` values are not equal. Test code that constructs tokens manually for comparison against `tokenize()` output must either supply the matching `shape` or compare via the fixture-format projection that drops `shape` (see D4 and the fixture migration in Follow-up). + +The NamedTuple structure, defaults, and equality semantics are part of the ADR-grade specification rather than implementation detail. 
Future field additions (e.g., a hypothetical `source_line`) require a new ADR or an amendment here; new `shape` values likewise. + +### D2. TokenKind enumeration + +The tokenizer emits exactly the following sixteen kinds. The accept/reject column matches `_REJECTED_KINDS` in [`validate_yaml_prose_subset.py:49-62`](https://github.com/cosai-oasis/secure-ai-tooling/blob/7320136/scripts/hooks/precommit/validate_yaml_prose_subset.py#L49-L62) with one carve-out documented after the table. + +| TokenKind | Defining regex / character class | Accept / Reject | Semantic meaning | Defining ADR | +|---|---|---|---|---| +| `BOLD` | `\*\*(.+?)\*\*` (`re.DOTALL`) | Accept | One nesting level bold; non-greedy close. Italic inside permitted; nested `**bold**` rejected by D5. | [ADR-017](017-yaml-prose-authoring-subset.md) D1 | +| `ITALIC` | `\*(.+?)\*` or whitespace-flanked `_(.+?)_` | Accept | One nesting level italic. Two delimiters so authors can italicize text containing the other delimiter. | [ADR-017](017-yaml-prose-authoring-subset.md) D1 | +| `SENTINEL_INTRA` | inner: `(risk\|control\|component\|persona)[A-Z]\w*` | Accept | Intra-document reference sentinel. | [ADR-016](016-reference-strategy.md) D2 | +| `SENTINEL_REF` | inner: `ref:[A-Za-z0-9_.\-]+` | Accept | External-reference sentinel. | [ADR-016](016-reference-strategy.md) D2 | +| `TEXT` | catch-all run accumulator | Accept | Plain prose. | [ADR-017](017-yaml-prose-authoring-subset.md) D1 | +| `INVALID_HTML` | `<[A-Za-z/][^>]*>` | Reject | Any HTML tag. | [ADR-017](017-yaml-prose-authoring-subset.md) D2 | +| `INVALID_URL` | scheme-with-authority, opaque-data scheme, or markdown link | Reject | Inline URL in any form. | [ADR-017](017-yaml-prose-authoring-subset.md) D2 / D4 rule 2 | +| `INVALID_HEADING` | `#+[^\n]*` at line start | Reject | Markdown heading. | [ADR-017](017-yaml-prose-authoring-subset.md) D2 | +| `INVALID_LIST` | `-\s`, `\*\s`, `\d+\.\s` at column 0 | Reject | Markdown list marker at column 0. 
| [ADR-017](017-yaml-prose-authoring-subset.md) D2 | +| `INVALID_CODE` | ```` ```...``` ```` (`re.DOTALL`) or `` `[^`]+` `` | Reject | Fenced or inline code. | [ADR-017](017-yaml-prose-authoring-subset.md) D2 | +| `INVALID_IMAGE` | `!\[[^\]]*\]\([^)]*\)` | Reject | Markdown image. | [ADR-017](017-yaml-prose-authoring-subset.md) D2 | +| `INVALID_BLOCKQUOTE` | `>[^\n]*` at line start | Reject | Markdown blockquote. | [ADR-017](017-yaml-prose-authoring-subset.md) D2 | +| `INVALID_TABLE` | `\|[^\n]*(?:\n\|$)` at line start | Reject | Markdown pipe-table row. | [ADR-017](017-yaml-prose-authoring-subset.md) D2 | +| `INVALID_FOLDED_BULLET` | `\s+\-\s[^\n]*(?:\n\|$)` at line start | Reject | Folded-bullet drift — leading-whitespace `- ` line. | [ADR-020](020-controls-schema.md) follow-up (citation defect — see D7) | +| `INVALID_CAMELCASE_ID` | `(risk\|control\|component\|persona)[A-Z]\w*` | Reject (**delegated**) | Bare entity-prefix camelCase outside a sentinel. | [ADR-017](017-yaml-prose-authoring-subset.md) D4 rule 5 / [ADR-016](016-reference-strategy.md) D6 | +| `INVALID_SENTINEL` | `\{\{ ... \}\}` with brace-depth scan, inner fails both sentinel forms | Reject | Structurally well-formed `{{ }}` with content matching neither sentinel grammar. | [ADR-016](016-reference-strategy.md) D2 | + +**Delegation carve-out.** `INVALID_CAMELCASE_ID` is the only rejecting kind that the prose-subset linter intentionally excludes from `_REJECTED_KINDS`. Ownership lives in `validate_prose_references.py` per ADR-017 D4 rule 5 and ADR-016 D6. A consumer that iterates `if kind.name.startswith("INVALID")` to count rejections will diverge from the prose-subset linter's behavior unless it applies the same exclusion. The carve-out is load-bearing for both consumers and any third-party reader; it is part of the contract. + +The enumeration is closed for ADR-028. Adding a new kind requires an amendment here. + +### D3. 
Tokenizer emission invariants + +Every token stream produced by `tokenize()` satisfies four invariants. + +1. **Partition-of-input.** `"".join(t.value for t in tokenize(text)) == text` for every input. Every character lands in exactly one token's `value`. Asserted in the test corpus at [`test_prose_tokens.py:1043-1072`](https://github.com/cosai-oasis/secure-ai-tooling/blob/7320136/scripts/hooks/tests/test_prose_tokens.py#L1043-L1072) (`TestMixedRuns.test_tokens_cover_full_input`). +2. **Rule-precedence ordering.** The sixteen rules apply per-character in the precedence order documented at [`_prose_tokens.py:11-27`](https://github.com/cosai-oasis/secure-ai-tooling/blob/7320136/scripts/hooks/precommit/_prose_tokens.py#L11-L27). Higher-priority match wins; ties do not exist because each rule anchors on a distinct character or substring. Line-anchored rules (headings, list markers, blockquotes, pipe-tables, folded-bullet drift) fire only at index 0 or immediately after `\n` per `at_line_start()`. +3. **Greedy / non-greedy specification.** Bold non-greedy on close; italic non-greedy on close — asterisk italic is intraword-permissive, underscore italic additionally requires whitespace-or-boundary flanking per ADR-017 D1, so an intraword `\S_\S` underscore does not qualify as a delimiter and the run consumes as TEXT; URL regexes greedy with bracketed exclusion `[^\s{]+` so a URL adjacent to a `{{` sentinel terminates at the brace boundary; fenced code non-greedy across newlines with `re.DOTALL`; sentinel scan brace-depth-aware so `{{id{{ref:x}}}}` consumes as one `INVALID_SENTINEL` token. Unclosed `{{` returns `None` from `_match_sentinel`; the caller emits the remainder as `TEXT`. +4. **Shape classification at emission.** Every emphasis token (`BOLD`, `ITALIC`) carries a `shape` value classified at emission time from leading/trailing whitespace in the matched span's interior. Every non-emphasis token carries `shape="neutral"`. 
The classification rules per delimiter (`**`, `*`, `_`):
+
+| Matched span shape | `shape` value | Rationale |
+|---|---|---|
+| `**foo**` — neither edge whitespace in interior | `"complete"` | Well-formed emphasis. |
+| `**foo bar**` — internal whitespace only | `"complete"` | Only **edge** whitespace is the signal. |
+| `**foo **` — trailing whitespace before close | `"open"` | The non-greedy match closed early, on the delimiter the author intended as an inner open. |
+| `** bar**` — leading whitespace after open | `"close"` | Mirror — trailing half of the same early-close pattern. |
+| `** foo **` — both edges whitespace | `"open"` | Convention: leading-whitespace test fires first; consistent with the wrapped-sentinel case `**\n{{ref:x}}\n**` rendering as `"open"`. |
+| `**\n**` — interior is a single `\n` | `"open"` | `\n` counts as `isspace()`; both-edges-whitespace falls under the `"open"` convention. |
+
+Inputs the tokenizer does not emit as emphasis at all carry `shape="neutral"` like any other TEXT: `**` alone; `**foo` unclosed; `**` paired with no inner character; `****` consumed as two TEXT runs because `(.+?)` requires at least one inner character; `__bold__` consumed as TEXT per ADR-017 D1's asterisk-only-bold stance; and **intraword underscore runs like `home_bar and foo_baz` consumed as TEXT** because the underscores fail the whitespace-flanking requirement — a snake_case identifier pair is not an italic span.
+
+The classification is implemented as a private helper `_classify_emphasis_shape(span, delim) -> str` in `_prose_tokens.py`, invoked from the three emphasis emission sites at [`_prose_tokens.py:421-440`](https://github.com/cosai-oasis/secure-ai-tooling/blob/7320136/scripts/hooks/precommit/_prose_tokens.py#L421-L440) (one per delimiter family). The helper is a pure function over the matched span; it reads no global state. Emit-site call shape: `emit(TokenKind.BOLD, m.group(), shape=_classify_emphasis_shape(m.group(), "**"))`.
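The classification table can be sketched as a small pure function. This is an illustrative reading of the table, not the helper in `_prose_tokens.py`: the public name `classify_emphasis_shape` is a stand-in, and the order of the two edge checks is an assumption chosen so that the both-edges-whitespace rows come out as `"open"`, as the table specifies.

```python
def classify_emphasis_shape(span: str, delim: str) -> str:
    """Classify a matched emphasis span by the whitespace at its interior edges.

    `span` is the full matched text including both delimiters; `delim` is the
    delimiter string ("**", "*", or "_").
    """
    interior = span[len(delim):-len(delim)]
    # Assumed ordering: whitespace before the closing delimiter wins, so an
    # interior with whitespace at both edges is classified "open".
    if interior[-1:].isspace():
        return "open"
    if interior[:1].isspace():
        return "close"
    return "complete"


# One entry per row of the classification table.
table_rows = {
    "**foo**": classify_emphasis_shape("**foo**", "**"),
    "**foo bar**": classify_emphasis_shape("**foo bar**", "**"),
    "**foo **": classify_emphasis_shape("**foo **", "**"),
    "** bar**": classify_emphasis_shape("** bar**", "**"),
    "** foo **": classify_emphasis_shape("** foo **", "**"),
    "**\n**": classify_emphasis_shape("**\n**", "**"),
}
```

Because the function depends only on its arguments, each table row can be asserted directly without running the tokenizer.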
+ +The shape decision is encapsulated in the tokenizer. The linter never reads whitespace edges directly; it reads `token.shape`. + +### D4. Consumer contract + +The public surface of `_prose_tokens.py` is exactly three names: `Token`, `TokenKind`, and `tokenize()`. The leading underscore in the module name marks the module as one of the precommit-sibling internals coordinated by ADR-017 D5 and ADR-016 D6; it does **not** mark every name in the module as private. Internal helpers (`_match_sentinel`, `_classify_emphasis_shape`, `emit`, `flush_text`, `pending_text_start`, the `_RE_*` regex constants) are not part of the contract and may be reorganized without an ADR amendment. + +Stable consumer API: + +- **Field access** on tokens is stable: `token.kind`, `token.value`, `token.shape`. +- **Iteration** is stable: `for token in tokens:` and indexed access (`tokens[N]`) for positional reads. +- **Positional unpacking** is **not** stable. No `kind, value = token` destructuring appears in production code or tests; future contributors must not introduce it. Tuple positionality is therefore reserved for forward-compatible field additions (D1's `shape` is the first such addition). +- **Construction** is via `tokenize()`. Test code may construct `Token(...)` via keyword arguments for assertion-side comparisons; production code outside `_prose_tokens.py` itself must not construct `Token` instances directly. + +The known production consumers are three: `validate_yaml_prose_subset.py` (the prose-subset linter; reads `kind`, `value`, and `shape` per D1); `validate_prose_references.py` (the references linter per ADR-016 D6; reads `kind` and `value` only); and `_sentinel_expansion.py` (the shared sentinel expander per ADR-016 D5; reads `kind` and `value` only). The two latter consumers are unaffected by the `shape` field's introduction because they never read it; under D1's default-`"neutral"` posture, they see tokens whose new field carries the no-op value. 
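The D1 field layout and the consumer rules above can be exercised with a small stand-alone sketch. It is an assumption-laden illustration rather than the module's source: `TokenKind` here lists only four of the sixteen kinds, and `tokens_to_dicts` is a hypothetical two-field projection that simply drops `shape`.

```python
from enum import Enum, auto
from typing import Literal, NamedTuple

Shape = Literal["complete", "open", "close", "neutral"]


class TokenKind(Enum):
    # Illustrative subset of the sixteen kinds enumerated in D2.
    BOLD = auto()
    ITALIC = auto()
    TEXT = auto()
    SENTINEL_REF = auto()


class Token(NamedTuple):
    kind: TokenKind
    value: str
    shape: Shape = "neutral"  # third field, defaulted, per D1


def tokens_to_dicts(tokens):
    """Hypothetical two-field projection: keeps kind and value, drops shape."""
    return [{"kind": t.kind.name, "value": t.value} for t in tokens]


# Two-positional construction still works and yields the neutral default.
plain = Token(TokenKind.TEXT, "plain prose")

# Same kind and value, different shape: structurally unequal tokens.
complete = Token(TokenKind.BOLD, "**foo**", shape="complete")
relabelled = Token(TokenKind.BOLD, "**foo**", shape="open")

# Consumers read fields by name rather than by positional unpacking.
kinds_seen = [t.kind for t in (plain, complete)]
```

The projection makes the two BOLD tokens compare equal again, which is the property that lets fixtures asserting only two fields stay byte-identical.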
+
+The test corpus's contract is the `_tokens_to_dicts` projection at [`test_prose_tokens.py:203`](https://github.com/cosai-oasis/secure-ai-tooling/blob/7320136/scripts/hooks/tests/test_prose_tokens.py#L203), currently `{"kind": t.kind.name, "value": t.value}`. The projection drops `shape` and remains stable. Fixtures continue to assert two-field equality. The fixture migration (Follow-up below) is **additive**: existing emphasis-bearing fixtures stay byte-identical and implicitly assert `shape="complete"` by virtue of the tokenizer emitting that value; new fixtures cover `"open"`, `"close"`, and other shape cases without disturbing the existing corpus.
+
+### D5. Linter algorithm — depth-counter bracket-matching pass
+
+The prose-subset linter's emphasis-rejection logic in `validate_yaml_prose_subset.py` is a single pass over `field.tokens` with an integer depth counter and two one-line predicates. The current helpers `_detect_nested_emphasis_indices` and `_is_emphasis_wrapped_sentinel`, along with the three `_RE_*_EARLY_CLOSE` and three `_RE_*_WRAPPED_SENTINEL` regex constants, are deleted.
+
+Algorithm (textbook bracket-matching shape):
+
+```text
+depth = 0
+for token in field.tokens:
+    if token.kind in (BOLD, ITALIC):
+        if token.shape == "open":
+            if depth > 0:  # already inside emphasis of same family
+                emit_diagnostic(...)
+            depth += 1
+        elif token.shape == "close":
+            depth -= 1  # depth-floor enforced by tokenizer invariants
+        elif token.shape == "complete":
+            if depth > 0:  # nested complete-emphasis inside open emphasis
+                emit_diagnostic(...)
+            # complete = open + close, net depth change 0
+        # shape == "neutral" on emphasis kinds is not emitted (see D3)
+        if _is_emphasis_wrapped_sentinel(token):
+            emit_diagnostic(...)
+```
+
+The two predicates:
+
+- **Nested-emphasis predicate.** Fires when an emphasis token is encountered with `depth > 0` and a same-family emphasis open is unmatched. Implemented as the `depth > 0` check inline in the walk above.
The kind comparison is via `token.kind` directly (a bare counter suffices; a kind-stack is not needed for the current ADR-017 D1 rule, which detects any nesting regardless of delimiter family). +- **Emphasis-wrapped-sentinel predicate.** Fires when an emphasis token's interior (its `value` minus the delimiter pair) `.strip()`s to a string that matches `SENTINEL_INNER_RE` (the unified intra-or-ref regex defined once in `_prose_tokens.py` and shared with the references linter). Independent of the depth state. + +Both predicates are one-line expressions over `token.shape` (or `token.kind` and `token.value`) and the depth state. There are no regex constants in the linter for shape detection. The whitespace-adjacency heuristic that previously lived in `_RE_*_EARLY_CLOSE` has been moved into the tokenizer's `_classify_emphasis_shape` per D3; the linter cannot see it and cannot drift from it. + +The walk handles the false-positive patterns a naive stack would faux-depth on. `**hello** world **goodbye**` tokenizes as `[BOLD shape="complete", TEXT shape="neutral", BOLD shape="complete"]` and walks depth `0 → 0 → 0 → 0`; no diagnostic. `**hello** world {{ref:x}}` tokenizes as `[BOLD shape="complete", TEXT, SENTINEL_REF]` and the sentinel is at `depth == 0`; no diagnostic. The prospective D6-of-ADR-017 rules ("no sentinel inside any emphasis", "no link text containing emphasis") would express as `depth > 0 and token is sentinel/link` — predicates with a real depth value, not a faux one. + +Reason strings (`_REASON_NESTED_EMPHASIS`, `_REASON_EMPHASIS_WRAPPED_SENTINEL`) and the diagnostic format are preserved per D6. + +### D6. Diagnostic conformance + +ADR-017 D4's diagnostic format spec is preserved byte-for-byte: + +```text +validate-yaml-prose-subset: ::[]: at +``` + +The two reason strings — `"nested emphasis"` and `"emphasis-wrapped sentinel"` — are unchanged. 
The nested-index `[][]` second-segment format introduced by PR #286 (per ADR-017 D4 line 81's "Addendum" amendment) is unchanged. The token-snippet is the same `token.value` substring the existing implementation emits. + +`TestDiagnosticFormat`, the live-corpus baseline, and third-party tooling that scrapes the linter's stderr output continue to work without modification. + +### D7. Cross-ADR alignment + +**In scope for the ADR-028 commit:** + +- **ADR-017 D5 amendment.** One additive paragraph at the end of D5 stating that the Token contract (NamedTuple structure including emphasis-shape field, TokenKind enumeration, emission invariants, consumer surface) is formally specified in ADR-028 D1-D4. Does not amend any existing rule; only adds the pointer. +- **ADR-017 D1 amendment.** One forward pointer at the end of D1 indicating that the canonical mechanism for enforcing the "no nested same-family emphasis" and "no emphasis-wrapped sentinel" rules is the depth-counter pass specified in ADR-028 D5. + +**One-way cross-references this ADR makes:** + +- [ADR-005](005-pre-commit-framework.md): `validate-yaml-prose-subset` is one of the local hooks declared under the pre-commit framework; ADR-005's hook-orchestration shape is preserved. +- [ADR-012](012-static-spa-architecture.md): the zero-dep posture for the SPA's runtime path is the prior art that motivated ADR-015's hand-rolled sanitizer and ADR-017's hand-rolled tokenizer. ADR-028 keeps the hand-rolled approach; no library dependency is introduced. +- [ADR-015](015-site-content-sanitization-invariants.md) D3: the bounded-emission property and the hand-rolled-sanitizer-with-grammar-sync rationale apply symmetrically to the lint-side tokenizer. Neither the emitted token kinds nor the partition-of-input invariant changes, so the site/lint grammar parity is preserved. 
+- [ADR-016](016-reference-strategy.md) D5 and D6: `validate_prose_references.py` and `_sentinel_expansion.py` are the two consumers of the shared tokenizer besides the prose-subset linter itself. Per D4 above, neither reads `shape`; the contract raise is transparent to them. +- [ADR-025](025-testing-strategy.md) D2 and D10: the testing chain (Testing → Code-Reviewer → SWE → Code-Reviewer) authors failing tests before implementation; the `_classify_emphasis_shape` helper and the depth-counter walk both flow through the chain. D10's wire-up requirement is satisfied by including at least one test that calls `tokenize()` and inspects `shape` on the emitted tokens (rather than only testing the classifier helper in isolation). + +**Deferred to future maintenance edits:** + +- **ADR-016 D6 amendment to mention the `shape` field's existence.** Not in scope. `validate_prose_references.py` does not read `shape`; the courtesy amendment would add scope without value. A future amendment lands when the field becomes load-bearing for the references linter. +- **ADR-020 D4 / folded-bullet citation defect.** The `INVALID_FOLDED_BULLET` kind is documented in `_prose_tokens.py` and `_REASONS` strings as belonging to ADR-020 D4, but D4 is the prose-shape decision; the folded-bullet rule's actual home is ADR-020's follow-up section. This is a pre-existing documentation defect, not load-bearing here. Cleanup lands in a separate one-line commit. + +## Alternatives Considered + +- **Status quo — regex-driven shape detection in the linter.** Three `_RE_*_EARLY_CLOSE` constants in `validate_yaml_prose_subset.py` extract the whitespace-edge signal from token values that the greedy non-greedy bold/italic regexes encoded as side effects. 
Rejected because the shape decision is split across two layers, the regex constants encode an implicit channel from the tokenizer to the linter, and three accreted commits (`a71ae47`, `817e00d`, `5e72a2a`) on the archived `feature/353-c4-followons` branch motivated the restructure in the first place. +- **Path B — derived methods `Token.delim()` and `Token.interior()`, shape inference at the linter via `interior().endswith(whitespace)`.** Stashed at `stash@{0}` on `.worktree/353-c4-followons`. Rejected on four grounds: (1) faux-depth — the kind-stack had no pop conditions, so `**hello** world **goodbye**` would false-positive as depth-2 nested; (2) the whitespace heuristic was renamed-not-removed — moved from a regex constant to a method-then-string-operation chain with the same load-bearing knowledge requirement; (3) D6 forward-compat predicates degenerated to "is there any prior emphasis in this string"; (4) asymmetric API — `delim()` returned `str | None` and `interior()` returned `value` unchanged for non-emphasis kinds, forcing every consumer to handle two different "doesn't apply" fallbacks. +- **Open/close event tokens — emit `BOLD_OPEN` and `BOLD_CLOSE` as distinct kinds.** Rejected because it breaks the partition-of-input invariant: an event token has no `value` substring of its own, or the `value` must be split across multiple tokens (the open's `**` and the close's `**`), neither of which the consumer audit and existing fixture format tolerate. BOLD and ITALIC remain single tokens covering the delimiter-bounded span; shape is a field on the token, not a kind. +- **Extend the test projection to include `shape` and annotate every emphasis-bearing fixture.** Rejected. The fixture corpus has 56 pairs; the migration would touch ~150 `shape` keys across files, most of them defaulting to `"neutral"` with low signal density. 
The existing five emphasis-bearing fixtures already implicitly lock in `shape="complete"` (because the tokenizer emits that value and the projection drops it). +- **Suppress the projection extension and assert `shape` only via per-test attribute checks, never via fixtures.** Rejected. Loses the auditability of fixture-driven shape assertions for new `"open"` / `"close"` cases. +- **Use a markdown library (`markdown-it-py`, `commonmark`).** Rejected per ADR-017 D5 line 80 and ADR-015 D3's zero-dep posture for the matching sanitizer. The grammar is small enough to hand-roll; a library brings a configuration surface larger than the replacement code. +- **Kind-stack rather than bare depth counter in the linter.** Reasonable for prospective D6-of-ADR-017 rules that would need to know the enclosing kind (e.g., "no link text containing emphasis"), but the current rule set — ADR-017 D1's "no nested same-family emphasis" plus the wrapped-sentinel check — needs only a depth value; kind comparison happens via `token.kind` directly on the emphasis token under inspection. A kind-stack adds zero current capability and complicates the algorithm's textbook shape. If a future rule needs the enclosing-kind dimension, the stack arrives then. + +## Consequences + +**Positive** + +- **One shape decision, encapsulated in the tokenizer.** The whitespace-adjacency heuristic exists in exactly one place — `_classify_emphasis_shape` — and runs on the matched span before the token is appended to the stream. The linter sees a four-value enum and dispatches on it; future contributors do not need to know that whitespace-at-edge is load-bearing to read the linter. +- **Faux-depth false positives are eliminated by construction.** `**hello** world **goodbye**` tokenizes as two `shape="complete"` BOLDs, walks depth `0 → 0`, and emits no diagnostic. Prospective D6-of-ADR-017 rules can use `depth > 0` with confidence that the depth value is structurally meaningful. 
+- **The Token API is symmetric.** Every token carries `shape: Literal[…]`; there is no `None` fallback to branch on. Consumers that need shape read it; consumers that do not read it (`validate_prose_references.py`, `_sentinel_expansion.py`) are unaffected by its presence. +- **The contract is ADR-grade.** The Token NamedTuple, TokenKind enumeration, tokenizer emission invariants, and consumer surface are specified here. Future refactors of the tokenizer-internal helpers (`_match_sentinel`, `_classify_emphasis_shape`, the `_RE_*` constants) need no ADR amendment; surface changes (new fields, new kinds, new emission invariants) do. +- **The linter shrinks.** Three `_RE_*_EARLY_CLOSE` constants, three `_RE_*_WRAPPED_SENTINEL` constants, `_detect_nested_emphasis_indices`, and `_is_emphasis_wrapped_sentinel` are deleted from `validate_yaml_prose_subset.py`. The replacement is a single-pass walk with two one-line predicates over `token.shape` and the depth state. +- **Diagnostic format is preserved.** The existing test corpus (including `TestDiagnosticFormat` and the live-corpus baseline) and any third-party tooling that scrapes the linter's stderr output continue to work without modification. + +**Negative** + +- **The classifier helper is now part of the contract.** `_classify_emphasis_shape` is internal to `_prose_tokens.py` per D4, but its behavior — the both-edges-whitespace `"open"` convention, the leading-whitespace-first ordering — is observable through the `shape` field on emphasis tokens. A future change to the convention is a tokenizer change; downstream tests assert against the values it emits. +- **Two BOLD tokens with different `shape` are not structurally equal.** D1's NamedTuple equality posture means test code that constructs a token manually for comparison against `tokenize()` output must supply the correct `shape` or compare via the projection that drops `shape`. The TDD chain's reviewers carry this as a checklist item. 
+- **The `shape` field name is generic.** A future ADR adding (e.g.) `position` or `category` to the NamedTuple may force a rename to `emphasis_shape` for clarity. The generic name is accepted on the basis that the field's semantics are documented in D1. +- **The underscore-italic tightening (D3 invariant 3) is content-visible.** Snake_case identifier pairs like `home_bar and foo_baz` no longer tokenize as italic — a corpus-observable shift even though no new rejection is added. The corpus impact probe (Follow-up below) runs before Status flips from Draft to Accepted; if surprises surface, the rule is revisited. +- **Authors who copy prose containing `_X Y_` constructs from upstream where they were intended as italic must now flank with whitespace explicitly.** The friction is intentional; the false-positive case it eliminates is the load-bearing reason. + +**Follow-up** + +- **Underscore-italic corpus diff — the lockdown validation gate.** Before maintainer flips Status: Draft → Accepted, a one-off script runs both the current `_RE_ITALIC_UNDERSCORE` and the proposed whitespace-flanked variant against all four content YAMLs (`risks.yaml`, `controls.yaml`, `components.yaml`, `personas.yaml`) and emits a diff listing every prose field whose tokenization changes (e.g., `::: was ITALIC("_X Y_"), now TEXT`). Two outcomes: diff is empty or only matches expected snake_case false positives → lock confirmed; diff includes authorially-intended italic spans → maintainer reopens the underscore decision before the ADR is Accepted. +- **Triple-asterisk tokenizer probe.** Captures the tokenizer's output for `***foo***`, `****`, `*****foo*****`, `**foo***`, `***foo**`, and `***`. Feeds the classifier test fixtures. Runs as part of the testing agent's RED phase, parallel to the SWE implementation. +- **Live-corpus regression baseline.** Captures the current zero-diagnostic state of `validate-yaml-prose-subset` against the four content YAMLs as a `pytest.mark.live_corpus` test. 
Gates the post-migration regression check; the new linter must produce the same zero-diagnostic result. +- **Fixture migration — additive only.** Add 6-10 new emphasis-shape fixtures under (likely) `scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/` or equivalent, covering `shape="open"`, `shape="close"`, depth-3 nested input asserted multi-token, and italic variants per delimiter. The existing five emphasis-bearing fixtures stay byte-identical; they implicitly assert `shape="complete"` via the tokenizer's emission and the projection's drop-of-`shape`. The `_tokens_to_dicts` projection at [`test_prose_tokens.py:203`](https://github.com/cosai-oasis/secure-ai-tooling/blob/7320136/scripts/hooks/tests/test_prose_tokens.py#L203) stays at `{"kind", "value"}`; tests that need to assert `shape` use per-test attribute checks. +- **Stale corpus-size comments.** The module docstring at [`_prose_tokens.py:266`](https://github.com/cosai-oasis/secure-ai-tooling/blob/7320136/scripts/hooks/precommit/_prose_tokens.py#L266) references "42 grammar cases"; the test header at [`test_prose_tokens.py:90-97`](https://github.com/cosai-oasis/secure-ai-tooling/blob/7320136/scripts/hooks/tests/test_prose_tokens.py#L90-L97) references "55 fixture-parametrized pairs". The current corpus is 56 pairs (with the additive emphasis-shape fixtures, 62-66). Cleanup lands with the fixture migration commit. +- **Future kind additions.** Adding a new TokenKind member (e.g., for a hypothetical fourth allowed authoring token) is an amendment to D2 here, plus the corresponding ADR-017 D1 amendment. The contract is ADR-grade, so future additions surface as ADR work, not implementation drift. +- **Future shape values.** Adding a fifth `shape` value (e.g., for cross-delimiter italic-in-italic detection per ADR-017 D1 line 36's deferred case) is an amendment to D1 and D3 here. The cross-delimiter case continues to tokenize as a single ITALIC token because the outer match consumes the inner. 
diff --git a/docs/adr/README.md b/docs/adr/README.md index dc1327ae..b0f23086 100644 --- a/docs/adr/README.md +++ b/docs/adr/README.md @@ -43,6 +43,7 @@ If a decision is about *how the Risk Map content model is shaped*, it belongs in | [025](025-testing-strategy.md) | Testing strategy and posture across Python, site JS, schemas, and infrastructure | Accepted | 2026-05-05 | | [026](026-issue-template-domain.md) | Issue-template domain — generator scope, schema-derived enums, and ADR-content alignment contract | Accepted | 2026-05-20 | | [027](027-framework-versioning-and-mapping-convention.md) | Per-mapping framework version pinning | Accepted | 2026-05-22 | +| [028](028-prose-linter-bracket-matching-architecture.md) | Prose-linter emphasis enforcement via bracket-matching depth pass over a Token-shape contract | Draft | 2026-05-27 | ## Conventions From 025e0594e1fda5763865518828a9d89c7160b03e Mon Sep 17 00:00:00 2001 From: davidlabianca Date: Thu, 28 May 2026 21:51:56 +0000 Subject: [PATCH 2/9] This commit flips ADR-028 from Draft to Accepted after Spike S3 confirms the D-Open-21 lock Spike S3 (underscore-italic corpus impact probe) ran the current and proposed whitespace-flanked _RE_ITALIC_UNDERSCORE over all 569 prose fields in the four content YAMLs: zero tokenization changes. Probe validated non-blind against the home_bar and foo_baz motivating case. D-Open-21 lock confirmed; first ADR-028 section 9.3 outcome (empty diff). 
Co-authored-by: AI Assistant --- docs/adr/028-prose-linter-bracket-matching-architecture.md | 6 +++--- docs/adr/README.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/adr/028-prose-linter-bracket-matching-architecture.md b/docs/adr/028-prose-linter-bracket-matching-architecture.md index e5c4e44f..0c596809 100644 --- a/docs/adr/028-prose-linter-bracket-matching-architecture.md +++ b/docs/adr/028-prose-linter-bracket-matching-architecture.md @@ -1,7 +1,7 @@ # ADR-028: Prose-linter emphasis enforcement via bracket-matching depth pass over a Token-shape contract -**Status:** Draft -**Date:** 2026-05-27 +**Status:** Accepted +**Date:** 2026-05-27 (Draft); 2026-05-28 (Accepted) **Authors:** Architect agent, with maintainer review --- @@ -203,7 +203,7 @@ The two reason strings — `"nested emphasis"` and `"emphasis-wrapped sentinel"` **Follow-up** -- **Underscore-italic corpus diff — the lockdown validation gate.** Before maintainer flips Status: Draft → Accepted, a one-off script runs both the current `_RE_ITALIC_UNDERSCORE` and the proposed whitespace-flanked variant against all four content YAMLs (`risks.yaml`, `controls.yaml`, `components.yaml`, `personas.yaml`) and emits a diff listing every prose field whose tokenization changes (e.g., `::: was ITALIC("_X Y_"), now TEXT`). Two outcomes: diff is empty or only matches expected snake_case false positives → lock confirmed; diff includes authorially-intended italic spans → maintainer reopens the underscore decision before the ADR is Accepted. +- **Underscore-italic corpus diff — the lockdown validation gate.** Before maintainer flips Status: Draft → Accepted, a one-off script runs both the current `_RE_ITALIC_UNDERSCORE` and the proposed whitespace-flanked variant against all four content YAMLs (`risks.yaml`, `controls.yaml`, `components.yaml`, `personas.yaml`) and emits a diff listing every prose field whose tokenization changes (e.g., `::: was ITALIC("_X Y_"), now TEXT`). 
Two outcomes: diff is empty or only matches expected snake_case false positives → lock confirmed; diff includes authorially-intended italic spans → maintainer reopens the underscore decision before the ADR is Accepted. **Resolved 2026-05-28:** probe run over all 569 prose fields in the four content YAMLs reported **zero** tokenization changes (first outcome). The probe was validated as non-blind — `home_bar and foo_baz` flips ITALIC→TEXT as designed while genuine whitespace-flanked italics and `__double__` are preserved. D-Open-21 lock confirmed; Status flipped Draft → Accepted. - **Triple-asterisk tokenizer probe.** Captures the tokenizer's output for `***foo***`, `****`, `*****foo*****`, `**foo***`, `***foo**`, and `***`. Feeds the classifier test fixtures. Runs as part of the testing agent's RED phase, parallel to the SWE implementation. - **Live-corpus regression baseline.** Captures the current zero-diagnostic state of `validate-yaml-prose-subset` against the four content YAMLs as a `pytest.mark.live_corpus` test. Gates the post-migration regression check; the new linter must produce the same zero-diagnostic result. - **Fixture migration — additive only.** Add 6-10 new emphasis-shape fixtures under (likely) `scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/` or equivalent, covering `shape="open"`, `shape="close"`, depth-3 nested input asserted multi-token, and italic variants per delimiter. The existing five emphasis-bearing fixtures stay byte-identical; they implicitly assert `shape="complete"` via the tokenizer's emission and the projection's drop-of-`shape`. The `_tokens_to_dicts` projection at [`test_prose_tokens.py:203`](https://github.com/cosai-oasis/secure-ai-tooling/blob/7320136/scripts/hooks/tests/test_prose_tokens.py#L203) stays at `{"kind", "value"}`; tests that need to assert `shape` use per-test attribute checks. 
diff --git a/docs/adr/README.md b/docs/adr/README.md index b0f23086..8a0a9ea3 100644 --- a/docs/adr/README.md +++ b/docs/adr/README.md @@ -43,7 +43,7 @@ If a decision is about *how the Risk Map content model is shaped*, it belongs in | [025](025-testing-strategy.md) | Testing strategy and posture across Python, site JS, schemas, and infrastructure | Accepted | 2026-05-05 | | [026](026-issue-template-domain.md) | Issue-template domain — generator scope, schema-derived enums, and ADR-content alignment contract | Accepted | 2026-05-20 | | [027](027-framework-versioning-and-mapping-convention.md) | Per-mapping framework version pinning | Accepted | 2026-05-22 | -| [028](028-prose-linter-bracket-matching-architecture.md) | Prose-linter emphasis enforcement via bracket-matching depth pass over a Token-shape contract | Draft | 2026-05-27 | +| [028](028-prose-linter-bracket-matching-architecture.md) | Prose-linter emphasis enforcement via bracket-matching depth pass over a Token-shape contract | Accepted | 2026-05-28 | ## Conventions From cc7df7de728030fae54bfe7d10e4406e48e5ce1c Mon Sep 17 00:00:00 2001 From: davidlabianca Date: Fri, 29 May 2026 15:54:54 +0000 Subject: [PATCH 3/9] This commit adds failing tests for the ADR-028 Token.shape contract and depth-counter prose linter (RED) Authors the RED-phase test suite for Accepted ADR-028 ahead of implementation (ADR-025 D2 TDD chain). New coverage: - TestEmphasisShapeClassification + TestTokenShapeField (test_prose_tokens.py): the Token.shape contract (D1-D3) and _classify_emphasis_shape cell-grid (D-Open-16), incl. the D-Open-7 both-edges-whitespace "open" convention and the ADR-025 D10 wire-up (shape inspected on tokenize() output). - TestIntrawordUnderscoreRejection (test_prose_tokens.py): the D-Open-21 whitespace-flanked underscore tightening (intraword \S_\S is not italic; boundary/whitespace-flanked _italic_ still is; __double__ stays TEXT). 
- TestNestedEmphasisRejection + TestEmphasisDiagnosticFormat (test_validate_yaml_prose_subset.py): the D5 depth-counter walk + the two predicates, with the faux-depth false-positive guard, and the D6 byte-for-byte diagnostic format / exact reason strings. - TestTripleAsteriskProbe (S1) and TestLiveCorpusBaseline (S2, live_corpus marker): the Spike S1 tokenizer ground-truth lock and the zero-diagnostic corpus regression baseline gating the GREEN pass. 35 new tests fail for the right reason (missing Token.shape field, absent _classify_emphasis_shape, current intraword-underscore match, and the not-yet-built emphasis rejection in check_prose_field); the pre-existing tokenizer and prose-subset suites stay green and unmodified. Co-authored-by: AI Assistant --- pyproject.toml | 1 + scripts/hooks/tests/test_prose_tokens.py | 627 ++++++++++++++++++ .../tests/test_validate_yaml_prose_subset.py | 484 ++++++++++++++ 3 files changed, 1112 insertions(+) diff --git a/pyproject.toml b/pyproject.toml index 17b7a1eb..ecfc3239 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -2,6 +2,7 @@ pythonpath = ["scripts/hooks"] markers = [ "slow: marks tests as slow (deselect with '-m \"not slow\"')", + "live_corpus: marks tests that read the live risk-map YAML corpus (deselect with '-m \"not live_corpus\"')", ] [tool.coverage.run] diff --git a/scripts/hooks/tests/test_prose_tokens.py b/scripts/hooks/tests/test_prose_tokens.py index e872253b..4e6bb805 100644 --- a/scripts/hooks/tests/test_prose_tokens.py +++ b/scripts/hooks/tests/test_prose_tokens.py @@ -1932,3 +1932,630 @@ def test_newline_only(self): assert not token.kind.name.startswith("INVALID"), ( f"Bare newline should not produce INVALID token, got {token!r}" ) + + +# =========================================================================== +# TestTripleAsteriskProbe (Spike S1 — descriptive locks, GREEN now) +# =========================================================================== +# These tests lock the ground-truth token
streams for six triple-asterisk +# inputs. They describe what the CURRENT tokenizer emits (before the SWE +# pass), so they are green now and must remain green after the SWE pass +# (the tokenizer's emphasis emission sites are being extended, not changed). +# Feeds into TestEmphasisShapeClassification fixture grounding. +# =========================================================================== + + +class TestTripleAsteriskProbe: + r""" + Lock current tokenizer output for triple-/multi-asterisk edge-case inputs. + + ADR-028 §9.3 Spike S1 — run empirically before classifier tests are written + to ensure the _classify_emphasis_shape helper handles each case consistently. + + All assertions are descriptive: they record the observed stream. They are + GREEN against the current tokenizer and must stay GREEN after the SWE pass. + """ + + def test_triple_star_foo_tokenizes_as_bold_plus_trailing_text(self): + """ + Given: '***foo***' + When: tokenize() is called + Then: BOLD('***foo**') + TEXT('*') + + The non-greedy _RE_BOLD closes at the first '**' after open, consuming + '***foo**' (interior='*foo'). The final lone '*' is TEXT. + Shape (post-SWE): the BOLD interior '*foo' has no edge whitespace -> + _classify_emphasis_shape yields 'complete'. + """ + _require_module() + tokens = tokenize("***foo***") + assert len(tokens) == 2, f"Expected 2 tokens, got {tokens!r}" + assert tokens[0].kind == TokenKind.BOLD + assert tokens[0].value == "***foo**" + assert tokens[1].kind == TokenKind.TEXT + assert tokens[1].value == "*" + + def test_four_stars_tokenizes_as_italic_plus_trailing_text(self): + """ + Given: '****' + When: tokenize() is called + Then: ITALIC('***') + TEXT('*') + + Rule 12 (bold) checks ch=='*' and text[i+1]=='*', tries _RE_BOLD which + needs at least one inner char between '**...**'. '****' has interior='' + which fails (.+?). Rule 13 (italic asterisk) then fires on the lone '*' + and _RE_ITALIC_ASTERISK matches '***' (interior='*'). 
The remaining '*' + is TEXT. + Shape (post-SWE): ITALIC interior '*' has no edge whitespace -> 'complete'. + """ + _require_module() + tokens = tokenize("****") + assert len(tokens) == 2, f"Expected 2 tokens, got {tokens!r}" + assert tokens[0].kind == TokenKind.ITALIC + assert tokens[0].value == "***" + assert tokens[1].kind == TokenKind.TEXT + assert tokens[1].value == "*" + + def test_five_star_foo_tokenizes_as_bold_text_bold(self): + """ + Given: '*****foo*****' + When: tokenize() is called + Then: BOLD('*****') + TEXT('foo') + BOLD('*****') + + _RE_BOLD non-greedy: matches '**' open, then (.+?) closes at first '**'. + Given '*****foo*****', opening at i=0, the first '**' close is at i=3 + (positions 3-4), consuming '*****' (interior='*'). TEXT('foo'). + Then BOLD('*****') again. + Shape (post-SWE): interior '*' has no edge whitespace -> 'complete'. + """ + _require_module() + tokens = tokenize("*****foo*****") + assert len(tokens) == 3, f"Expected 3 tokens, got {tokens!r}" + assert tokens[0].kind == TokenKind.BOLD + assert tokens[0].value == "*****" + assert tokens[1].kind == TokenKind.TEXT + assert tokens[1].value == "foo" + assert tokens[2].kind == TokenKind.BOLD + assert tokens[2].value == "*****" + + def test_bold_then_one_star_tokenizes_as_bold_plus_text(self): + """ + Given: '**foo***' + When: tokenize() is called + Then: BOLD('**foo**') + TEXT('*') + + Standard bold match; trailing lone '*' is TEXT. + Shape (post-SWE): interior 'foo' has no edge whitespace -> 'complete'. + """ + _require_module() + tokens = tokenize("**foo***") + assert len(tokens) == 2, f"Expected 2 tokens, got {tokens!r}" + assert tokens[0].kind == TokenKind.BOLD + assert tokens[0].value == "**foo**" + assert tokens[1].kind == TokenKind.TEXT + assert tokens[1].value == "*" + + def test_one_star_then_bold_tokenizes_as_single_bold(self): + """ + Given: '***foo**' + When: tokenize() is called + Then: BOLD('***foo**') — single token + + _RE_BOLD opens at '**' (positions 0-1), (.+?)
matches '*foo' (leading + '*' is inner content), closes at '**' (positions 6-7). Full span is + '***foo**'. Interior is '*foo'; no edge whitespace -> 'complete'. + """ + _require_module() + tokens = tokenize("***foo**") + assert len(tokens) == 1, f"Expected 1 token, got {tokens!r}" + assert tokens[0].kind == TokenKind.BOLD + assert tokens[0].value == "***foo**" + + def test_three_stars_alone_tokenizes_as_italic(self): + """ + Given: '***' + When: tokenize() is called + Then: ITALIC('***') — single token + + Bold rule fails (no closing '**' after inner content). Italic-asterisk + rule succeeds: _RE_ITALIC_ASTERISK matches '*...*' = '***' (interior='*'). + Shape (post-SWE): interior '*' has no edge whitespace -> 'complete'. + """ + _require_module() + tokens = tokenize("***") + assert len(tokens) == 1, f"Expected 1 token, got {tokens!r}" + assert tokens[0].kind == TokenKind.ITALIC + assert tokens[0].value == "***" + + +# =========================================================================== +# TestEmphasisShapeClassification (RED — _classify_emphasis_shape not yet) +# =========================================================================== +# These tests import _classify_emphasis_shape from precommit._prose_tokens. +# The helper does not exist until the SWE pass, so the import block below +# uses the same lazy-guard pattern as the module's top-level _IMPORT_ERROR +# guard: a collection-time import failure sets _CLASSIFY_IMPORT_ERROR and +# each test calls _require_classify() to fail with an assertion error rather +# than a collection crash.
+# =========================================================================== + +_CLASSIFY_IMPORT_ERROR: ImportError | None = None +try: + from precommit._prose_tokens import _classify_emphasis_shape # noqa: E402 +except ImportError as _ce: + _CLASSIFY_IMPORT_ERROR = _ce + _classify_emphasis_shape = None # type: ignore[assignment] + + +def _require_classify() -> None: + """Fail with assertion if _classify_emphasis_shape could not be imported.""" + if _CLASSIFY_IMPORT_ERROR is not None: + pytest.fail( + f"_classify_emphasis_shape not yet importable from precommit._prose_tokens.\n" + f"Original error: {_CLASSIFY_IMPORT_ERROR}\n" + "This test is RED until the SWE pass adds the helper." + ) + + +class TestEmphasisShapeClassification: + r""" + Cell-grid tests for _classify_emphasis_shape(span, delim) -> str. + + ADR-028 D3 shape rules, D-Open-16 (12-14 cases), D-Open-7 (both-edges -> 'open'). + + Grid: delimiter {'**', '*', '_'} x shape {'complete', 'open', 'close'} + plus edge cases from Spike S1 (triple-asterisk inputs). All tests are RED + until the SWE pass adds _classify_emphasis_shape to _prose_tokens.py. + + No `neutral`-return case is tested because the classifier only runs on + matched emphasis spans; non-emphasis tokens carry shape='neutral' by + construction in the tokenizer (not via the classifier). + + Shape rules (from ADR-028 D3 table): + - Neither edge has whitespace in interior -> 'complete' + - Only trailing whitespace in interior -> 'open' + - Only leading whitespace in interior -> 'close' + - Both edges whitespace -> 'open' (convention, D-Open-7) + - Single whitespace-only interior (\n) -> 'open' + """ + + # --- BOLD '**' delimiter --- + + def test_bold_complete_no_edge_whitespace(self): + """ + Given: span='**foo**', delim='**' + When: _classify_emphasis_shape is called + Then: returns 'complete' + + Interior is 'foo'; neither first nor last char is whitespace. 
+ """ + _require_classify() + assert _classify_emphasis_shape("**foo**", "**") == "complete" + + def test_bold_complete_with_internal_space_not_at_edge(self): + """ + Given: span='**foo bar**', delim='**' + When: _classify_emphasis_shape is called + Then: returns 'complete' + + Interior is 'foo bar'; edge chars are 'f' and 'r' — neither is whitespace. + Only edge whitespace is the signal (ADR-028 D3). + """ + _require_classify() + assert _classify_emphasis_shape("**foo bar**", "**") == "complete" + + def test_bold_open_trailing_whitespace(self): + """ + Given: span='**foo **', delim='**' + When: _classify_emphasis_shape is called + Then: returns 'open' + + Interior is 'foo '; last char is space -> trailing edge whitespace -> + the non-greedy regex closed on what the author intended as an inner open. + """ + _require_classify() + assert _classify_emphasis_shape("**foo **", "**") == "open" + + def test_bold_close_leading_whitespace(self): + """ + Given: span='** bar**', delim='**' + When: _classify_emphasis_shape is called + Then: returns 'close' + + Interior is ' bar'; first char is space -> leading edge whitespace -> + this is the trailing half of an early-closed greedy match. + """ + _require_classify() + assert _classify_emphasis_shape("** bar**", "**") == "close" + + def test_bold_both_edges_whitespace_is_open(self): + """ + Given: span='** foo **', delim='**' + When: _classify_emphasis_shape is called + Then: returns 'open' + + Interior is ' foo '; both edges whitespace. D-Open-7 convention: + leading-whitespace test fires first -> 'open'. Consistent with the + wrapped-sentinel case '**\\n{{ref:x}}\\n**' rendering as 'open'. 
+ """ + _require_classify() + assert _classify_emphasis_shape("** foo **", "**") == "open" + + def test_bold_newline_interior_is_open(self): + """ + Given: span='**\\n**', delim='**' + When: _classify_emphasis_shape is called + Then: returns 'open' + + Interior is '\\n'; \\n.isspace() is True -> both-edges convention + fires (both leading and trailing are the same whitespace char) -> 'open'. + ADR-028 D3 table: '**\\n**' -> 'open'. + """ + _require_classify() + assert _classify_emphasis_shape("**\n**", "**") == "open" + + # --- ITALIC asterisk '*' delimiter --- + + def test_italic_asterisk_complete(self): + """ + Given: span='*foo*', delim='*' + When: _classify_emphasis_shape is called + Then: returns 'complete' + """ + _require_classify() + assert _classify_emphasis_shape("*foo*", "*") == "complete" + + def test_italic_asterisk_open_trailing_space(self): + """ + Given: span='*foo *', delim='*' + When: _classify_emphasis_shape is called + Then: returns 'open' + """ + _require_classify() + assert _classify_emphasis_shape("*foo *", "*") == "open" + + def test_italic_asterisk_close_leading_space(self): + """ + Given: span='* bar*', delim='*' + When: _classify_emphasis_shape is called + Then: returns 'close' + """ + _require_classify() + assert _classify_emphasis_shape("* bar*", "*") == "close" + + # --- ITALIC underscore '_' delimiter --- + + def test_italic_underscore_complete(self): + """ + Given: span='_foo_', delim='_' + When: _classify_emphasis_shape is called + Then: returns 'complete' + """ + _require_classify() + assert _classify_emphasis_shape("_foo_", "_") == "complete" + + def test_italic_underscore_open_trailing_space(self): + """ + Given: span='_foo _', delim='_' + When: _classify_emphasis_shape is called + Then: returns 'open' + """ + _require_classify() + assert _classify_emphasis_shape("_foo _", "_") == "open" + + def test_italic_underscore_close_leading_space(self): + """ + Given: span='_ bar_', delim='_' + When: _classify_emphasis_shape is called + 
Then: returns 'close' + """ + _require_classify() + assert _classify_emphasis_shape("_ bar_", "_") == "close" + + # --- Triple-asterisk edge cases (grounded by Spike S1) --- + + def test_triple_star_bold_interior_no_edge_ws_is_complete(self): + """ + Given: span='***foo**', delim='**' (from '***foo***' S1 probe) + When: _classify_emphasis_shape is called + Then: returns 'complete' + + Interior is '*foo'; first char '*' is not whitespace, last char 'o' + is not whitespace -> 'complete'. + """ + _require_classify() + assert _classify_emphasis_shape("***foo**", "**") == "complete" + + def test_five_star_bold_pure_star_interior_is_complete(self): + """ + Given: span='*****', delim='**' (from '*****foo*****' S1 probe) + When: _classify_emphasis_shape is called + Then: returns 'complete' + + Interior is '*'; '*' is not whitespace -> 'complete'. + """ + _require_classify() + assert _classify_emphasis_shape("*****", "**") == "complete" + + +# =========================================================================== +# TestTokenShapeField (RED — Token.shape field not yet on NamedTuple) +# =========================================================================== +# ADR-028 D1: Token gains shape: Literal['complete','open','close','neutral'] +# as third field with default 'neutral'. These tests call tokenize() and +# inspect .shape on the resulting tokens, satisfying ADR-025 D10 wire-up. +# They fail now (AttributeError: Token has no attribute 'shape'). +# =========================================================================== + + +class TestTokenShapeField: + r""" + Wire-up tests: tokenize() emits tokens whose .shape field carries the + ADR-028 D3 classification. Satisfies ADR-025 D10 (at least one test + that calls tokenize() and reads .shape on the result). + + All tests are RED until the SWE pass adds Token.shape.
+ """ + + def _assert_has_shape(self, token: object) -> None: + """Assert token has a .shape attribute; fail with diagnostic if not.""" + assert hasattr(token, "shape"), ( + f"Token {token!r} has no .shape attribute. RED: this test passes after the SWE pass adds Token.shape." + ) + + def test_simple_bold_has_complete_shape(self): + """ + Given: tokenize('**foo**') + When: .shape is read on the first token + Then: shape == 'complete' + + ADR-028 D3: interior 'foo' has no edge whitespace -> 'complete'. + """ + _require_module() + tokens = tokenize("**foo**") + assert len(tokens) == 1 + self._assert_has_shape(tokens[0]) + assert tokens[0].shape == "complete", f"Expected shape='complete' for '**foo**', got {tokens[0].shape!r}" + + def test_simple_italic_asterisk_has_complete_shape(self): + """ + Given: tokenize('*foo*') + When: .shape is read on the first token + Then: shape == 'complete' + """ + _require_module() + tokens = tokenize("*foo*") + assert len(tokens) == 1 + self._assert_has_shape(tokens[0]) + assert tokens[0].shape == "complete" + + def test_simple_italic_underscore_has_complete_shape(self): + """ + Given: tokenize(' _foo_ ') (whitespace-flanked for current regex to match) + When: .shape is read on the ITALIC token + Then: shape == 'complete' + """ + _require_module() + tokens = tokenize(" _foo_ ") + italic_tokens = [t for t in tokens if t.kind == TokenKind.ITALIC] + assert len(italic_tokens) == 1, f"Expected one ITALIC token, got {tokens!r}" + self._assert_has_shape(italic_tokens[0]) + assert italic_tokens[0].shape == "complete" + + def test_nested_bold_shapes_are_open_neutral_close(self): + """ + Given: tokenize('**foo **nested** bar**') + When: .shape is read on the 3 resulting tokens + Then: shapes are ['open', 'neutral', 'close'] + + The three tokens are BOLD('**foo **'), TEXT('nested'), BOLD('** bar**'). + BOLD('**foo **'): interior 'foo ' has trailing whitespace -> 'open'. + TEXT('nested'): non-emphasis token -> 'neutral'. 
+ BOLD('** bar**'): interior ' bar' has leading whitespace -> 'close'. + """ + _require_module() + tokens = tokenize("**foo **nested** bar**") + assert len(tokens) == 3 + for t in tokens: + self._assert_has_shape(t) + assert tokens[0].shape == "open", f"Expected 'open', got {tokens[0].shape!r}" + assert tokens[1].shape == "neutral", f"Expected 'neutral', got {tokens[1].shape!r}" + assert tokens[2].shape == "close", f"Expected 'close', got {tokens[2].shape!r}" + + def test_all_non_emphasis_tokens_carry_neutral_shape(self): + """ + Given: tokenize('hello {{riskFoo}} world') + When: .shape is read on every token + Then: every non-BOLD/ITALIC token has shape == 'neutral' + + ADR-028 D3 invariant 4: non-emphasis tokens carry shape='neutral'. + """ + _require_module() + tokens = tokenize("hello {{riskFoo}} world") + for t in tokens: + self._assert_has_shape(t) + if t.kind not in (TokenKind.BOLD, TokenKind.ITALIC): + assert t.shape == "neutral", f"Non-emphasis token {t!r} must have shape='neutral', got {t.shape!r}" + + def test_invalid_token_carries_neutral_shape(self): + """ + Given: tokenize('See https://example.com') + When: .shape is read on the INVALID_URL token + Then: shape == 'neutral' + + All INVALID_* tokens are non-emphasis -> shape='neutral'. 
+ """ + _require_module() + tokens = tokenize("See https://example.com") + url_tokens = [t for t in tokens if t.kind == TokenKind.INVALID_URL] + assert len(url_tokens) >= 1 + for t in url_tokens: + self._assert_has_shape(t) + assert t.shape == "neutral", f"INVALID_URL must have shape='neutral', got {t.shape!r}" + + def test_sentinel_token_carries_neutral_shape(self): + """ + Given: tokenize('{{riskPromptInjection}}') + When: .shape is read on the SENTINEL_INTRA token + Then: shape == 'neutral' + """ + _require_module() + tokens = tokenize("{{riskPromptInjection}}") + assert len(tokens) == 1 + assert tokens[0].kind == TokenKind.SENTINEL_INTRA + self._assert_has_shape(tokens[0]) + assert tokens[0].shape == "neutral" + + def test_text_token_carries_neutral_shape(self): + """ + Given: tokenize('plain prose text') + When: .shape is read on the TEXT token + Then: shape == 'neutral' + """ + _require_module() + tokens = tokenize("plain prose text") + assert len(tokens) == 1 + assert tokens[0].kind == TokenKind.TEXT + self._assert_has_shape(tokens[0]) + assert tokens[0].shape == "neutral" + + def test_both_edges_whitespace_bold_has_open_shape(self): + """ + Given: tokenize('** foo **') + When: .shape is read on the BOLD token + Then: shape == 'open' + + D-Open-7: both-edges-whitespace -> 'open' by convention. + """ + _require_module() + tokens = tokenize("** foo **") + bold_tokens = [t for t in tokens if t.kind == TokenKind.BOLD] + assert len(bold_tokens) == 1, f"Expected 1 BOLD, got {tokens!r}" + self._assert_has_shape(bold_tokens[0]) + assert bold_tokens[0].shape == "open", f"D-Open-7: both-edges-ws -> 'open', got {bold_tokens[0].shape!r}" + + +# =========================================================================== +# TestIntrawordUnderscoreRejection (RED — D-Open-21, regex not yet tightened) +# =========================================================================== +# ADR-028 D3 invariant 3: intraword \S_\S does NOT qualify as an italic +# delimiter. 
The current _RE_ITALIC_UNDERSCORE uses only adjacent-underscore +# negative lookahead (so '__' is excluded) but does NOT exclude non-whitespace +# flanking. After the SWE pass tightens the regex, these tests go GREEN. +# =========================================================================== + + +class TestIntrawordUnderscoreRejection: + r""" + Tests for D-Open-21: tightened _RE_ITALIC_UNDERSCORE. + + After the SWE pass, the regex requires whitespace-or-boundary flanking on + the opening '_' (left side) and closing '_' (right side). Intraword + underscore pairs like 'home_bar and foo_baz' must NOT tokenize as ITALIC. + + Tests that assert NO ITALIC are RED now (current regex produces ITALIC). + Tests that assert ITALIC remains for whitespace-flanked forms are GREEN now. + """ + + def test_intraword_underscore_pair_produces_no_italic(self): + """ + Given: 'home_bar and foo_baz' + When: tokenize() is called (after SWE tightens _RE_ITALIC_UNDERSCORE) + Then: NO ITALIC token in the stream + + Current behavior (RED): tokenizer matches '_bar and foo_' as ITALIC. + Expected behavior (GREEN after SWE): all tokens are TEXT. + """ + _require_module() + tokens = tokenize("home_bar and foo_baz") + italic_tokens = [t for t in tokens if t.kind == TokenKind.ITALIC] + assert len(italic_tokens) == 0, ( + f"D-Open-21: intraword '_' must NOT produce ITALIC. " + f"Got ITALIC tokens: {italic_tokens!r}. " + "RED until SWE tightens _RE_ITALIC_UNDERSCORE." + ) + + def test_adjacent_intraword_underscore_produces_no_italic(self): + """ + Given: 'a_b_c' (both underscores intraword) + When: tokenize() is called (after SWE tightens _RE_ITALIC_UNDERSCORE) + Then: NO ITALIC token — entire string is TEXT + + Current behavior (RED): '_b_' is emitted as ITALIC. + """ + _require_module() + tokens = tokenize("a_b_c") + italic_tokens = [t for t in tokens if t.kind == TokenKind.ITALIC] + assert len(italic_tokens) == 0, ( + f"D-Open-21: 'a_b_c' must produce no ITALIC. Got: {italic_tokens!r}. 
RED until SWE tightens regex." + ) + + def test_whitespace_flanked_underscore_italic_still_tokenizes(self): + """ + Given: 'prefix _italic_ suffix' (whitespace on both sides) + When: tokenize() is called + Then: ITALIC('_italic_') is still produced + + This case is GREEN now and must remain GREEN after the SWE pass. + Whitespace flanking is the canonical permitted form. + """ + _require_module() + tokens = tokenize("prefix _italic_ suffix") + italic_tokens = [t for t in tokens if t.kind == TokenKind.ITALIC] + assert len(italic_tokens) == 1, f"Expected 1 ITALIC for whitespace-flanked '_italic_', got {tokens!r}" + assert italic_tokens[0].value == "_italic_" + + def test_string_boundary_flanked_underscore_italic_tokenizes(self): + """ + Given: '_foo_' (start/end-of-string flanking) + When: tokenize() is called + Then: ITALIC('_foo_') is produced + + Start-of-string counts as whitespace-or-boundary flanking per + D-Open-21.2(a) cosai-simpler form. GREEN now and after SWE. + """ + _require_module() + tokens = tokenize("_foo_") + assert len(tokens) == 1, f"Expected 1 token, got {tokens!r}" + assert tokens[0].kind == TokenKind.ITALIC + assert tokens[0].value == "_foo_" + + def test_end_of_string_close_flanked_underscore_italic_tokenizes(self): + """ + Given: 'some text _foo_' (end-of-string after close underscore) + When: tokenize() is called + Then: ITALIC('_foo_') is produced + + D-Open-21.2(a): end-of-string after the closing '_' qualifies as a + word boundary and therefore satisfies the whitespace-or-boundary + flanking requirement on the close side. This is GREEN now (the current + tokenizer produces the ITALIC token) and must stay GREEN after the SWE + underscore-tightening pass. 
+ + Empirically verified (2026-05-29): + tokenize('some text _foo_') -> + TEXT('some text ') + ITALIC('_foo_') + """ + _require_module() + tokens = tokenize("some text _foo_") + italic_tokens = [t for t in tokens if t.kind == TokenKind.ITALIC] + assert len(italic_tokens) == 1, ( + f"Expected 1 ITALIC token for end-of-string close-flanked '_foo_', got {tokens!r}" + ) + assert italic_tokens[0].value == "_foo_", f"Expected ITALIC value '_foo_', got {italic_tokens[0].value!r}" + + def test_double_underscore_still_produces_no_italic_or_bold(self): + """ + Given: '__double__' + When: tokenize() is called + Then: no BOLD and no ITALIC token + + ADR-017 D1: __bold__ is NOT recognized. This must hold before and + after the SWE pass. GREEN now. + """ + _require_module() + tokens = tokenize("__double__") + assert all(t.kind not in (TokenKind.BOLD, TokenKind.ITALIC) for t in tokens), ( + f"'__double__' must produce neither BOLD nor ITALIC. Got: {tokens!r}" + ) diff --git a/scripts/hooks/tests/test_validate_yaml_prose_subset.py b/scripts/hooks/tests/test_validate_yaml_prose_subset.py index d305a0bb..c03a8ec3 100644 --- a/scripts/hooks/tests/test_validate_yaml_prose_subset.py +++ b/scripts/hooks/tests/test_validate_yaml_prose_subset.py @@ -2023,3 +2023,487 @@ def test_flat_array_diagnostic_emits_single_bracket_only(self, tmp_path, capsys) assert not re.search(r"shortDescription\[\d+\]\[\d+\]:", line), ( f"Flat-array line must not contain double brackets: {line!r}" ) + + +# =========================================================================== +# TestLiveCorpusBaseline (Spike S2 — GREEN now, must stay GREEN after SWE) +# =========================================================================== + + +@pytest.mark.live_corpus +class TestLiveCorpusBaseline: + r""" + Spike S2: live-corpus regression baseline for the prose-subset linter. + + Asserts that the current linter produces ZERO diagnostics across the four + content YAMLs in --block mode. 
This test is GREEN now and must remain + GREEN after the SWE pass (the new emphasis-rejection rules must not flag + anything in the corpus — confirmed by the Spike S3 probe before ADR-028 + was flipped to Accepted). + + ADR-028 §9.3 Spike S2 — gates Phase 5 regression check. + """ + + _REPO_ROOT = Path(__file__).resolve().parent.parent.parent.parent + _YAML_DIR = _REPO_ROOT / "risk-map" / "yaml" + _CONTENT_YAMLS = ["risks.yaml", "controls.yaml", "components.yaml", "personas.yaml"] + + def _run_block(self, yaml_file: str) -> "subprocess.CompletedProcess[str]": + """Run the linter in --block mode on a single content YAML.""" + import subprocess as _sp + + script = Path(__file__).parent.parent / "precommit" / "validate_yaml_prose_subset.py" + return _sp.run( + [sys.executable, str(script), "--block", str(self._YAML_DIR / yaml_file)], + capture_output=True, + text=True, + ) + + def test_risks_yaml_produces_zero_diagnostics(self): + """ + Given: risk-map/yaml/risks.yaml (current corpus) + When: validate_yaml_prose_subset --block is run + Then: exits 0 (zero diagnostics) + + Baseline captured 2026-05-28. Must hold after SWE pass. 
+ """ + result = self._run_block("risks.yaml") + assert result.returncode == 0, f"risks.yaml produced diagnostics:\n{result.stderr}" + assert result.stderr.strip() == "", f"risks.yaml produced unexpected stderr:\n{result.stderr}" + + def test_controls_yaml_produces_zero_diagnostics(self): + """ + Given: risk-map/yaml/controls.yaml (current corpus) + When: validate_yaml_prose_subset --block is run + Then: exits 0 (zero diagnostics) + """ + result = self._run_block("controls.yaml") + assert result.returncode == 0, f"controls.yaml produced diagnostics:\n{result.stderr}" + assert result.stderr.strip() == "", f"controls.yaml produced unexpected stderr:\n{result.stderr}" + + def test_components_yaml_produces_zero_diagnostics(self): + """ + Given: risk-map/yaml/components.yaml (current corpus) + When: validate_yaml_prose_subset --block is run + Then: exits 0 (zero diagnostics) + """ + result = self._run_block("components.yaml") + assert result.returncode == 0, f"components.yaml produced diagnostics:\n{result.stderr}" + assert result.stderr.strip() == "", f"components.yaml produced unexpected stderr:\n{result.stderr}" + + def test_personas_yaml_produces_zero_diagnostics(self): + """ + Given: risk-map/yaml/personas.yaml (current corpus) + When: validate_yaml_prose_subset --block is run + Then: exits 0 (zero diagnostics) + """ + result = self._run_block("personas.yaml") + assert result.returncode == 0, f"personas.yaml produced diagnostics:\n{result.stderr}" + assert result.stderr.strip() == "", f"personas.yaml produced unexpected stderr:\n{result.stderr}" + + +# =========================================================================== +# TestNestedEmphasisRejection (RED — depth-counter linter not yet) +# =========================================================================== +# These tests assert the ADR-028 D5 depth-counter walk. check_prose_field() +# currently has no emphasis logic, so all these produce zero diagnostics now. +# After the SWE pass they go GREEN. 
+# =========================================================================== + + +class TestNestedEmphasisRejection: + r""" + Tests for ADR-028 D5: depth-counter emphasis-rejection walk in check_prose_field. + + Uses the same _make_field() idiom as TestSingleViolationDetection to build + synthetic ProseField objects and call check_prose_field() directly. + + All tests that assert a diagnostic are RED until the SWE pass adds the + depth-counter walk. Tests that assert zero diagnostics are GREEN now and + must stay GREEN (false-positive guard). + """ + + def _make_field( + self, + raw_text: str, + entry_id: str = "riskAlpha", + field_name: str = "shortDescription", + index: int = 0, + ) -> "ProseField": + """Build a ProseField with tokens populated from the tokenizer.""" + import sys as _sys + from pathlib import Path as _Path + + _sys.path.insert(0, str(_Path(__file__).parent.parent / "precommit")) + from precommit._prose_tokens import tokenize as _tok # noqa: PLC0415 + + tokens = _tok(raw_text) + return ProseField( + file_path=_Path("test.yaml"), + entry_id=entry_id, + field_name=field_name, + index=index, + raw_text=raw_text, + tokens=tokens, + ) + + # --- Tests that MUST produce a diagnostic (RED until SWE pass) --- + + def test_nested_bold_produces_one_nested_emphasis_diagnostic(self): + """ + Given: '**foo **nested** bar**' -> [BOLD(open), TEXT, BOLD(close)] + When: check_prose_field is called + Then: exactly ONE diagnostic with reason containing 'nested emphasis' + and snippet "at '** bar**'" (the close token) + + ADR-028 D5: BOLD('**foo **') has shape='open' -> depth 0->1. + BOLD('** bar**') has shape='close' and arrives at depth==1 -> nested emphasis. + + Implementation note — close branch emits the diagnostic: + The D5 pseudocode's close branch shows only `depth -= 1` with no + emit_diagnostic. However, BOLD('** bar**') is the ONLY token in the + stream [open, text, close] that arrives at depth > 0 (depth==1 before + the decrement). 
The correct implementation MUST emit the diagnostic in + the close branch when depth > 0 (checking before decrementing). + ADR-017 D1 requires nested bold to be rejected; THIS TEST is the + authoritative outcome spec. The snippet in the reason is the close + token's value: "at '** bar**'". + + RED: check_prose_field has no emphasis logic yet. + """ + field = self._make_field("**foo **nested** bar**") + diags = check_prose_field(field) + assert len(diags) == 1, ( + f"Expected 1 'nested emphasis' diagnostic, got {len(diags)}: {diags!r}. " + "RED until SWE adds depth-counter walk." + ) + assert "nested emphasis" in diags[0].reason, ( + f"Expected reason containing 'nested emphasis', got {diags[0].reason!r}" + ) + assert "at '** bar**'" in diags[0].reason, ( + f"Expected close-token snippet \"at '** bar**'\" in reason, got {diags[0].reason!r}" + ) + + def test_nested_italic_produces_one_nested_emphasis_diagnostic(self): + """ + Given: '*foo *nested* bar*' -> [ITALIC(open), TEXT, ITALIC(close)] + When: check_prose_field is called + Then: ONE diagnostic with reason containing 'nested emphasis' + + Same depth-counter logic for italic-asterisk delimiter. + RED until SWE pass. + """ + field = self._make_field("*foo *nested* bar*") + diags = check_prose_field(field) + assert len(diags) == 1, ( + f"Expected 1 diagnostic for nested italic, got {len(diags)}: {diags!r}. " + "RED until SWE adds depth-counter walk." + ) + assert "nested emphasis" in diags[0].reason + + def test_italic_after_open_bold_produces_nested_emphasis_diagnostic(self): + """ + Given: '**A ** *B* C**' -> [BOLD(open='**A **'), TEXT(' '), ITALIC(complete='*B*'), TEXT(' C**')] + When: check_prose_field is called + Then: exactly ONE diagnostic with reason containing 'nested emphasis' + + This test covers the 'complete at depth > 0' branch of ADR-028 D5. 
+ + Empirically verified token stream (2026-05-29): + tokenize('**A ** *B* C**') -> + BOLD('**A **') shape='open' -> depth 0->1 + TEXT(' ') shape='neutral' + ITALIC('*B*') shape='complete' -> depth==1 -> emit_diagnostic + TEXT(' C**') shape='neutral' + + The reviewer's suggested string '**A ** *B* C**' was verified to yield + this exact stream. BOLD('**A **') has trailing interior whitespace + ('A ') -> shape='open'; ITALIC('*B*') is a complete-shape token that + arrives at depth==1 after the open bold. The diagnostic fires on the + ITALIC token because it is a complete-emphasis token inside an open span. + + RED until the SWE pass adds the depth-counter walk. + """ + field = self._make_field("**A ** *B* C**") + diags = check_prose_field(field) + assert len(diags) == 1, ( + f"Expected 1 'nested emphasis' diagnostic for complete italic at depth>0, " + f"got {len(diags)}: {diags!r}. RED until SWE adds depth-counter walk." + ) + assert "nested emphasis" in diags[0].reason, ( + f"Expected reason containing 'nested emphasis', got {diags[0].reason!r}" + ) + + def test_emphasis_wrapped_sentinel_intra_produces_diagnostic(self): + """ + Given: '**{{riskPromptInjection}}**' + When: check_prose_field is called + Then: ONE diagnostic with reason containing 'emphasis-wrapped sentinel' + + ADR-028 D5: emphasis-wrapped-sentinel predicate fires when emphasis + token interior (stripped) fullmatches the sentinel inner regex. + RED until SWE pass. + """ + field = self._make_field("**{{riskPromptInjection}}**") + diags = check_prose_field(field) + assert len(diags) == 1, ( + f"Expected 1 'emphasis-wrapped sentinel' diagnostic, got {len(diags)}: {diags!r}. " + "RED until SWE adds emphasis-wrapped-sentinel predicate." 
+ ) + assert "emphasis-wrapped sentinel" in diags[0].reason, ( + f"Expected 'emphasis-wrapped sentinel' in reason, got {diags[0].reason!r}" + ) + + def test_emphasis_wrapped_ref_sentinel_produces_diagnostic(self): + """ + Given: '**{{ref:x}}**' + When: check_prose_field is called + Then: ONE diagnostic with reason containing 'emphasis-wrapped sentinel' + + The wrapped-sentinel predicate applies to both SENTINEL_INTRA and + SENTINEL_REF inner forms — the test strips the delimiter pair and + fullmatches the unified SENTINEL_INNER_RE. + RED until SWE pass. + """ + field = self._make_field("**{{ref:x}}**") + diags = check_prose_field(field) + assert len(diags) == 1, ( + f"Expected 1 diagnostic for '**{{ref:x}}**', got {len(diags)}: {diags!r}. RED until SWE pass." + ) + assert "emphasis-wrapped sentinel" in diags[0].reason + + def test_emphasis_wrapped_sentinel_with_newlines_produces_diagnostic(self): + """ + Given: '**\\n{{ref:x}}\\n**' + When: check_prose_field is called + Then: ONE diagnostic with reason containing 'emphasis-wrapped sentinel' + + ADR-028 D3: '**\\n**' has both-edges whitespace -> shape='open'. + The emphasis-wrapped-sentinel predicate uses .strip() on the interior, + so leading/trailing newlines do not prevent detection. + RED until SWE pass. + """ + field = self._make_field("**\n{{ref:x}}\n**") + diags = check_prose_field(field) + assert len(diags) == 1, ( + f"Expected 1 diagnostic for newline-wrapped sentinel, got {len(diags)}: {diags!r}. RED until SWE pass." + ) + assert "emphasis-wrapped sentinel" in diags[0].reason + + # --- Tests that MUST produce ZERO diagnostics (GREEN now, faux-depth guard) --- + + def test_sibling_complete_bold_spans_produce_zero_diagnostics(self): + """ + Given: '**hello** world **goodbye**' + When: check_prose_field is called + Then: ZERO diagnostics + + ADR-028 D5 faux-depth guard: the two BOLD tokens have shape='complete' + -> depth stays at 0 throughout -> no nested-emphasis diagnostic. 
+ GREEN now and must stay GREEN after SWE pass. + """ + field = self._make_field("**hello** world **goodbye**") + diags = check_prose_field(field) + assert len(diags) == 0, ( + f"Sibling complete bold spans must produce 0 diagnostics (faux-depth guard). Got: {diags!r}" + ) + + def test_sentinel_at_depth_zero_produces_zero_diagnostics(self): + """ + Given: '**hello** world {{ref:x}}' + When: check_prose_field is called + Then: ZERO diagnostics + + The sentinel is at depth==0 (outside any open emphasis). + ADR-028 D5: the emphasis-wrapped-sentinel predicate checks the emphasis + token's interior, not any subsequent sentinel at depth 0. + GREEN now and must stay GREEN after SWE pass. + """ + field = self._make_field("**hello** world {{ref:x}}") + diags = check_prose_field(field) + assert len(diags) == 0, f"Sentinel at depth 0 must produce 0 diagnostics. Got: {diags!r}" + + def test_clean_bold_produces_zero_diagnostics(self): + """ + Given: '**bold**' + When: check_prose_field is called + Then: ZERO diagnostics + + Simple complete-shape bold — no nesting, no sentinel inside. + GREEN now and must stay GREEN. + """ + field = self._make_field("**bold**") + diags = check_prose_field(field) + assert len(diags) == 0, f"Clean bold must produce 0 diagnostics. Got: {diags!r}" + + def test_clean_italic_asterisk_produces_zero_diagnostics(self): + """ + Given: '*italic*' + When: check_prose_field is called + Then: ZERO diagnostics + """ + field = self._make_field("*italic*") + diags = check_prose_field(field) + assert len(diags) == 0, f"Clean italic must produce 0 diagnostics. Got: {diags!r}" + + def test_clean_italic_underscore_produces_zero_diagnostics(self): + """ + Given: '_italic_' at string boundary + When: check_prose_field is called + Then: ZERO diagnostics + """ + field = self._make_field("_italic_") + diags = check_prose_field(field) + assert len(diags) == 0, f"Clean underscore italic must produce 0 diagnostics. 
Got: {diags!r}" + + def test_bold_containing_italic_produces_zero_diagnostics(self): + """ + Given: '**bold *italic* inside**' + When: check_prose_field is called + Then: ZERO diagnostics + + ADR-017 D1: italic inside bold is one permitted nesting level. + The tokenizer emits a single BOLD token for this span (italic-in-bold + is absorbed atomically). No depth-counter violation. + GREEN now and must stay GREEN. + """ + field = self._make_field("**bold *italic* inside**") + diags = check_prose_field(field) + assert len(diags) == 0, f"Bold-with-italic-inside must produce 0 diagnostics. Got: {diags!r}" + + +# =========================================================================== +# TestEmphasisDiagnosticFormat (RED — diagnostic format for new reasons) +# =========================================================================== +# Locks the exact diagnostic format strings for the two new reason constants +# ADR-028 D6: 'nested emphasis' and 'emphasis-wrapped sentinel', plus the +# token-snippet convention ('at <token.value>'). +# =========================================================================== + + +class TestEmphasisDiagnosticFormat: + r""" + Tests for ADR-028 D6: diagnostic format for emphasis violations. + + The ADR-017 D4 format is preserved byte-for-byte: + validate-yaml-prose-subset: <file>:<entry_id>:<field>[<index>]: <reason> + + The <reason> for emphasis violations follows the existing 'at '<token.value>'' + pattern: the reason string ends with "at '<token.value>'" where token.value + is the offending emphasis token's full value (including delimiters). + + All tests are RED until the SWE pass adds the depth-counter walk.
+ """ + + def _make_field( + self, + raw_text: str, + entry_id: str = "riskAlpha", + field_name: str = "shortDescription", + index: int = 0, + ) -> "ProseField": + """Build a ProseField with tokens from the tokenizer.""" + import sys as _sys + from pathlib import Path as _Path + + _sys.path.insert(0, str(_Path(__file__).parent.parent / "precommit")) + from precommit._prose_tokens import tokenize as _tok # noqa: PLC0415 + + tokens = _tok(raw_text) + return ProseField( + file_path=_Path("test.yaml"), + entry_id=entry_id, + field_name=field_name, + index=index, + raw_text=raw_text, + tokens=tokens, + ) + + def test_nested_emphasis_diagnostic_reason_string(self): + """ + Given: '**foo **nested** bar**' triggers nested emphasis + When: check_prose_field produces a Diagnostic + Then: reason starts with 'nested emphasis' and ends with "at '** bar**'" + + ADR-028 D6: reason string is 'nested emphasis' (unchanged); the snippet + convention follows the existing INVALID_* pattern: "at '<token.value>'". + The offending token is the BOLD('** bar**') (the 'close'-shape token at + depth > 0). + RED until SWE pass. + """ + field = self._make_field("**foo **nested** bar**") + diags = check_prose_field(field) + assert len(diags) == 1, f"Expected 1 diagnostic, got {diags!r}. RED until SWE pass." + reason = diags[0].reason + assert reason.startswith("nested emphasis"), f"Reason must start with 'nested emphasis', got {reason!r}" + assert "at '** bar**'" in reason, f"Reason must contain \"at '** bar**'\", got {reason!r}" + + def test_nested_emphasis_format_diagnostic_line(self): + """ + Given: a Diagnostic for nested emphasis + When: format_diagnostic_line is called + Then: output matches ADR-017 D4 format with 'nested emphasis' reason + + Asserts the full committed format string including hook_id prefix. + RED until SWE pass.
+ """ + from precommit._linter_types import format_diagnostic_line # noqa: PLC0415 + + field = self._make_field( + "**foo **nested** bar**", + entry_id="riskAlpha", + field_name="shortDescription", + index=0, + ) + diags = check_prose_field(field) + assert len(diags) == 1, f"Expected 1 diagnostic. RED until SWE pass. Got: {diags!r}" + line = format_diagnostic_line(diags[0]) + # Format: validate-yaml-prose-subset: test.yaml:riskAlpha:shortDescription[0]: nested emphasis at '...' + assert line.startswith("validate-yaml-prose-subset: "), f"Expected hook_id prefix, got {line!r}" + assert "riskAlpha" in line + assert "shortDescription[0]" in line + assert "nested emphasis" in line + assert _DIAG_PATTERN.match(line), f"Diagnostic line does not match committed pattern: {line!r}" + + def test_emphasis_wrapped_sentinel_diagnostic_reason_string(self): + """ + Given: '**{{riskPromptInjection}}**' triggers emphasis-wrapped sentinel + When: check_prose_field produces a Diagnostic + Then: reason starts with 'emphasis-wrapped sentinel' and contains the token value + + ADR-028 D6: reason string is 'emphasis-wrapped sentinel'; snippet is + the full BOLD token value '**{{riskPromptInjection}}**'. + RED until SWE pass. + """ + field = self._make_field("**{{riskPromptInjection}}**") + diags = check_prose_field(field) + assert len(diags) == 1, f"Expected 1 diagnostic. RED until SWE pass. 
Got: {diags!r}" + reason = diags[0].reason + assert reason.startswith("emphasis-wrapped sentinel"), ( + f"Reason must start with 'emphasis-wrapped sentinel', got {reason!r}" + ) + assert "at '**{{riskPromptInjection}}**'" in reason, f"Reason must contain token snippet, got {reason!r}" + + def test_emphasis_wrapped_sentinel_format_diagnostic_line_matches_pattern(self): + """ + Given: a Diagnostic for emphasis-wrapped sentinel + When: format_diagnostic_line is called + Then: output matches the committed _DIAG_PATTERN regex + + Verifies the emphasis violation slots into the existing format contract + without modifying the pattern. RED until SWE pass. + """ + from precommit._linter_types import format_diagnostic_line # noqa: PLC0415 + + field = self._make_field( + "**{{riskPromptInjection}}**", + entry_id="riskBeta", + field_name="shortDescription", + index=1, + ) + diags = check_prose_field(field) + assert len(diags) == 1, f"Expected 1 diagnostic. RED until SWE pass. Got: {diags!r}" + line = format_diagnostic_line(diags[0]) + assert _DIAG_PATTERN.match(line), ( + f"Emphasis-wrapped-sentinel diagnostic does not match committed pattern: {line!r}" + ) From e149737953b6d5c1a1b040f18ec4474d26c0edf1 Mon Sep 17 00:00:00 2001 From: davidlabianca Date: Fri, 29 May 2026 16:02:22 +0000 Subject: [PATCH 4/9] This commit corrects the ADR-028 D5 close-branch erratum to match the predicate prose The close branch lacked the nested-emphasis emit and carried a false depth-floor comment. It now emits when depth > 0 before decrementing and floors with max(0, depth - 1), reconciling the pseudocode with D5's own predicate prose and planning-inventory section 5.2's load-bearing case. Erratum only: no section 8 lock changes; Status stays Accepted. 
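
For orientation, the corrected close branch can be sketched in isolation as below. This is a minimal illustration under stated assumptions, not the linter's code: `SketchToken` and `nested_emphasis_offenders` are hypothetical names, the token carries only `value` and `shape`, and the open branch is assumed to emit under the same `depth > 0` guard as the other two branches.

```python
from typing import Literal, NamedTuple

Shape = Literal["complete", "open", "close", "neutral"]


class SketchToken(NamedTuple):
    """Hypothetical stand-in for the real Token; only value and shape matter here."""

    value: str
    shape: Shape = "neutral"


def nested_emphasis_offenders(tokens: list[SketchToken]) -> list[str]:
    """Return the values of tokens that would draw a 'nested emphasis' diagnostic."""
    offenders: list[str] = []
    depth = 0  # bare integer counter; no kind-stack
    for token in tokens:
        if token.shape == "open":
            if depth > 0:  # an open inside an already-open span
                offenders.append(token.value)
            depth += 1
        elif token.shape == "close":
            if depth > 0:  # checked BEFORE the decrement
                offenders.append(token.value)
            depth = max(0, depth - 1)  # floor: a standalone close must not underflow
        elif token.shape == "complete":
            if depth > 0:  # a complete span inside an open span
                offenders.append(token.value)
        # "neutral" tokens never touch the counter
    return offenders
```

Feeding it the split-token stream for `**foo **nested** bar**`, that is `[SketchToken("**foo **", "open"), SketchToken("nested"), SketchToken("** bar**", "close")]`, yields `['** bar**']`: the single offender is the close token. A lone `SketchToken("** bar**", "close")` yields `[]` and leaves the counter at 0 rather than -1.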
Co-authored-by: AI Assistant --- .../adr/028-prose-linter-bracket-matching-architecture.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/docs/adr/028-prose-linter-bracket-matching-architecture.md b/docs/adr/028-prose-linter-bracket-matching-architecture.md index 0c596809..7bce1971 100644 --- a/docs/adr/028-prose-linter-bracket-matching-architecture.md +++ b/docs/adr/028-prose-linter-bracket-matching-architecture.md @@ -119,7 +119,9 @@ for token in field.tokens: emit_diagnostic(...) depth += 1 elif token.shape == "close": - depth -= 1 # depth-floor enforced by tokenizer invariants + if depth > 0: # close of an unmatched open -> nested emphasis + emit_diagnostic(...) + depth = max(0, depth - 1) # floor: a standalone close-shape (e.g. " ** bar**") would underflow elif token.shape == "complete": if depth > 0: # nested complete-emphasis inside open emphasis emit_diagnostic(...) @@ -131,7 +133,7 @@ for token in field.tokens: The two predicates: -- **Nested-emphasis predicate.** Fires when an emphasis token is encountered with `depth > 0` and a same-family emphasis open is unmatched. Implemented as the `depth > 0` check inline in the walk above. The kind comparison is via `token.kind` directly (a bare counter suffices; a kind-stack is not needed for the current ADR-017 D1 rule, which detects any nesting regardless of delimiter family). +- **Nested-emphasis predicate.** Fires when an emphasis token is encountered with `depth > 0` and a same-family emphasis open is unmatched. Implemented as the `depth > 0` check inline in the walk above, applied on the `open`, `complete`, **and** `close` branches; the `close` branch checks `depth > 0` *before* decrementing, and is the attribution point for the canonical split-token nested case (`**foo **nested** bar**` tokenizes as `[open, text, close]`, and the `close` token at `depth == 1` is where the single diagnostic lands). 
The kind comparison is via `token.kind` directly (a bare counter suffices; a kind-stack is not needed for the current ADR-017 D1 rule, which detects any nesting regardless of delimiter family). - **Emphasis-wrapped-sentinel predicate.** Fires when an emphasis token's interior (its `value` minus the delimiter pair) `.strip()`s to a string that matches `SENTINEL_INNER_RE` (the unified intra-or-ref regex defined once in `_prose_tokens.py` and shared with the references linter). Independent of the depth state. Both predicates are one-line expressions over `token.shape` (or `token.kind` and `token.value`) and the depth state. There are no regex constants in the linter for shape detection. The whitespace-adjacency heuristic that previously lived in `_RE_*_EARLY_CLOSE` has been moved into the tokenizer's `_classify_emphasis_shape` per D3; the linter cannot see it and cannot drift from it. @@ -140,6 +142,8 @@ The walk handles the false-positive patterns a naive stack would faux-depth on. Reason strings (`_REASON_NESTED_EMPHASIS`, `_REASON_EMPHASIS_WRAPPED_SENTINEL`) and the diagnostic format are preserved per D6. +**Addendum (2026-05-29) — erratum.** The original D5 draft omitted the `close`-branch emit and the depth floor from the pseudocode, leaving the pseudocode inconsistent with the Nested-emphasis predicate prose (which has always covered every emphasis token, including `close`-shape) and with planning-inventory §5.2's load-bearing-case analysis of `**foo **nested** bar**` (tokens `[open, text, close]`, where the `close` token at `depth == 1` is the only attribution point). The pseudocode now (1) emits the nested-emphasis diagnostic in the `close` branch when `depth > 0`, checked before the decrement, and (2) floors the decrement with `max(0, depth - 1)` because a standalone leading-space bold classifies as `shape="close"` and would otherwise drive `depth` to `-1` (the prior `# depth-floor enforced by tokenizer invariants` comment was false). 
This is an erratum reconciling the pseudocode with D5's own governing predicate prose; it is **not** a new decision — §5.2 flagged "clarify the predicate" but §8 never created a corresponding locked D-Open decision, so no §8 lock changes. Status remains **Accepted**. + ### D6. Diagnostic conformance ADR-017 D4's diagnostic format spec is preserved byte-for-byte: From cb170b0d4db427c69116882f0eb10327738f846f Mon Sep 17 00:00:00 2001 From: davidlabianca Date: Fri, 29 May 2026 16:16:20 +0000 Subject: [PATCH 5/9] This commit implements the ADR-028 Token.shape contract and depth-counter prose linter (GREEN) Turns the RED suite from cc7df7d green by building the Accepted ADR-028 design. Tokenizer (scripts/hooks/precommit/_prose_tokens.py): - Token NamedTuple gains shape: Literal["complete","open","close","neutral"] as the third field, default "neutral" (D1 / D-Open-1..4); two-positional construction stays valid and equality stays structural. - _classify_emphasis_shape() classifies a matched span from interior edge whitespace via .isspace(), no regex (D3 / D-Open-5..7); both-edges -> "open". - The three emphasis emission sites pass shape= (D3 / Section 5.3). - _RE_ITALIC_UNDERSCORE tightened to require whitespace-or-boundary flanking so intraword \S_\S no longer tokenizes as italic (D3 invariant 3 / D-Open-21, cosai-simpler form); __double__ stays TEXT; asterisk italic untouched. Linter (scripts/hooks/precommit/validate_yaml_prose_subset.py): - check_prose_field gains the D5 single-pass depth-counter walk with a bare integer counter (D-Open-10) and the two one-line predicates; reason strings "nested emphasis" / "emphasis-wrapped sentinel" route through the existing Diagnostic + format_diagnostic_line, byte-for-byte preserved (D6). - The close branch emits when depth > 0 before decrementing, with a max(0, depth-1) floor, per ADR-028 D5 as amended 2026-05-29 (e149737). 
- The wrapped-sentinel predicate reuses the tokenizer's own sentinel-inner patterns; no new emphasis-shape re.compile is introduced. This is a fresh build on the clean base (D-Open-11): the pre-existing regex-driven emphasis layer described in the ADR's Consequences never landed on this branch (it lived on the abandoned R-commits a71ae47 / 817e00d / 5e72a2a), so the design is implemented directly rather than refactored. Live corpus stays at zero diagnostics for both prose linters in --block; the test docstring at test_nested_bold_produces_one_nested_emphasis_diagnostic is reconciled with the amended D5. Co-authored-by: AI Assistant --- scripts/hooks/precommit/_prose_tokens.py | 77 +++++++++-- .../precommit/validate_yaml_prose_subset.py | 126 ++++++++++++++++-- .../tests/test_validate_yaml_prose_subset.py | 15 +-- 3 files changed, 190 insertions(+), 28 deletions(-) diff --git a/scripts/hooks/precommit/_prose_tokens.py b/scripts/hooks/precommit/_prose_tokens.py index 7d149dba..b31f1eeb 100644 --- a/scripts/hooks/precommit/_prose_tokens.py +++ b/scripts/hooks/precommit/_prose_tokens.py @@ -38,7 +38,7 @@ import re from enum import Enum -from typing import NamedTuple +from typing import Literal, NamedTuple class TokenKind(Enum): @@ -89,10 +89,15 @@ class Token(NamedTuple): Attributes: kind: The token's classification (accepting or rejecting). value: The exact substring from the input that this token covers. + shape: Emphasis classification per ADR-028 D3. 'neutral' for every + non-emphasis token. One of 'complete', 'open', 'close', 'neutral'. + Default 'neutral' so two-positional construction Token(kind, value) + still compiles and yields a token with shape='neutral'. 
""" kind: TokenKind value: str + shape: Literal["complete", "open", "close", "neutral"] = "neutral" # --------------------------------------------------------------------------- @@ -177,8 +182,18 @@ class Token(NamedTuple): _RE_ITALIC_ASTERISK = re.compile(r"\*(.+?)\*", re.DOTALL) # Italic underscore: _..._ — single underscore only; __ is NOT italic (ADR-017 D1). -# Lookahead/lookbehind prevent matching when adjacent to another underscore. -_RE_ITALIC_UNDERSCORE = re.compile(r"(? Literal["complete", "open", "close", "neutral"]: + """Classify the emphasis shape of a matched span by examining interior edge whitespace. + + The tokenizer calls this at emission time for BOLD, ITALIC-asterisk, and + ITALIC-underscore tokens. The shape drives the depth-counter walk in the + prose-subset linter (ADR-028 D5). + + Rules (ADR-028 D3 table): + - Both edges whitespace -> 'open' (convention: leading test fires first) + - Trailing whitespace only -> 'open' (greedy close on intended inner open) + - Leading whitespace only -> 'close' (trailing half of an early-closed match) + - Neither edge whitespace -> 'complete' (well-formed span) + - Empty interior -> 'neutral' (defensive; not emitted in practice) + + Uses str.isspace() — no regex (ADR-028 D-Open-6). + + Args: + span: The full matched span including delimiters (e.g. '**foo **'). + delim: The delimiter string ('**', '*', or '_'). + + Returns: + One of 'complete', 'open', 'close', 'neutral'. 
+ """ + interior = span[len(delim) : -len(delim)] + if not interior: + return "neutral" + leading_ws = interior[0].isspace() + trailing_ws = interior[-1].isspace() + if leading_ws and trailing_ws: + return "open" + if leading_ws: + return "close" + if trailing_ws: + return "open" + return "complete" + + # --------------------------------------------------------------------------- # Sentinel helper # --------------------------------------------------------------------------- @@ -263,7 +320,7 @@ def tokenize(text: str) -> list[Token]: The `text` argument is expected to be a single prose field value as decoded by PyYAML — not raw YAML, not a file path. - Test fixtures for all 42 grammar cases live at: + Test fixtures live at: scripts/hooks/tests/fixtures/prose_subset/ Args: @@ -286,11 +343,11 @@ def flush_text(end: int) -> None: tokens.append(Token(TokenKind.TEXT, text[pending_text_start:end])) pending_text_start = -1 - def emit(kind: TokenKind, value: str) -> None: - """Flush any pending TEXT, then emit the given token.""" + def emit(kind: TokenKind, value: str, *, shape: str = "neutral") -> None: + """Flush any pending TEXT, then emit the given token with the given shape.""" nonlocal i flush_text(i) - tokens.append(Token(kind, value)) + tokens.append(Token(kind, value, shape)) i += len(value) def at_line_start() -> bool: @@ -422,21 +479,21 @@ def at_line_start() -> bool: if ch == "*" and i + 1 < len(text) and text[i + 1] == "*": m = _RE_BOLD.match(text, i) if m: - emit(TokenKind.BOLD, m.group()) + emit(TokenKind.BOLD, m.group(), shape=_classify_emphasis_shape(m.group(), "**")) continue # --- Rule 13: Italic asterisk *...* --- if ch == "*": m = _RE_ITALIC_ASTERISK.match(text, i) if m: - emit(TokenKind.ITALIC, m.group()) + emit(TokenKind.ITALIC, m.group(), shape=_classify_emphasis_shape(m.group(), "*")) continue # --- Rule 14: Italic underscore _..._ (single underscore only) --- if ch == "_": m = _RE_ITALIC_UNDERSCORE.match(text, i) if m: - emit(TokenKind.ITALIC, 
m.group()) + emit(TokenKind.ITALIC, m.group(), shape=_classify_emphasis_shape(m.group(), "_")) continue # --- Rule 15: Bare camelCase entity-prefix identifier --- diff --git a/scripts/hooks/precommit/validate_yaml_prose_subset.py b/scripts/hooks/precommit/validate_yaml_prose_subset.py index 09acff0f..2c7ef7e6 100644 --- a/scripts/hooks/precommit/validate_yaml_prose_subset.py +++ b/scripts/hooks/precommit/validate_yaml_prose_subset.py @@ -30,7 +30,20 @@ from precommit._linter_types import Diagnostic, ProseField, format_diagnostic_line # noqa: E402 from precommit._prose_fields import find_prose_fields # noqa: E402 -from precommit._prose_tokens import TokenKind # noqa: E402 +from precommit._prose_tokens import ( # noqa: E402 + _RE_SENTINEL_INTRA_INNER, + _RE_SENTINEL_REF_INNER, + TokenKind, +) + +# Deliberate cross-module coupling: _RE_SENTINEL_INTRA_INNER and +# _RE_SENTINEL_REF_INNER are internal to _prose_tokens (leading-underscore per +# ADR-028 D4). The wrapped-sentinel predicate (ADR-028 D5) reuses them directly +# so the linter's notion of a "sentinel" cannot drift from the tokenizer's own +# classification. They are NOT promoted to public constants — ADR-028 D4 fixes +# the public surface of _prose_tokens at exactly Token, TokenKind, and tokenize(); +# a consumer importing these _RE_* names accepts the reorganization-coupling risk +# that D4 describes. # Re-export so callers can import ProseField and Diagnostic from this module # (the test suite imports both from here, not from _linter_types). @@ -87,6 +100,58 @@ "If you added a new INVALID_* kind to _REJECTED_KINDS, add its reason to _REASONS too." ) +# Reason strings for emphasis violations (ADR-028 D6). These are stable +# constants; any change requires a D6 amendment. +_REASON_NESTED_EMPHASIS = "nested emphasis" +_REASON_EMPHASIS_WRAPPED_SENTINEL = "emphasis-wrapped sentinel" + +# The two emphasis token kinds; used in the depth-counter walk (ADR-028 D5). 
+_EMPHASIS_KINDS: frozenset[TokenKind] = frozenset({TokenKind.BOLD, TokenKind.ITALIC}) + + +def _is_emphasis_wrapped_sentinel(token_value: str, delim: str) -> bool: + """Return True if the emphasis token wraps exactly one sentinel. + + Strips the emphasis delimiter pair from token_value, .strip()s whitespace, + then checks whether the result is a `{{ }}` span whose inner content + fullmatches either the intra-doc or ref sentinel inner regex. + + This mirrors how _match_sentinel classifies sentinels: outer {{ }} are + stripped first, then the inner content is matched against the patterns. + + Args: + token_value: The full emphasis token value including delimiters. + delim: The delimiter string ('**', '*', or '_'). + + Returns: + True if the stripped interior is a well-formed sentinel. + """ + interior = token_value[len(delim) : -len(delim)].strip() + # Interior must be wrapped in {{ }} to be a sentinel form. + if not (interior.startswith("{{") and interior.endswith("}}")): + return False + inner = interior[2:-2] + return bool(_RE_SENTINEL_INTRA_INNER.fullmatch(inner) or _RE_SENTINEL_REF_INNER.fullmatch(inner)) + + +def _delim_for_token(token_value: str) -> str: + """Return the delimiter prefix for an emphasis token value. + + Inspects the leading characters to distinguish '**' (BOLD) from '*' (ITALIC + asterisk) from '_' (ITALIC underscore). + + Args: + token_value: The full token value string. + + Returns: + The delimiter string: '**', '*', or '_'. + """ + if token_value.startswith("**"): + return "**" + if token_value.startswith("*"): + return "*" + return "_" + def check_prose_field(field: ProseField) -> list[Diagnostic]: """Check one ProseField against the ADR-017 D4 grammar rejection rules. @@ -96,6 +161,9 @@ def check_prose_field(field: ProseField) -> list[Diagnostic]: — ADR-017 D4 rule 5 delegates bare-camelCase rejection to validate_prose_references. 
+ Also runs the ADR-028 D5 depth-counter emphasis-rejection walk, emitting + diagnostics for nested emphasis and emphasis-wrapped sentinels. + Args: field: A ProseField with tokens already populated by tokenize(). @@ -103,14 +171,8 @@ def check_prose_field(field: ProseField) -> list[Diagnostic]: List of Diagnostic objects (empty if the field is clean). """ diagnostics: list[Diagnostic] = [] - for token in field.tokens: - if token.kind not in _REJECTED_KINDS: - continue - base_reason = _REASONS[token.kind] - # ADR-017 D4: append the offending token value as a snippet for context. - # Only append when token.value is non-empty (tokenizer guarantees this, - # but guard defensively to avoid "at ''" in edge cases). - reason = f"{base_reason} at {token.value!r}" if token.value else base_reason + + def _emit_diag(reason: str) -> None: diagnostics.append( Diagnostic( hook_id=_HOOK_ID, @@ -122,6 +184,52 @@ def check_prose_field(field: ProseField) -> list[Diagnostic]: nested_index=field.nested_index, ) ) + + # --- INVALID_* token rejection (ADR-017 D4) --- + for token in field.tokens: + if token.kind not in _REJECTED_KINDS: + continue + base_reason = _REASONS[token.kind] + # ADR-017 D4: append the offending token value as a snippet for context. + # Only append when token.value is non-empty (tokenizer guarantees this, + # but guard defensively to avoid "at ''" in edge cases). + reason = f"{base_reason} at {token.value!r}" if token.value else base_reason + _emit_diag(reason) + + # --- ADR-028 D5 depth-counter emphasis walk --- + # Single pass over the token stream with a bare integer depth counter. + # Emphasis tokens with shape='open' increment depth; 'close' decrements. + # Any emphasis token arriving at depth > 0 is a nested-emphasis violation. + # The wrapped-sentinel predicate is independent of depth state. + depth = 0 + for token in field.tokens: + if token.kind not in _EMPHASIS_KINDS: + continue + + # Nested-emphasis predicate (ADR-028 D5). 
+ if token.shape == "open": + if depth > 0: + _emit_diag(f"{_REASON_NESTED_EMPHASIS} at {token.value!r}") + depth += 1 + elif token.shape == "close": + # Check before decrementing: the close token is the one arriving + # at depth > 0 in the canonical [open, text, close] stream. + # This is the authoritative spec from TestNestedEmphasisRejection + # (the ADR D5 pseudocode omits the emit on close, but the test + # requires it — the test is canonical per the handoff note). + if depth > 0: + _emit_diag(f"{_REASON_NESTED_EMPHASIS} at {token.value!r}") + depth = max(0, depth - 1) + elif token.shape == "complete": + if depth > 0: + _emit_diag(f"{_REASON_NESTED_EMPHASIS} at {token.value!r}") + # complete = open + close, net depth change 0 + + # Emphasis-wrapped-sentinel predicate (independent of depth state). + delim = _delim_for_token(token.value) + if _is_emphasis_wrapped_sentinel(token.value, delim): + _emit_diag(f"{_REASON_EMPHASIS_WRAPPED_SENTINEL} at {token.value!r}") + return diagnostics diff --git a/scripts/hooks/tests/test_validate_yaml_prose_subset.py b/scripts/hooks/tests/test_validate_yaml_prose_subset.py index c03a8ec3..c3de38ba 100644 --- a/scripts/hooks/tests/test_validate_yaml_prose_subset.py +++ b/scripts/hooks/tests/test_validate_yaml_prose_subset.py @@ -2159,15 +2159,12 @@ def test_nested_bold_produces_one_nested_emphasis_diagnostic(self): ADR-028 D5: BOLD('**foo **') has shape='open' -> depth 0->1. BOLD('** bar**') has shape='close' and arrives at depth==1 -> nested emphasis. - Implementation note — close branch emits the diagnostic: - The D5 pseudocode's close branch shows only `depth -= 1` with no - emit_diagnostic. However, BOLD('** bar**') is the ONLY token in the - stream [open, text, close] that arrives at depth > 0 (depth==1 before - the decrement). The correct implementation MUST emit the diagnostic in - the close branch when depth > 0 (checking before decrementing). 
- ADR-017 D1 requires nested bold to be rejected; THIS TEST is the - authoritative outcome spec. The snippet in the reason is the close - token's value: "at '** bar**'". + Close-branch emit: ADR-028 D5 (as amended 2026-05-29) requires the + close-branch emit when depth > 0, checked before the decrement; this + test verifies it. BOLD('** bar**') is the only token in the stream + [open, text, close] that arrives at depth > 0 (depth==1 before the + decrement), so the single diagnostic's snippet is the close token's + value: "at '** bar**'". RED: check_prose_field has no emphasis logic yet. """ From 5e79d87178303baa71bf6330f65d22ab5f5b8c92 Mon Sep 17 00:00:00 2001 From: davidlabianca Date: Fri, 29 May 2026 16:25:44 +0000 Subject: [PATCH 6/9] This commit corrects the ADR-028 D5 wrapped-sentinel prose and a stale test docstring D5 named a non-existent unified SENTINEL_INNER_RE 'shared with the references linter'. Reality: two internal _RE_SENTINEL_*_INNER constants imported by the prose-subset linter via documented coupling (sanctioned by D4's internal-_RE_* posture), not used by the references linter (which resolves via _resolve_intra_sentinel). Adds a second D5 Addendum and fixes the matching docstring. Doc-only erratum: impl unchanged, no section 8 lock change, Status stays Accepted. 
Co-authored-by: AI Assistant --- docs/adr/028-prose-linter-bracket-matching-architecture.md | 4 +++- scripts/hooks/tests/test_validate_yaml_prose_subset.py | 3 ++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/docs/adr/028-prose-linter-bracket-matching-architecture.md b/docs/adr/028-prose-linter-bracket-matching-architecture.md index 7bce1971..ccc1c80d 100644 --- a/docs/adr/028-prose-linter-bracket-matching-architecture.md +++ b/docs/adr/028-prose-linter-bracket-matching-architecture.md @@ -134,7 +134,7 @@ for token in field.tokens: The two predicates: - **Nested-emphasis predicate.** Fires when an emphasis token is encountered with `depth > 0` and a same-family emphasis open is unmatched. Implemented as the `depth > 0` check inline in the walk above, applied on the `open`, `complete`, **and** `close` branches; the `close` branch checks `depth > 0` *before* decrementing, and is the attribution point for the canonical split-token nested case (`**foo **nested** bar**` tokenizes as `[open, text, close]`, and the `close` token at `depth == 1` is where the single diagnostic lands). The kind comparison is via `token.kind` directly (a bare counter suffices; a kind-stack is not needed for the current ADR-017 D1 rule, which detects any nesting regardless of delimiter family). -- **Emphasis-wrapped-sentinel predicate.** Fires when an emphasis token's interior (its `value` minus the delimiter pair) `.strip()`s to a string that matches `SENTINEL_INNER_RE` (the unified intra-or-ref regex defined once in `_prose_tokens.py` and shared with the references linter). Independent of the depth state. +- **Emphasis-wrapped-sentinel predicate.** Fires when an emphasis token's interior (its `value` minus the delimiter pair) `.strip()`s to a string that fullmatches either of the tokenizer's two internal sentinel-inner regexes — `_RE_SENTINEL_INTRA_INNER` or `_RE_SENTINEL_REF_INNER` (defined in `_prose_tokens.py`). 
The prose-subset linter imports these `_RE_*` constants directly: a deliberate cross-module coupling permitted by D4 (the `_RE_*` constants are internal and reorganizable) and flagged with an inline coupling comment at the import site. They are **not** shared with the references linter, which resolves sentinels structurally via `_resolve_intra_sentinel` against the id-index rather than by inner-regex match. Independent of the depth state. Both predicates are one-line expressions over `token.shape` (or `token.kind` and `token.value`) and the depth state. There are no regex constants in the linter for shape detection. The whitespace-adjacency heuristic that previously lived in `_RE_*_EARLY_CLOSE` has been moved into the tokenizer's `_classify_emphasis_shape` per D3; the linter cannot see it and cannot drift from it. @@ -144,6 +144,8 @@ Reason strings (`_REASON_NESTED_EMPHASIS`, `_REASON_EMPHASIS_WRAPPED_SENTINEL`) **Addendum (2026-05-29) — erratum.** The original D5 draft omitted the `close`-branch emit and the depth floor from the pseudocode, leaving the pseudocode inconsistent with the Nested-emphasis predicate prose (which has always covered every emphasis token, including `close`-shape) and with planning-inventory §5.2's load-bearing-case analysis of `**foo **nested** bar**` (tokens `[open, text, close]`, where the `close` token at `depth == 1` is the only attribution point). The pseudocode now (1) emits the nested-emphasis diagnostic in the `close` branch when `depth > 0`, checked before the decrement, and (2) floors the decrement with `max(0, depth - 1)` because a standalone leading-space bold classifies as `shape="close"` and would otherwise drive `depth` to `-1` (the prior `# depth-floor enforced by tokenizer invariants` comment was false). 
This is an erratum reconciling the pseudocode with D5's own governing predicate prose; it is **not** a new decision — §5.2 flagged "clarify the predicate" but §8 never created a corresponding locked D-Open decision, so no §8 lock changes. Status remains **Accepted**. +**Addendum (2026-05-29) — second erratum.** The original D5 Emphasis-wrapped-sentinel predicate bullet named a single unified `SENTINEL_INNER_RE` "defined once in `_prose_tokens.py` and shared with the references linter." Both claims were false against the as-built code. No `SENTINEL_INNER_RE` exists; the tokenizer carries two internal constants, `_RE_SENTINEL_INTRA_INNER` and `_RE_SENTINEL_REF_INNER` (`_prose_tokens.py:174-175`), which the prose-subset linter imports directly with an inline coupling comment (`validate_yaml_prose_subset.py:39-45`). That direct import of internal `_RE_*` constants is exactly the coupling D4 sanctions (the `_RE_*` constants are internal and reorganizable), and a single public `SENTINEL_INNER_RE` would have been a 4th public name contradicting D4's "exactly three names" surface. The constants are **not** shared with the references linter, which resolves sentinels structurally via `_resolve_intra_sentinel` (prefix + id-index, dispatched on `token.kind`) rather than by inner-regex match. The predicate bullet above is corrected to describe this reality. This is a doc-accuracy erratum only: the implementation already matches the corrected text, no code changes, no §8 lock changes, and Status remains **Accepted**. + ### D6. 
Diagnostic conformance ADR-017 D4's diagnostic format spec is preserved byte-for-byte: diff --git a/scripts/hooks/tests/test_validate_yaml_prose_subset.py b/scripts/hooks/tests/test_validate_yaml_prose_subset.py index c3de38ba..5dd1dd41 100644 --- a/scripts/hooks/tests/test_validate_yaml_prose_subset.py +++ b/scripts/hooks/tests/test_validate_yaml_prose_subset.py @@ -2259,7 +2259,8 @@ def test_emphasis_wrapped_ref_sentinel_produces_diagnostic(self): The wrapped-sentinel predicate applies to both SENTINEL_INTRA and SENTINEL_REF inner forms — the test strips the delimiter pair and - fullmatches the unified SENTINEL_INNER_RE. + fullmatches either of the tokenizer's two internal regexes, + _RE_SENTINEL_INTRA_INNER or _RE_SENTINEL_REF_INNER. RED until SWE pass. """ field = self._make_field("**{{ref:x}}**") From 455f364ee5bb24c6e9c0fd083eff9df27613f05e Mon Sep 17 00:00:00 2001 From: davidlabianca Date: Fri, 29 May 2026 18:22:50 +0000 Subject: [PATCH 7/9] This commit removes stale RED-phase scaffolding language from the ADR-028 test and comment prose MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The ADR-028 emphasis feature is implemented and green; comment and docstring text that narrated the transient test-first chain state ("RED until …", "after the … pass adds …", class-header "(RED — …)" annotations) is now stale and is reworded to describe what each test guards, in the present tense. - validate_yaml_prose_subset.py: the close-branch comment no longer claims the ADR D5 pseudocode "omits the emit on close" / "the test is canonical" — that predated the D5 amendment (e149737). It now states the close token is the attribution point for the canonical [open, text, close] nested case and cites ADR-028 D5 as amended 2026-05-29. - test_prose_tokens.py + test_validate_yaml_prose_subset.py: ~37 docstring, comment, and assertion-message sites reworded. 
Doc-only — no assertion expression, expected value, or logic changed; the full suite stays at 2576 passed / 6 skipped. Co-authored-by: AI Assistant --- .../precommit/validate_yaml_prose_subset.py | 9 +- scripts/hooks/tests/test_prose_tokens.py | 94 ++++++++----------- .../tests/test_validate_yaml_prose_subset.py | 83 +++++----------- 3 files changed, 67 insertions(+), 119 deletions(-) diff --git a/scripts/hooks/precommit/validate_yaml_prose_subset.py b/scripts/hooks/precommit/validate_yaml_prose_subset.py index 2c7ef7e6..dcab341a 100644 --- a/scripts/hooks/precommit/validate_yaml_prose_subset.py +++ b/scripts/hooks/precommit/validate_yaml_prose_subset.py @@ -213,10 +213,11 @@ def _emit_diag(reason: str) -> None: depth += 1 elif token.shape == "close": # Check before decrementing: the close token is the one arriving - # at depth > 0 in the canonical [open, text, close] stream. - # This is the authoritative spec from TestNestedEmphasisRejection - # (the ADR D5 pseudocode omits the emit on close, but the test - # requires it — the test is canonical per the handoff note). + # at depth > 0 in the canonical [open, text, close] stream + # (e.g. **foo **nested** bar**), so it is the attribution point for + # the single nested-emphasis diagnostic. ADR-028 D5 (as amended + # 2026-05-29) emits in the close branch when depth > 0, before the + # decrement. 
if depth > 0: _emit_diag(f"{_REASON_NESTED_EMPHASIS} at {token.value!r}") depth = max(0, depth - 1) diff --git a/scripts/hooks/tests/test_prose_tokens.py b/scripts/hooks/tests/test_prose_tokens.py index 4e6bb805..8b61e6c8 100644 --- a/scripts/hooks/tests/test_prose_tokens.py +++ b/scripts/hooks/tests/test_prose_tokens.py @@ -1935,12 +1935,11 @@ def test_newline_only(self): # =========================================================================== -# TestTripleAsteriskProbe (Spike S1 — descriptive locks, GREEN now) +# TestTripleAsteriskProbe (Spike S1 — descriptive locks) # =========================================================================== # These tests lock the ground-truth token streams for six triple-asterisk -# inputs. They describe what the CURRENT tokenizer emits (before the SWE -# pass), so they are green now and must remain green after the SWE pass -# (the tokenizer's emphasis emission sites are being extended, not changed). +# inputs. They record the tokenizer's current output (emphasis emission +# sites are extended, not changed). # Feeds into TestEmphasisShapeClassification fixture grounding. # =========================================================================== @@ -1952,8 +1951,7 @@ class TestTripleAsteriskProbe: ADR-028 §9.3 Spike S1 — run empirically before classifier tests are written to ensure the _classify_emphasis_shape helper handles each case consistently. - All assertions are descriptive: they record the observed stream. They are - GREEN against the current tokenizer and must stay GREEN after the SWE pass. + All assertions are descriptive: they record the observed stream. """ def test_triple_star_foo_tokenizes_as_bold_plus_trailing_text(self): @@ -1964,7 +1962,7 @@ def test_triple_star_foo_tokenizes_as_bold_plus_trailing_text(self): The non-greedy _RE_BOLD closes at the first '**' after open, consuming '***foo**' (interior='*foo'). The final lone '*' is TEXT. 
- Shape (post-SWE): the BOLD interior '*foo' has no edge whitespace -> + Shape: the BOLD interior '*foo' has no edge whitespace -> _classify_emphasis_shape yields 'complete'. """ _require_module() @@ -1986,7 +1984,7 @@ def test_four_stars_tokenizes_as_italic_plus_trailing_text(self): which fails (.+?). Rule 13 (italic asterisk) then fires on the lone '*' and _RE_ITALIC_ASTERISK matches '***' (interior='*'). The remaining '*' is TEXT. - Shape (post-SWE): ITALIC interior '*' has no edge whitespace -> 'complete'. + Shape: ITALIC interior '*' has no edge whitespace -> 'complete'. """ _require_module() tokens = tokenize("****") @@ -2006,7 +2004,7 @@ def test_five_star_foo_tokenizes_as_bold_text_bold(self): Given '*****foo*****', opening at i=0, the first '**' close is at i=3 (positions 3-4), consuming '*****' (interior='***'). TEXT('foo'). Then BOLD('*****') again. - Shape (post-SWE): interior '***' has no edge whitespace -> 'complete'. + Shape: interior '***' has no edge whitespace -> 'complete'. """ _require_module() tokens = tokenize("*****foo*****") @@ -2025,7 +2023,7 @@ def test_bold_then_one_star_tokenizes_as_bold_plus_text(self): Then: BOLD('**foo**') + TEXT('*') Standard bold match; trailing lone '*' is TEXT. - Shape (post-SWE): interior 'foo' has no edge whitespace -> 'complete'. + Shape: interior 'foo' has no edge whitespace -> 'complete'. """ _require_module() tokens = tokenize("**foo***") @@ -2044,6 +2042,7 @@ def test_one_star_then_bold_tokenizes_as_single_bold(self): _RE_BOLD opens at '**' (positions 0-1), (.+?) matches '*foo' (leading '*' is inner content), closes at '**' (positions 5-6). Full span is '***foo**'. Interior is '*foo'; no edge whitespace -> 'complete'. + Shape: 'complete'. """ _require_module() tokens = tokenize("***foo**") @@ -2059,7 +2058,7 @@ def test_three_stars_alone_tokenizes_as_italic(self): Bold rule fails (no closing '**' after inner content). Italic-asterisk rule succeeds: _RE_ITALIC_ASTERISK matches '*...*' = '***' (interior='*'). 
- Shape (post-SWE): interior '*' has no edge whitespace -> 'complete'. + Shape: interior '*' has no edge whitespace -> 'complete'. """ _require_module() tokens = tokenize("***") @@ -2069,14 +2068,13 @@ def test_three_stars_alone_tokenizes_as_italic(self): # =========================================================================== -# TestEmphasisShapeClassification (RED — _classify_emphasis_shape not yet) +# TestEmphasisShapeClassification — ADR-028 D3 shape classifier # =========================================================================== # These tests import _classify_emphasis_shape from precommit._prose_tokens. -# The helper does not exist until the SWE pass, so the import block below -# uses the same lazy-guard pattern as the module's top-level _IMPORT_ERROR -# guard: a collection-time import failure sets _CLASSIFY_IMPORT_ERROR and -# each test calls _require_classify() to fail with an assertion error rather -# than a collection crash. +# The import block below uses the same lazy-guard pattern as the module's +# top-level _IMPORT_ERROR guard: a collection-time import failure sets +# _CLASSIFY_IMPORT_ERROR and each test calls _require_classify() to fail +# with a clear message rather than a collection crash. # =========================================================================== _CLASSIFY_IMPORT_ERROR: ImportError | None = None @@ -2091,9 +2089,9 @@ def _require_classify() -> None: """Fail with assertion if _classify_emphasis_shape could not be imported.""" if _CLASSIFY_IMPORT_ERROR is not None: pytest.fail( - f"_classify_emphasis_shape not yet importable from precommit._prose_tokens.\n" + f"_classify_emphasis_shape not importable from precommit._prose_tokens.\n" f"Original error: {_CLASSIFY_IMPORT_ERROR}\n" - "This test is RED until the SWE pass adds the helper." + "This indicates _classify_emphasis_shape is missing from precommit._prose_tokens." 
) @@ -2104,8 +2102,7 @@ class TestEmphasisShapeClassification: ADR-028 D3 shape rules, D-Open-16 (12-14 cases), D-Open-7 (both-edges -> 'open'). Grid: delimiter {'**', '*', '_'} x shape {'complete', 'open', 'close'} - plus edge cases from Spike S1 (triple-asterisk inputs). All tests are RED - until the SWE pass adds _classify_emphasis_shape to _prose_tokens.py. + plus edge cases from Spike S1 (triple-asterisk inputs). No `neutral`-return case is tested because the classifier only runs on matched emphasis spans; non-emphasis tokens carry shape='neutral' by @@ -2279,12 +2276,11 @@ def test_five_star_bold_pure_star_interior_is_complete(self): # =========================================================================== -# TestTokenShapeField (RED — Token.shape field not yet on NamedTuple) +# TestTokenShapeField — ADR-028 D1 Token.shape wire-up # =========================================================================== # ADR-028 D1: Token gains shape: Literal['complete','open','close','neutral'] # as third field with default 'neutral'. These tests call tokenize() and # inspect .shape on the resulting tokens, satisfying ADR-025 D10 wire-up. -# They fail now (AttributeError: Token has no attribute 'shape'). # =========================================================================== @@ -2293,14 +2289,12 @@ class TestTokenShapeField: Wire-up tests: tokenize() emits tokens whose .shape field carries the ADR-028 D3 classification. Satisfies ADR-025 D10 (at least one test that calls tokenize() and reads .shape on the result). - - All tests are RED until the SWE pass adds Token.shape. """ def _assert_has_shape(self, token: object) -> None: """Assert token has a .shape attribute; fail with diagnostic if not.""" assert hasattr(token, "shape"), ( - f"Token {token!r} has no .shape attribute. RED: this test passes after the SWE pass adds Token.shape." + f"Token {token!r} has no .shape attribute (ADR-028 D1 requires Token.shape)." 
) def test_simple_bold_has_complete_shape(self): @@ -2436,12 +2430,11 @@ def test_both_edges_whitespace_bold_has_open_shape(self): # =========================================================================== -# TestIntrawordUnderscoreRejection (RED — D-Open-21, regex not yet tightened) +# TestIntrawordUnderscoreRejection — ADR-028 D3 invariant 3, D-Open-21 # =========================================================================== # ADR-028 D3 invariant 3: intraword \S_\S does NOT qualify as an italic -# delimiter. The current _RE_ITALIC_UNDERSCORE uses only adjacent-underscore -# negative lookahead (so '__' is excluded) but does NOT exclude non-whitespace -# flanking. After the SWE pass tightens the regex, these tests go GREEN. +# delimiter. _RE_ITALIC_UNDERSCORE requires whitespace-or-boundary flanking +# on the opening '_' (left side) and closing '_' (right side). # =========================================================================== @@ -2449,46 +2442,40 @@ class TestIntrawordUnderscoreRejection: r""" Tests for D-Open-21: tightened _RE_ITALIC_UNDERSCORE. - After the SWE pass, the regex requires whitespace-or-boundary flanking on - the opening '_' (left side) and closing '_' (right side). Intraword + _RE_ITALIC_UNDERSCORE requires whitespace-or-boundary flanking on the + opening '_' (left side) and closing '_' (right side). Intraword underscore pairs like 'home_bar and foo_baz' must NOT tokenize as ITALIC. - - Tests that assert NO ITALIC are RED now (current regex produces ITALIC). - Tests that assert ITALIC remains for whitespace-flanked forms are GREEN now. """ def test_intraword_underscore_pair_produces_no_italic(self): """ Given: 'home_bar and foo_baz' - When: tokenize() is called (after SWE tightens _RE_ITALIC_UNDERSCORE) + When: tokenize() is called Then: NO ITALIC token in the stream - Current behavior (RED): tokenizer matches '_bar and foo_' as ITALIC. - Expected behavior (GREEN after SWE): all tokens are TEXT. 
+ _RE_ITALIC_UNDERSCORE requires whitespace-or-boundary flanking; + intraword underscores do not satisfy this — all tokens are TEXT. """ _require_module() tokens = tokenize("home_bar and foo_baz") italic_tokens = [t for t in tokens if t.kind == TokenKind.ITALIC] assert len(italic_tokens) == 0, ( - f"D-Open-21: intraword '_' must NOT produce ITALIC. " - f"Got ITALIC tokens: {italic_tokens!r}. " - "RED until SWE tightens _RE_ITALIC_UNDERSCORE." + f"D-Open-21: intraword '_' must NOT produce ITALIC. Got ITALIC tokens: {italic_tokens!r}." ) def test_adjacent_intraword_underscore_produces_no_italic(self): """ Given: 'a_b_c' (both underscores intraword) - When: tokenize() is called (after SWE tightens _RE_ITALIC_UNDERSCORE) + When: tokenize() is called Then: NO ITALIC token — entire string is TEXT - Current behavior (RED): '_b_' is emitted as ITALIC. + Both underscores have non-whitespace flanking characters; the + tightened regex rejects them as italic delimiters. """ _require_module() tokens = tokenize("a_b_c") italic_tokens = [t for t in tokens if t.kind == TokenKind.ITALIC] - assert len(italic_tokens) == 0, ( - f"D-Open-21: 'a_b_c' must produce no ITALIC. Got: {italic_tokens!r}. RED until SWE tightens regex." - ) + assert len(italic_tokens) == 0, f"D-Open-21: 'a_b_c' must produce no ITALIC. Got: {italic_tokens!r}." def test_whitespace_flanked_underscore_italic_still_tokenizes(self): """ @@ -2496,8 +2483,8 @@ def test_whitespace_flanked_underscore_italic_still_tokenizes(self): When: tokenize() is called Then: ITALIC('_italic_') is still produced - This case is GREEN now and must remain GREEN after the SWE pass. - Whitespace flanking is the canonical permitted form. + Whitespace flanking satisfies _RE_ITALIC_UNDERSCORE's flanking + requirement — this is the canonical accepted form. 
""" _require_module() tokens = tokenize("prefix _italic_ suffix") @@ -2512,7 +2499,7 @@ def test_string_boundary_flanked_underscore_italic_tokenizes(self): Then: ITALIC('_foo_') is produced Start-of-string counts as whitespace-or-boundary flanking per - D-Open-21.2(a) cosai-simpler form. GREEN now and after SWE. + D-Open-21.2(a) cosai-simpler form. """ _require_module() tokens = tokenize("_foo_") @@ -2527,10 +2514,8 @@ def test_end_of_string_close_flanked_underscore_italic_tokenizes(self): Then: ITALIC('_foo_') is produced D-Open-21.2(a): end-of-string after the closing '_' qualifies as a - word boundary and therefore satisfies the whitespace-or-boundary - flanking requirement on the close side. This is GREEN now (the current - tokenizer produces the ITALIC token) and must stay GREEN after the SWE - underscore-tightening pass. + word boundary and satisfies the whitespace-or-boundary flanking + requirement on the close side. Empirically verified (2026-05-29): tokenize('some text _foo_') -> @@ -2551,8 +2536,7 @@ def test_double_underscore_still_produces_no_italic_or_bold(self): When: tokenize() is called Then: no BOLD and no ITALIC token - ADR-017 D1: __bold__ is NOT recognized. This must hold before and - after the SWE pass. GREEN now. + ADR-017 D1: __bold__ is NOT recognized. 
""" _require_module() tokens = tokenize("__double__") diff --git a/scripts/hooks/tests/test_validate_yaml_prose_subset.py b/scripts/hooks/tests/test_validate_yaml_prose_subset.py index 5dd1dd41..0d34e387 100644 --- a/scripts/hooks/tests/test_validate_yaml_prose_subset.py +++ b/scripts/hooks/tests/test_validate_yaml_prose_subset.py @@ -2026,7 +2026,7 @@ def test_flat_array_diagnostic_emits_single_bracket_only(self, tmp_path, capsys) # =========================================================================== -# TestLiveCorpusBaseline (Spike S2 — GREEN now, must stay GREEN after SWE) +# TestLiveCorpusBaseline (Spike S2) # =========================================================================== @@ -2035,10 +2035,8 @@ class TestLiveCorpusBaseline: r""" Spike S2: live-corpus regression baseline for the prose-subset linter. - Asserts that the current linter produces ZERO diagnostics across the four - content YAMLs in --block mode. This test is GREEN now and must remain - GREEN after the SWE pass (the new emphasis-rejection rules must not flag - anything in the corpus — confirmed by the Spike S3 probe before ADR-028 + Guards that the linter produces ZERO diagnostics across the four content + YAMLs in --block mode (confirmed by the Spike S3 probe before ADR-028 was flipped to Accepted). ADR-028 §9.3 Spike S2 — gates Phase 5 regression check. @@ -2065,7 +2063,7 @@ def test_risks_yaml_produces_zero_diagnostics(self): When: validate_yaml_prose_subset --block is run Then: exits 0 (zero diagnostics) - Baseline captured 2026-05-28. Must hold after SWE pass. + Baseline captured 2026-05-28. 
""" result = self._run_block("risks.yaml") assert result.returncode == 0, f"risks.yaml produced diagnostics:\n{result.stderr}" @@ -2103,11 +2101,9 @@ def test_personas_yaml_produces_zero_diagnostics(self): # =========================================================================== -# TestNestedEmphasisRejection (RED — depth-counter linter not yet) +# TestNestedEmphasisRejection — ADR-028 D5 depth-counter linter # =========================================================================== -# These tests assert the ADR-028 D5 depth-counter walk. check_prose_field() -# currently has no emphasis logic, so all these produce zero diagnostics now. -# After the SWE pass they go GREEN. +# These tests assert the ADR-028 D5 depth-counter walk in check_prose_field(). # =========================================================================== @@ -2118,9 +2114,7 @@ class TestNestedEmphasisRejection: Uses the same _make_field() idiom as TestSingleViolationDetection to build synthetic ProseField objects and call check_prose_field() directly. - All tests that assert a diagnostic are RED until the SWE pass adds the - depth-counter walk. Tests that assert zero diagnostics are GREEN now and - must stay GREEN (false-positive guard). + Tests that assert zero diagnostics guard against false positives. """ def _make_field( @@ -2147,7 +2141,7 @@ def _make_field( tokens=tokens, ) - # --- Tests that MUST produce a diagnostic (RED until SWE pass) --- + # --- Tests that MUST produce a diagnostic --- def test_nested_bold_produces_one_nested_emphasis_diagnostic(self): """ @@ -2165,15 +2159,10 @@ def test_nested_bold_produces_one_nested_emphasis_diagnostic(self): [open, text, close] that arrives at depth > 0 (depth==1 before the decrement), so the single diagnostic's snippet is the close token's value: "at '** bar**'". - - RED: check_prose_field has no emphasis logic yet. 
""" field = self._make_field("**foo **nested** bar**") diags = check_prose_field(field) - assert len(diags) == 1, ( - f"Expected 1 'nested emphasis' diagnostic, got {len(diags)}: {diags!r}. " - "RED until SWE adds depth-counter walk." - ) + assert len(diags) == 1, f"Expected 1 'nested emphasis' diagnostic, got {len(diags)}: {diags!r}." assert "nested emphasis" in diags[0].reason, ( f"Expected reason containing 'nested emphasis', got {diags[0].reason!r}" ) @@ -2187,15 +2176,11 @@ def test_nested_italic_produces_one_nested_emphasis_diagnostic(self): When: check_prose_field is called Then: ONE diagnostic with reason containing 'nested emphasis' - Same depth-counter logic for italic-asterisk delimiter. - RED until SWE pass. + Same depth-counter logic for italic-asterisk delimiter as for bold. """ field = self._make_field("*foo *nested* bar*") diags = check_prose_field(field) - assert len(diags) == 1, ( - f"Expected 1 diagnostic for nested italic, got {len(diags)}: {diags!r}. " - "RED until SWE adds depth-counter walk." - ) + assert len(diags) == 1, f"Expected 1 diagnostic for nested italic, got {len(diags)}: {diags!r}." assert "nested emphasis" in diags[0].reason def test_italic_after_open_bold_produces_nested_emphasis_diagnostic(self): @@ -2218,14 +2203,11 @@ def test_italic_after_open_bold_produces_nested_emphasis_diagnostic(self): ('A ') -> shape='open'; ITALIC('*B*') is a complete-shape token that arrives at depth==1 after the open bold. The diagnostic fires on the ITALIC token because it is a complete-emphasis token inside an open span. - - RED until the SWE pass adds the depth-counter walk. """ field = self._make_field("**A ** *B* C**") diags = check_prose_field(field) assert len(diags) == 1, ( - f"Expected 1 'nested emphasis' diagnostic for complete italic at depth>0, " - f"got {len(diags)}: {diags!r}. RED until SWE adds depth-counter walk." + f"Expected 1 'nested emphasis' diagnostic for complete italic at depth>0, got {len(diags)}: {diags!r}." 
) assert "nested emphasis" in diags[0].reason, ( f"Expected reason containing 'nested emphasis', got {diags[0].reason!r}" @@ -2239,14 +2221,10 @@ def test_emphasis_wrapped_sentinel_intra_produces_diagnostic(self): ADR-028 D5: emphasis-wrapped-sentinel predicate fires when emphasis token interior (stripped) fullmatches the sentinel inner regex. - RED until SWE pass. """ field = self._make_field("**{{riskPromptInjection}}**") diags = check_prose_field(field) - assert len(diags) == 1, ( - f"Expected 1 'emphasis-wrapped sentinel' diagnostic, got {len(diags)}: {diags!r}. " - "RED until SWE adds emphasis-wrapped-sentinel predicate." - ) + assert len(diags) == 1, f"Expected 1 'emphasis-wrapped sentinel' diagnostic, got {len(diags)}: {diags!r}." assert "emphasis-wrapped sentinel" in diags[0].reason, ( f"Expected 'emphasis-wrapped sentinel' in reason, got {diags[0].reason!r}" ) @@ -2261,13 +2239,10 @@ def test_emphasis_wrapped_ref_sentinel_produces_diagnostic(self): SENTINEL_REF inner forms — the test strips the delimiter pair and fullmatches the tokenizer's two internal regexes, _RE_SENTINEL_INTRA_INNER and _RE_SENTINEL_REF_INNER. - RED until SWE pass. """ field = self._make_field("**{{ref:x}}**") diags = check_prose_field(field) - assert len(diags) == 1, ( - f"Expected 1 diagnostic for '**{{ref:x}}**', got {len(diags)}: {diags!r}. RED until SWE pass." - ) + assert len(diags) == 1, f"Expected 1 diagnostic for '**{{ref:x}}**', got {len(diags)}: {diags!r}." assert "emphasis-wrapped sentinel" in diags[0].reason def test_emphasis_wrapped_sentinel_with_newlines_produces_diagnostic(self): @@ -2279,16 +2254,13 @@ def test_emphasis_wrapped_sentinel_with_newlines_produces_diagnostic(self): ADR-028 D3: '**\\n**' has both-edges whitespace -> shape='open'. The emphasis-wrapped-sentinel predicate uses .strip() on the interior, so leading/trailing newlines do not prevent detection. - RED until SWE pass. 
""" field = self._make_field("**\n{{ref:x}}\n**") diags = check_prose_field(field) - assert len(diags) == 1, ( - f"Expected 1 diagnostic for newline-wrapped sentinel, got {len(diags)}: {diags!r}. RED until SWE pass." - ) + assert len(diags) == 1, f"Expected 1 diagnostic for newline-wrapped sentinel, got {len(diags)}: {diags!r}." assert "emphasis-wrapped sentinel" in diags[0].reason - # --- Tests that MUST produce ZERO diagnostics (GREEN now, faux-depth guard) --- + # --- Tests that MUST produce ZERO diagnostics (ADR-028 D5 faux-depth guard) --- def test_sibling_complete_bold_spans_produce_zero_diagnostics(self): """ @@ -2298,7 +2270,6 @@ def test_sibling_complete_bold_spans_produce_zero_diagnostics(self): ADR-028 D5 faux-depth guard: the two BOLD tokens have shape='complete' -> depth stays at 0 throughout -> no nested-emphasis diagnostic. - GREEN now and must stay GREEN after SWE pass. """ field = self._make_field("**hello** world **goodbye**") diags = check_prose_field(field) @@ -2315,7 +2286,6 @@ def test_sentinel_at_depth_zero_produces_zero_diagnostics(self): The sentinel is at depth==0 (outside any open emphasis). ADR-028 D5: the emphasis-wrapped-sentinel predicate checks the emphasis token's interior, not any subsequent sentinel at depth 0. - GREEN now and must stay GREEN after SWE pass. """ field = self._make_field("**hello** world {{ref:x}}") diags = check_prose_field(field) @@ -2328,7 +2298,6 @@ def test_clean_bold_produces_zero_diagnostics(self): Then: ZERO diagnostics Simple complete-shape bold — no nesting, no sentinel inside. - GREEN now and must stay GREEN. """ field = self._make_field("**bold**") diags = check_prose_field(field) @@ -2363,7 +2332,6 @@ def test_bold_containing_italic_produces_zero_diagnostics(self): ADR-017 D1: italic inside bold is one permitted nesting level. The tokenizer emits a single BOLD token for this span (italic-in-bold is absorbed atomically). No depth-counter violation. - GREEN now and must stay GREEN. 
""" field = self._make_field("**bold *italic* inside**") diags = check_prose_field(field) @@ -2371,9 +2339,9 @@ def test_bold_containing_italic_produces_zero_diagnostics(self): # =========================================================================== -# TestEmphasisDiagnosticFormat (RED — diagnostic format for new reasons) +# TestEmphasisDiagnosticFormat — ADR-028 D6 diagnostic format locks # =========================================================================== -# Locks the exact diagnostic format strings for the two new reason constants +# Locks the exact diagnostic format strings for the two reason constants: # ADR-028 D6: 'nested emphasis' and 'emphasis-wrapped sentinel', plus the # token-snippet convention ('at <snippet>'). # =========================================================================== @@ -2389,8 +2357,6 @@ class TestEmphasisDiagnosticFormat: The <snippet> for emphasis violations follows the existing 'at '<token.value>'' pattern: the reason string ends with "at '<token.value>'" where token.value is the offending emphasis token's full value (including delimiters). - - All tests are RED until the SWE pass adds the depth-counter walk. """ def _make_field( @@ -2427,11 +2393,10 @@ def test_nested_emphasis_diagnostic_reason_string(self): convention follows the existing INVALID_* pattern: "at '<token.value>'". The offending token is the BOLD('** bar**') (the 'close'-shape token at depth > 0). - RED until SWE pass. """ field = self._make_field("**foo **nested** bar**") diags = check_prose_field(field) - assert len(diags) == 1, f"Expected 1 diagnostic, got {diags!r}. RED until SWE pass." + assert len(diags) == 1, f"Expected 1 diagnostic, got {diags!r}."
reason = diags[0].reason assert reason.startswith("nested emphasis"), f"Reason must start with 'nested emphasis', got {reason!r}" assert "at '** bar**'" in reason, f"Reason must contain \"at '** bar**'\", got {reason!r}" @@ -2443,7 +2408,6 @@ def test_nested_emphasis_format_diagnostic_line(self): Then: output matches ADR-017 D4 format with 'nested emphasis' reason Asserts the full committed format string including hook_id prefix. - RED until SWE pass. """ from precommit._linter_types import format_diagnostic_line # noqa: PLC0415 @@ -2454,7 +2418,7 @@ def test_nested_emphasis_format_diagnostic_line(self): index=0, ) diags = check_prose_field(field) - assert len(diags) == 1, f"Expected 1 diagnostic. RED until SWE pass. Got: {diags!r}" + assert len(diags) == 1, f"Expected 1 diagnostic, got {diags!r}." line = format_diagnostic_line(diags[0]) # Format: validate-yaml-prose-subset: test.yaml:riskAlpha:shortDescription[0]: nested emphasis at '...' assert line.startswith("validate-yaml-prose-subset: "), f"Expected hook_id prefix, got {line!r}" @@ -2471,11 +2435,10 @@ def test_emphasis_wrapped_sentinel_diagnostic_reason_string(self): ADR-028 D6: reason string is 'emphasis-wrapped sentinel'; snippet is the full BOLD token value '**{{riskPromptInjection}}**'. - RED until SWE pass. """ field = self._make_field("**{{riskPromptInjection}}**") diags = check_prose_field(field) - assert len(diags) == 1, f"Expected 1 diagnostic. RED until SWE pass. Got: {diags!r}" + assert len(diags) == 1, f"Expected 1 diagnostic, got {diags!r}." reason = diags[0].reason assert reason.startswith("emphasis-wrapped sentinel"), ( f"Reason must start with 'emphasis-wrapped sentinel', got {reason!r}" @@ -2489,7 +2452,7 @@ def test_emphasis_wrapped_sentinel_format_diagnostic_line_matches_pattern(self): Then: output matches the committed _DIAG_PATTERN regex Verifies the emphasis violation slots into the existing format contract - without modifying the pattern. RED until SWE pass. 
+ without modifying the pattern. """ from precommit._linter_types import format_diagnostic_line # noqa: PLC0415 @@ -2500,7 +2463,7 @@ def test_emphasis_wrapped_sentinel_format_diagnostic_line_matches_pattern(self): index=1, ) diags = check_prose_field(field) - assert len(diags) == 1, f"Expected 1 diagnostic. RED until SWE pass. Got: {diags!r}" + assert len(diags) == 1, f"Expected 1 diagnostic, got {diags!r}." line = format_diagnostic_line(diags[0]) assert _DIAG_PATTERN.match(line), ( f"Emphasis-wrapped-sentinel diagnostic does not match committed pattern: {line!r}" From 15dd2d8b49ec2e1862e2fe43510dbe10f60db926 Mon Sep 17 00:00:00 2001 From: davidlabianca Date: Fri, 29 May 2026 19:33:38 +0000 Subject: [PATCH 8/9] This commit adds the ADR-028 D5 third erratum (greenfield framing + double-emit note) Independent architecture review flagged D5's 'are deleted' / Consequences 'linter shrinks' as false against this clean base: the named helpers never existed here (D-Open-11; they lived on the abandoned feature/353-c4-followons). The Addendum reframes that language as logical supersession of the archived design, not a diff, and documents that the nested-emphasis and wrapped-sentinel predicates are independent and may both fire on one token. Doc-only; no code or section 8 lock change; Status stays Accepted. 
Co-authored-by: AI Assistant --- docs/adr/028-prose-linter-bracket-matching-architecture.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/adr/028-prose-linter-bracket-matching-architecture.md b/docs/adr/028-prose-linter-bracket-matching-architecture.md index ccc1c80d..17115052 100644 --- a/docs/adr/028-prose-linter-bracket-matching-architecture.md +++ b/docs/adr/028-prose-linter-bracket-matching-architecture.md @@ -146,6 +146,8 @@ Reason strings (`_REASON_NESTED_EMPHASIS`, `_REASON_EMPHASIS_WRAPPED_SENTINEL`) **Addendum (2026-05-29) — second erratum.** The original D5 Emphasis-wrapped-sentinel predicate bullet named a single unified `SENTINEL_INNER_RE` "defined once in `_prose_tokens.py` and shared with the references linter." Both claims were false against the as-built code. No `SENTINEL_INNER_RE` exists; the tokenizer carries two internal constants, `_RE_SENTINEL_INTRA_INNER` and `_RE_SENTINEL_REF_INNER` (`_prose_tokens.py:174-175`), which the prose-subset linter imports directly with an inline coupling comment (`validate_yaml_prose_subset.py:39-45`). That direct import of internal `_RE_*` constants is exactly the coupling D4 sanctions (the `_RE_*` constants are internal and reorganizable), and a single public `SENTINEL_INNER_RE` would have been a 4th public name contradicting D4's "exactly three names" surface. The constants are **not** shared with the references linter, which resolves sentinels structurally via `_resolve_intra_sentinel` (prefix + id-index, dispatched on `token.kind`) rather than by inner-regex match. The predicate bullet above is corrected to describe this reality. This is a doc-accuracy erratum only: the implementation already matches the corrected text, no code changes, no §8 lock changes, and Status remains **Accepted**. 
+**Addendum (2026-05-29) — third erratum.** The "are deleted" framing in the D5 prose above (the named helpers `_detect_nested_emphasis_indices` and `_is_emphasis_wrapped_sentinel`, plus the three `_RE_*_EARLY_CLOSE` and three `_RE_*_WRAPPED_SENTINEL` constants) and the matching "linter shrinks" bullet in Consequences describe a delete-and-replace diff that did not occur on this branch. This is a fresh branch off clean `upstream/main = 7320136` per D-Open-11; none of those symbols ever existed on this base. The prior emphasis-rejection layer they describe lived only on the abandoned archive branch `feature/353-c4-followons`. The as-built is therefore a greenfield addition of the depth-walk emphasis enforcement, not a reduction: the "are deleted" / "linter shrinks" language records the logical supersession of that archived design, not a diff against this branch's base. The two predicates are also independent and may both fire on a single emphasis token — a `close`-shape token at `depth > 0` whose stripped interior matches a sentinel-inner regex emits **both** the `nested emphasis` and the `emphasis-wrapped sentinel` diagnostics (e.g. `**foo **{{ref:x}}** bar**`); the wrapped-sentinel predicate runs unconditionally on every emphasis token while the nested-emphasis predicate gates on depth, so the two are orthogonal and the double-emit is intended. This is a doc-accuracy erratum only: no code change, no §8 lock change, and Status remains **Accepted**. + ### D6. 
Diagnostic conformance ADR-017 D4's diagnostic format spec is preserved byte-for-byte: From 121f14e54540d2c7061786508e6fbea53256d368 Mon Sep 17 00:00:00 2001 From: davidlabianca Date: Fri, 29 May 2026 19:33:47 +0000 Subject: [PATCH 9/9] This commit addresses the independent-review MEDIUMs for the ADR-028 prose linter Fail-loud guard in _delim_for_token (raises ValueError on a non-emphasis value instead of silently returning '_'); 6 additive emphasis-shape fixtures under accepting/emphasis_shapes/ honoring D-Open-18 Path 3b (open/close/both-edges/nested/sentinel-wrapped token streams); a characterization test pinning the intended nested+wrapped double-emit; and corrected the stale 'tokenizer/fixture-dir NOT modified' docstring plus the fixture-pair count. pytest 2588/6, ruff clean, both prose linters --block exit 0. Co-authored-by: AI Assistant --- .../precommit/validate_yaml_prose_subset.py | 17 ++++- .../bold_both_edges_whitespace.tokens.json | 1 + .../bold_both_edges_whitespace.txt | 1 + .../emphasis_shapes/bold_close.tokens.json | 1 + .../accepting/emphasis_shapes/bold_close.txt | 1 + .../emphasis_shapes/bold_open.tokens.json | 1 + .../accepting/emphasis_shapes/bold_open.txt | 1 + .../bold_wraps_sentinel_nested.tokens.json | 1 + .../bold_wraps_sentinel_nested.txt | 1 + .../italic_asterisk_open.tokens.json | 1 + .../emphasis_shapes/italic_asterisk_open.txt | 1 + .../nested_bold_three_token.tokens.json | 1 + .../nested_bold_three_token.txt | 1 + scripts/hooks/tests/test_prose_tokens.py | 45 +++++++++++- .../tests/test_validate_yaml_prose_subset.py | 69 ++++++++++++++++++- 15 files changed, 137 insertions(+), 6 deletions(-) create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_both_edges_whitespace.tokens.json create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_both_edges_whitespace.txt create mode 100644 
scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_close.tokens.json create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_close.txt create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_open.tokens.json create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_open.txt create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_wraps_sentinel_nested.tokens.json create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_wraps_sentinel_nested.txt create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/italic_asterisk_open.tokens.json create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/italic_asterisk_open.txt create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/nested_bold_three_token.tokens.json create mode 100644 scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/nested_bold_three_token.txt diff --git a/scripts/hooks/precommit/validate_yaml_prose_subset.py b/scripts/hooks/precommit/validate_yaml_prose_subset.py index dcab341a..34ad1bb5 100644 --- a/scripts/hooks/precommit/validate_yaml_prose_subset.py +++ b/scripts/hooks/precommit/validate_yaml_prose_subset.py @@ -138,19 +138,30 @@ def _delim_for_token(token_value: str) -> str: """Return the delimiter prefix for an emphasis token value. Inspects the leading characters to distinguish '**' (BOLD) from '*' (ITALIC - asterisk) from '_' (ITALIC underscore). + asterisk) from '_' (ITALIC underscore). Called only on BOLD/ITALIC tokens, + whose values always start with one of those delimiters. Args: - token_value: The full token value string. + token_value: The full token value string (a BOLD or ITALIC token). Returns: The delimiter string: '**', '*', or '_'. 
+ + Raises: + ValueError: if token_value does not start with '**', '*', or '_'. The + helper fails loud rather than guessing a delimiter, so a future + emphasis kind that reaches it with an unhandled delimiter surfaces + immediately instead of silently mis-slicing the token interior. """ if token_value.startswith("**"): return "**" if token_value.startswith("*"): return "*" - return "_" + if token_value.startswith("_"): + return "_" + raise ValueError( + f"_delim_for_token expects a BOLD/ITALIC token value starting with '**', '*', or '_'; got {token_value!r}" + ) def check_prose_field(field: ProseField) -> list[Diagnostic]: diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_both_edges_whitespace.tokens.json b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_both_edges_whitespace.tokens.json new file mode 100644 index 00000000..d6056767 --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_both_edges_whitespace.tokens.json @@ -0,0 +1 @@ +[{"kind": "BOLD", "value": "** foo **"}] diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_both_edges_whitespace.txt b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_both_edges_whitespace.txt new file mode 100644 index 00000000..1619c8c3 --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_both_edges_whitespace.txt @@ -0,0 +1 @@ +** foo ** \ No newline at end of file diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_close.tokens.json b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_close.tokens.json new file mode 100644 index 00000000..47427de5 --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_close.tokens.json @@ -0,0 +1 @@ +[{"kind": "BOLD", "value": "** bar**"}] diff --git 
a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_close.txt b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_close.txt new file mode 100644 index 00000000..0aa56ec1 --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_close.txt @@ -0,0 +1 @@ +** bar** \ No newline at end of file diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_open.tokens.json b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_open.tokens.json new file mode 100644 index 00000000..e53ba84e --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_open.tokens.json @@ -0,0 +1 @@ +[{"kind": "BOLD", "value": "**foo **"}] diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_open.txt b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_open.txt new file mode 100644 index 00000000..e7134b12 --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_open.txt @@ -0,0 +1 @@ +**foo ** \ No newline at end of file diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_wraps_sentinel_nested.tokens.json b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_wraps_sentinel_nested.tokens.json new file mode 100644 index 00000000..e05ff617 --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_wraps_sentinel_nested.tokens.json @@ -0,0 +1 @@ +[{"kind": "BOLD", "value": "**x **"}, {"kind": "TEXT", "value": "y"}, {"kind": "BOLD", "value": "**{{ref:x}}**"}] diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_wraps_sentinel_nested.txt b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_wraps_sentinel_nested.txt new file mode 100644 index 00000000..9328e3d4 --- /dev/null +++ 
b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/bold_wraps_sentinel_nested.txt @@ -0,0 +1 @@ +**x **y**{{ref:x}}** \ No newline at end of file diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/italic_asterisk_open.tokens.json b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/italic_asterisk_open.tokens.json new file mode 100644 index 00000000..4711774f --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/italic_asterisk_open.tokens.json @@ -0,0 +1 @@ +[{"kind": "ITALIC", "value": "*foo *"}] diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/italic_asterisk_open.txt b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/italic_asterisk_open.txt new file mode 100644 index 00000000..c9628810 --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/italic_asterisk_open.txt @@ -0,0 +1 @@ +*foo * \ No newline at end of file diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/nested_bold_three_token.tokens.json b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/nested_bold_three_token.tokens.json new file mode 100644 index 00000000..6951e13f --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/nested_bold_three_token.tokens.json @@ -0,0 +1 @@ +[{"kind": "BOLD", "value": "**foo **"}, {"kind": "TEXT", "value": "nested"}, {"kind": "BOLD", "value": "** bar**"}] diff --git a/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/nested_bold_three_token.txt b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/nested_bold_three_token.txt new file mode 100644 index 00000000..60fdb6f2 --- /dev/null +++ b/scripts/hooks/tests/fixtures/prose_subset/accepting/emphasis_shapes/nested_bold_three_token.txt @@ -0,0 +1 @@ +**foo **nested** bar** \ No newline at end of file diff --git 
a/scripts/hooks/tests/test_prose_tokens.py b/scripts/hooks/tests/test_prose_tokens.py index 8b61e6c8..e4cf5712 100644 --- a/scripts/hooks/tests/test_prose_tokens.py +++ b/scripts/hooks/tests/test_prose_tokens.py @@ -87,8 +87,9 @@ Test Summary ============ -Total fixture-parametrized pairs: 55 +Total fixture-parametrized pairs: 62 - accepting/: 7 fixture pairs (inc. double_underscore_not_bold) +- accepting/emphasis_shapes/: 6 fixture pairs (ADR-028 D-Open-18, Path 3b) - sentinels/: 7 fixture pairs - rejecting/: 16 fixture pairs (existing) + 14 new URL fixture pairs (commit 5) - folded_bullets/: 2 fixture pairs @@ -345,6 +346,23 @@ def test_tokenize_returns_list(self): "accepting/double_underscore_not_bold", ] +# ADR-028 D-Open-18 (Path 3b): additive emphasis-shape fixtures. These lock the +# tokenizer's token STREAM (kind + value) for the open / close / both-edges / +# nested emphasis inputs the shape classifier (ADR-028 D3) tags. The fixture +# projection drops `shape` per D-Open-20, so shape itself is asserted directly +# in the classifier tests; these fixtures lock the underlying greedy-match +# stream the classifier and the depth-counter linter depend on (e.g. that +# `**foo **nested** bar**` splits into [BOLD, TEXT, BOLD], the precondition for +# the nested-emphasis diagnostic). 
+_EMPHASIS_SHAPE_FIXTURES = [ + "accepting/emphasis_shapes/bold_open", + "accepting/emphasis_shapes/bold_close", + "accepting/emphasis_shapes/bold_both_edges_whitespace", + "accepting/emphasis_shapes/italic_asterisk_open", + "accepting/emphasis_shapes/nested_bold_three_token", + "accepting/emphasis_shapes/bold_wraps_sentinel_nested", +] + _SENTINEL_FIXTURES = [ "sentinels/intra_risk", "sentinels/intra_control", @@ -457,6 +475,31 @@ def test_no_invalid_tokens(self, fixture_path: str): assert token.kind not in invalid_kinds, f"Fixture {fixture_path!r}: unexpected INVALID token {token!r}" +class TestEmphasisShapeFixtures: + """ + Verify the token stream for emphasis-shape inputs (ADR-028 D-Open-18, Path 3b). + + Given: an emphasis input exercising open / close / both-edges / nested shapes + When: tokenize() is called + Then: the stream matches the fixture's .tokens.json exactly (kind + value). + + These fixtures lock the greedy-match behaviour the shape classifier and the + depth-counter linter depend on; `shape` is dropped by the fixture projection + (D-Open-20) and asserted directly in the classifier tests. 
+ """ + + @pytest.mark.parametrize("fixture_path", _EMPHASIS_SHAPE_FIXTURES) + def test_emphasis_shape_token_stream(self, fixture_path: str): + """ + Given: input from accepting/emphasis_shapes/.txt + When: tokenize() is called + Then: output matches the fixture's .tokens.json exactly (kind + value) + """ + input_text, expected = _load_fixture_pair(fixture_path) + result = _tokens_to_dicts(tokenize(input_text)) + assert result == expected, f"Fixture {fixture_path!r}: expected {expected!r}, got {result!r}" + + class TestSentinels: """ Verify sentinel tokenisation for both intra-document ({{riskXxx}}, {{controlXxx}}, diff --git a/scripts/hooks/tests/test_validate_yaml_prose_subset.py b/scripts/hooks/tests/test_validate_yaml_prose_subset.py index 0d34e387..cb0cae01 100644 --- a/scripts/hooks/tests/test_validate_yaml_prose_subset.py +++ b/scripts/hooks/tests/test_validate_yaml_prose_subset.py @@ -43,8 +43,10 @@ — reference_violations/ : (used by the references linter only) — schemas/ : minimal mock schemas for introspection tests -The tokenizer (_prose_tokens.py, locked at 25e3d22) is NOT modified. -The prose_subset/ fixture directory is NOT modified. +The tokenizer (_prose_tokens.py) is extended on this branch per ADR-028 D1/D3 +(the Token.shape field, _classify_emphasis_shape, and the tightened +_RE_ITALIC_UNDERSCORE). The prose_subset/ fixture directory gains the +accepting/emphasis_shapes/ pairs (ADR-028 D-Open-18, Path 3b). Test Coverage ============= @@ -93,6 +95,7 @@ from validate_yaml_prose_subset import ( # noqa: E402 Diagnostic, ProseField, + _delim_for_token, check_prose_field, find_prose_fields, main, @@ -104,6 +107,7 @@ # Stub names so module-level references do not raise NameError at load time. 
Diagnostic = None # type: ignore[assignment,misc] ProseField = None # type: ignore[assignment,misc] + _delim_for_token = None # type: ignore[assignment] check_prose_field = None # type: ignore[assignment] find_prose_fields = None # type: ignore[assignment] main = None # type: ignore[assignment] @@ -2337,6 +2341,27 @@ def test_bold_containing_italic_produces_zero_diagnostics(self): diags = check_prose_field(field) assert len(diags) == 0, f"Bold-with-italic-inside must produce 0 diagnostics. Got: {diags!r}" + def test_nested_bold_wrapping_sentinel_emits_both_diagnostics(self): + """ + Given: '**x **y**{{ref:x}}**' -> [BOLD(open), TEXT, BOLD(complete '**{{ref:x}}**')] + When: check_prose_field is called + Then: the trailing BOLD token emits BOTH a 'nested emphasis' AND an + 'emphasis-wrapped sentinel' diagnostic (two diagnostics total). + + ADR-028 D5 (Addendum 2026-05-29, third erratum): the nested-emphasis and + wrapped-sentinel predicates are independent and may both fire on a single + emphasis token. The complete token '**{{ref:x}}**' arrives at depth > 0 + (nested, via the preceding open '**x **') and its stripped interior is a + sentinel (wrapped). This pins the intended double-emit so a future change + cannot silently collapse it to one diagnostic. + """ + field = self._make_field("**x **y**{{ref:x}}**") + diags = check_prose_field(field) + reasons = [d.reason for d in diags] + assert len(diags) == 2, f"Expected exactly 2 diagnostics (nested + wrapped). 
Got: {diags!r}" + assert "nested emphasis at '**{{ref:x}}**'" in reasons, reasons + assert "emphasis-wrapped sentinel at '**{{ref:x}}**'" in reasons, reasons + # =========================================================================== # TestEmphasisDiagnosticFormat — ADR-028 D6 diagnostic format locks @@ -2468,3 +2493,43 @@ def test_emphasis_wrapped_sentinel_format_diagnostic_line_matches_pattern(self): assert _DIAG_PATTERN.match(line), ( f"Emphasis-wrapped-sentinel diagnostic does not match committed pattern: {line!r}" ) + + +# =========================================================================== +# TestDelimForTokenGuard — _delim_for_token fail-loud contract +# =========================================================================== +# _delim_for_token is called only on BOLD/ITALIC tokens, so its input always +# starts with '**', '*', or '_'. It must fail loud on any other value rather +# than silently returning a wrong delimiter (which would make +# _is_emphasis_wrapped_sentinel slice the wrong interior). +# =========================================================================== + + +class TestDelimForTokenGuard: + """Tests for the _delim_for_token delimiter-dispatch helper (ADR-028 D5).""" + + def test_bold_delimiter(self): + """'**...**' values return the two-character bold delimiter.""" + assert _delim_for_token("**foo**") == "**" + + def test_italic_asterisk_delimiter(self): + """'*...*' values return the single asterisk delimiter.""" + assert _delim_for_token("*foo*") == "*" + + def test_italic_underscore_delimiter(self): + """'_..._' values return the underscore delimiter.""" + assert _delim_for_token("_foo_") == "_" + + def test_unrecognized_value_raises(self): + """ + A value that is not a BOLD/ITALIC token (no '**', '*', or '_' prefix) + must raise rather than silently returning a delimiter. Guards against a + future emphasis kind reaching the helper with an unhandled delimiter. 
+ """ + with pytest.raises(ValueError): + _delim_for_token("plain text") + + def test_empty_value_raises(self): + """An empty string is not a valid emphasis token value and must raise.""" + with pytest.raises(ValueError): + _delim_for_token("")
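
Reviewer note (illustrative only, not part of the patch): the tests in this series pin three behaviours of the ADR-028 D5 pass: a depth counter driven by the token `shape` field, a wrapped-sentinel predicate that runs independently of depth, and a delimiter helper that raises instead of guessing. The sketch below is one way those pieces could fit together, written only from what the docstrings above describe. `SketchToken`, `SENTINEL_INNER`, `delim_for`, and `walk_emphasis` are hypothetical names; the regex is a stand-in for the tokenizer's two internal sentinel-inner constants, whose real patterns are not shown in this series; and the exact depth-update rule is an assumption.

```python
import re
from typing import NamedTuple


class SketchToken(NamedTuple):
    """Hypothetical stand-in for the tokenizer's Token (kind, value, shape)."""

    kind: str        # "BOLD", "ITALIC", "TEXT", ...
    value: str       # full token text, delimiters included
    shape: str = ""  # "open", "close", or "complete"; empty for non-emphasis


# Assumed pattern: stands in for _RE_SENTINEL_INTRA_INNER / _RE_SENTINEL_REF_INNER.
SENTINEL_INNER = re.compile(r"\{\{(?:ref:[\w-]+|(?:risk|control)\w+)\}\}")


def delim_for(value: str) -> str:
    """Return '**', '*', or '_'; fail loud on anything else."""
    for delim in ("**", "*", "_"):  # '**' must be tried before '*'
        if value.startswith(delim):
            return delim
    raise ValueError(f"not an emphasis token value: {value!r}")


def walk_emphasis(tokens: list[SketchToken]) -> list[str]:
    """Collect reason strings from a single depth-counter pass."""
    reasons: list[str] = []
    depth = 0
    for tok in tokens:
        if tok.kind not in ("BOLD", "ITALIC"):
            continue
        # Nested-emphasis predicate: gated on depth (assumed to fire on
        # close- and complete-shape tokens that arrive inside an open span).
        if depth > 0 and tok.shape in ("close", "complete"):
            reasons.append(f"nested emphasis at '{tok.value}'")
        # Wrapped-sentinel predicate: runs on every emphasis token.
        delim = delim_for(tok.value)
        interior = tok.value[len(delim):-len(delim)].strip()
        if SENTINEL_INNER.fullmatch(interior):
            reasons.append(f"emphasis-wrapped sentinel at '{tok.value}'")
        # Depth bookkeeping happens after both predicates have looked.
        if tok.shape == "open":
            depth += 1
        elif tok.shape == "close":
            depth = max(depth - 1, 0)
    return reasons
```

Feeding it the `[BOLD(open), TEXT, BOLD(complete)]` stream from the `bold_wraps_sentinel_nested` fixture yields both reasons for the trailing token, which is the double-emit the third erratum documents; sibling complete-shape spans never move `depth` off zero and yield nothing.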