This repository was archived by the owner on Apr 29, 2026. It is now read-only.

Add typed exception hierarchy and regression-corpus scaffolding #17

Merged
Tatarinho merged 1 commit into main from typed-errors-and-corpus-scaffolding on Apr 26, 2026

Conversation

@Tatarinho
Owner

Typed exception hierarchy under llm_safe_pl.errors:

  • LlmSafeError (base, subclass of Exception)
  • MappingError, InputSizeError (also subclass ValueError, so existing except ValueError code keeps catching them)
  • DetectorError (also subclasses RuntimeError; the constructor takes only detector_name, never the input text or a cause, so PII cannot leak into stack traces)

Mapping.from_dict / from_json and the Shield input-size guard now raise the typed classes instead of bare ValueError. Constructor-time argument validation (Shield(max_input_bytes=-1), duplicate detector names) keeps raising plain ValueError.
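
The hierarchy described above could look roughly like the following. This is a minimal sketch reconstructed from the description, not the contents of llm_safe_pl/errors.py: the class names and base classes come from this PR, while the docstrings, the DetectorError message text, and the legacy_caller helper are assumptions made for illustration.

```python
class LlmSafeError(Exception):
    """Base class for all library-specific errors."""


class MappingError(LlmSafeError, ValueError):
    """Invalid mapping data; still caught by `except ValueError`."""


class InputSizeError(LlmSafeError, ValueError):
    """Input exceeds the size guard; still caught by `except ValueError`."""


class DetectorError(LlmSafeError, RuntimeError):
    """A detector failed. Only the detector name is stored, never input text."""

    def __init__(self, detector_name: str) -> None:
        # Message wording is an assumption; it deliberately contains no input text.
        super().__init__(f"detector {detector_name!r} failed")
        self.detector_name = detector_name


def legacy_caller() -> str:
    """Hypothetical pre-existing caller that only knows about ValueError."""
    try:
        raise InputSizeError("input exceeds max_input_bytes")
    except ValueError as exc:
        # The typed error is still caught by the old handler.
        return type(exc).__name__
```

Dual inheritance is what keeps the change backward compatible: new code can catch LlmSafeError, while older handlers written against ValueError or RuntimeError keep working unchanged.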

Regression-corpus scaffolding under tests/corpora/:

  • pl_pii_positive/ with three labeled samples (PESEL, email, IBAN)
  • pl_pii_negative/ with two safe-text samples
  • tests/test_corpus.py discovers .txt/.json pairs at collection time and asserts that the default Shield matches every labeled span in positive samples and finds no matches in negative samples
  • CONTRIBUTING.md gains an "Adding to the regression corpus" section with the JSON schema and offset rules
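
The pair discovery and span check described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in tests/test_corpus.py: the helper names discover_pairs and missing_spans are hypothetical, and spans are represented here as (start, end) offset tuples, which may not match the JSON schema documented in CONTRIBUTING.md.

```python
from pathlib import Path


def discover_pairs(corpus_dir: Path) -> list[tuple[Path, Path]]:
    """Return sorted (text, labels) path pairs found in corpus_dir.

    A .txt file without a sibling .json file is skipped.
    """
    pairs = []
    for txt in sorted(corpus_dir.glob("*.txt")):
        labels = txt.with_suffix(".json")
        if labels.exists():
            pairs.append((txt, labels))
    return pairs


def missing_spans(
    expected: list[tuple[int, int]], found: list[tuple[int, int]]
) -> list[tuple[int, int]]:
    """Return the labeled (start, end) spans absent from the detector output."""
    found_set = set(found)
    return [span for span in expected if span not in found_set]
```

In a pytest suite, the list returned by a helper like discover_pairs could be passed to pytest.mark.parametrize so that each sample becomes its own test case at collection time. A positive sample would then assert that missing_spans returns an empty list, and a negative sample would assert that the detector output is empty.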

345 tests pass, coverage 96.50%, ruff/mypy clean.

@Tatarinho Tatarinho merged commit 2babf5c into main Apr 26, 2026
7 checks passed
Tatarinho pushed a commit that referenced this pull request Apr 29, 2026
…folding

Resolves the CHANGELOG.md conflict between main's [Unreleased] entries
(typed-exception hierarchy in llm_safe_pl.errors and tests/corpora/
regression scaffolding from PR #17) and the v0.2.1 deprecation tombstone
that was branched directly from v0.2.0.

CHANGELOG resolution: kept the v0.2.1 tombstone entry verbatim and
replaced main's [Unreleased] block with an end-of-life note explaining
that the typed-exception code is preserved in the source tree but
will not ship to PyPI (v0.2.1 is final). Equivalent functionality lives
in pii-veil.errors and pii-core.

src/llm_safe_pl/__init__.py: auto-merged cleanly. The post-merge file
imports the typed exception classes (LlmSafeError, MappingError,
InputSizeError, DetectorError) AND fires the v0.2.1 DeprecationWarning
on import. Anyone building from the archived repo source gets both;
PyPI users on 0.2.1 only get the deprecation warning.
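
For readers unfamiliar with import-time deprecation notices, the pattern can be sketched as below. This is an assumption-laden illustration, not the code in src/llm_safe_pl/__init__.py: the helper name warn_end_of_life and the message wording are hypothetical, and the real module presumably issues the warning at module level rather than through a function.

```python
import warnings


def warn_end_of_life(package: str, successor: str) -> None:
    """Emit a DeprecationWarning attributed to the caller's line."""
    # Hypothetical message text, invented for illustration.
    warnings.warn(
        f"{package} is deprecated and no longer maintained; consider {successor}.",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not this helper
    )
```

Python's default filters ignore DeprecationWarning unless it is triggered by code in `__main__`, so library users often only see such a warning when running with `-W default` or under a test runner that enables warnings.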

Other files came in cleanly from main: src/llm_safe_pl/errors.py (new),
docs/errors.md (new), tests/test_errors.py (new), tests/test_corpus.py
(new), tests/corpora/ fixtures (new), and the typed-error refactors in
models.py / shield.py / test_public_api.py / test_security_hardening.py.

Local CI gates green on the merged tree:
  ruff check .             -> All checks passed
  ruff format --check .    -> 60 files already formatted
  mypy                     -> no issues found in 24 source files
  pytest -q                -> 347 passed, 96.51% coverage