Thank you for your interest in contributing to phi-redactor! This project helps healthcare organizations safely use LLMs without risking PHI exposure, and every contribution makes that mission stronger.
- Code of Conduct
- Getting Started
- Development Setup
- Making Changes
- Pull Request Process
- Code Standards
- Testing
- Reporting Issues
- Security Vulnerabilities
This project follows the Contributor Covenant Code of Conduct. By participating, you are expected to uphold this code.
- Fork the repository on GitHub
- Clone your fork locally
- Create a branch for your changes
- Make your changes with tests
- Submit a pull request
# Clone your fork
git clone https://github.com/YOUR_USERNAME/phi-redactor.git
cd phi-redactor
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install in development mode with dev dependencies
pip install -e ".[dev]"
# Download the spaCy model
python -m spacy download en_core_web_lg
# Verify everything works
pytest
Prerequisites:
- Python 3.11+
- Git
Use descriptive branch names:
- `feat/add-azure-adapter`: new features
- `fix/vault-session-cleanup`: bug fixes
- `docs/improve-quickstart`: documentation
- `test/add-streaming-tests`: test additions
Follow Conventional Commits:
feat(detection): add custom recognizer for DEA numbers
fix(vault): prevent FK constraint error on session creation
test(proxy): add streaming round-trip tests for Anthropic
docs: update API endpoint table in README
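The commit format above can be checked mechanically. The following is a minimal sketch, not part of phi-redactor: the function name `is_conventional` and the `ALLOWED_TYPES` list are assumptions for illustration, and the project may accept a different set of types.

```python
import re

# Assumed set of commit types; adjust to whatever the project actually accepts.
ALLOWED_TYPES = ("feat", "fix", "docs", "test", "refactor", "chore")

# type, optional "(scope)", optional "!" for breaking changes, then ": description"
_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<description>\S.*)$"
)


def is_conventional(subject: str) -> bool:
    """Return True if a commit subject line matches the assumed format."""
    match = _PATTERN.match(subject)
    return match is not None and match.group("type") in ALLOWED_TYPES
```

A check like this could run in a local commit hook, so malformed subjects are caught before a pull request is opened.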
- Update tests — all new features need tests; bug fixes need regression tests
- Run the full test suite: `pytest` must pass
- Run linting: `ruff check src/ tests/` must pass
- Run formatting: `ruff format src/ tests/`
- Update documentation if your change affects the public API or CLI
- Keep PRs focused — one feature or fix per PR; avoid unrelated changes
- Fill out the PR template — describe what changed and why
- All CI checks pass (tests on Python 3.11, 3.12, 3.13)
- No decrease in test coverage for changed files
- Code follows existing patterns and conventions
- PHI safety invariants are preserved (see below)
These are non-negotiable — any PR that violates these will be rejected:
- PHI must never be logged — use the PHI-safe log formatter
- PHI must never leave the local machine unredacted — fail-safe: block, never leak
- Vault entries must be encrypted at rest — Fernet encryption required
- Audit trail must be append-only — hash-chain integrity must be preserved
- Sessions must be isolated — no cross-session data leakage
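To make the append-only, hash-chain invariant concrete, here is a minimal sketch of how such a chain can work in general. It is not phi-redactor's audit implementation: `append_entry`, `verify_chain`, the entry layout, and `GENESIS_HASH` are all hypothetical names chosen for illustration. Note that the events themselves carry no PHI, only category labels.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON (sorted keys) so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], event: dict) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    )


def verify_chain(chain: list[dict]) -> bool:
    """Return True only if every entry still links to its predecessor."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the previous one, editing or deleting any earlier entry makes verification fail from that point on. This is why a change that rewrites existing audit records, rather than appending new ones, breaks the invariant.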
- Formatter: `ruff format`
- Linter: `ruff check`
- Type hints: Use type annotations for all public functions
- Docstrings: Required for public APIs; use Google style
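As an illustration of the type-hint and docstring conventions above, here is a small function written in that style. `mask_middle` is a hypothetical helper invented for this example and is not part of phi-redactor's API.

```python
def mask_middle(value: str, visible: int = 2, fill: str = "*") -> str:
    """Mask the middle of a string, keeping a few characters at each end.

    Args:
        value: The text to mask.
        visible: Number of characters to leave visible at each end.
        fill: Single character used for the masked positions.

    Returns:
        The masked string, the same length as ``value``. Values too short
        to keep ``visible`` characters at both ends are masked entirely.

    Raises:
        ValueError: If ``visible`` is negative or ``fill`` is not exactly
            one character.
    """
    if visible < 0 or len(fill) != 1:
        raise ValueError("visible must be >= 0 and fill must be one character")
    if visible == 0 or len(value) <= 2 * visible:
        return fill * len(value)
    return value[:visible] + fill * (len(value) - 2 * visible) + value[-visible:]
```

The `Args`, `Returns`, and `Raises` sections follow the Google docstring layout; the annotations in the signature mean the types do not need to be repeated in the docstring.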
src/phi_redactor/
├── detection/ # PHI detection engine and recognizers
├── masking/ # Semantic masking and identity generation
├── vault/ # Encrypted storage for PHI mappings
├── proxy/ # FastAPI reverse proxy and adapters
├── audit/ # Tamper-evident audit trail
├── cli/ # Click-based CLI commands
├── dashboard/ # Real-time monitoring web UI
├── plugins/ # Plugin loader and examples
├── config.py # Configuration management
└── models.py # Shared data models
# Run all tests
pytest
# Run with verbose output
pytest -v
# Run a specific test file
pytest tests/test_detection.py
# Run tests matching a pattern
pytest -k "test_ssn"
- Place tests in the `tests/` directory
- Name test files `test_*.py`
- Use `pytest` fixtures for shared setup
- Use `pytest-asyncio` for async tests
- Use `pytest-httpx` for HTTP mocking
- Test both happy paths and error cases
- For PHI detection tests, include realistic but synthetic examples
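A test following these guidelines might look like the sketch below. `find_ssn_candidates` is a hypothetical regex-based stand-in so the example is self-contained; phi-redactor's real detection API may look quite different. The sample value uses area number 000, which is never assigned to a real Social Security number, so it is realistic in shape but clearly synthetic.

```python
import re

# Hypothetical stand-in for a detector; the project's real detection API may differ.
_SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_candidates(text: str) -> list[str]:
    """Return substrings shaped like a US Social Security number."""
    return _SSN_PATTERN.findall(text)


# Synthetic example: area number 000 is never assigned to a real SSN.
SYNTHETIC_NOTE = "Patient reports SSN 000-12-3456 on intake form."


def test_ssn_detected_in_clinical_note():
    # Happy path: the synthetic SSN is found.
    assert find_ssn_candidates(SYNTHETIC_NOTE) == ["000-12-3456"]


def test_ssn_not_detected_in_phone_number():
    # Error case: a similar-looking number must not be flagged.
    assert find_ssn_candidates("Call 555-0100 ext. 12") == []
```

Both test names contain `test_ssn`, so the `pytest -k "test_ssn"` pattern shown earlier would select them.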
Use the bug report template and include:
- Python version and OS
- Steps to reproduce
- Expected vs actual behavior
- Relevant logs (ensure no real PHI is included!)
Use the feature request template and describe:
- The problem you're trying to solve
- Your proposed solution
- Alternatives you've considered
Do NOT open a public issue for security vulnerabilities.
See SECURITY.md for responsible disclosure instructions.
By contributing, you agree that your contributions will be licensed under the Apache License 2.0.