This directory contains all configuration files for the immunization pipeline. Each file has a specific purpose and is used at different stages of the pipeline.
- Data Flow Through Configuration Files
- Required Configuration Files
- QR Code Configuration
- PDF Validation Configuration
- PDF Encryption Configuration
- 🏷️ Template Field Reference
- Adding New Configurations
Raw Input (from CSV/Excel)
↓
[preprocess.py]
├─ disease_normalization.json → normalize variants
├─ vaccine_reference.json → expand vaccines to diseases
├─ parameters.yaml.chart_diseases_header → filter diseases not in chart → "Other"
└─ Emit artifact with filtered disease names
↓
Artifact JSON (canonical English disease names, filtered by chart config)
↓
[generate_notices.py]
├─ parameters.yaml.chart_diseases_header → load chart disease list
├─ translations/{lang}_diseases_chart.json → translate each disease name
├─ translations/{lang}_diseases_overdue.json → translate vaccines_due list
└─ Inject translated diseases into Typst template
↓
Typst Files (with localized, filtered disease names)
↓
[compile_notices.py]
└─ Generate PDFs
↓
[validate_pdfs.py]
└─ Validate PDFs (page counts, layout markers) and emit validation JSON

parameters.yaml
Purpose: Pipeline behavior configuration (feature flags, settings, and chart disease filtering)
Usage:
- QR code generation settings
- PDF encryption settings
- Batching configuration
- Date controls for data freshness and eligibility logic
- Chart disease selection via chart_diseases_header (CRITICAL)
These are the most commonly adjusted options in parameters.yaml:
- qr.enabled: Enable or disable QR code generation (true/false)
- encryption.enabled: Enable or disable PDF encryption (true/false)
- bundling.bundle_size: Enable bundling with at most N clients per bundle (0 disables bundling)
- bundling.group_by: Bundle grouping strategy (null for sequential, school, or board)
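
As a rough illustration of how bundling.bundle_size and bundling.group_by could interact, the sketch below chunks clients into bundles. The helper name make_bundles and the dict-shaped client records are assumptions made for this example; the pipeline's real bundling code may differ.

```python
from itertools import groupby


def make_bundles(clients, bundle_size, group_by=None):
    """Split clients into bundles of at most bundle_size (hypothetical helper)."""
    if bundle_size <= 0:
        return []  # 0 disables bundling
    if group_by is None:
        groups = [list(clients)]  # sequential: keep the original order
    else:
        # Sort by the grouping key ("school" or "board") so each group is contiguous.
        ordered = sorted(clients, key=lambda c: c[group_by])
        groups = [list(g) for _, g in groupby(ordered, key=lambda c: c[group_by])]
    bundles = []
    for group in groups:
        for start in range(0, len(group), bundle_size):
            bundles.append(group[start:start + bundle_size])
    return bundles


clients = [
    {"client_id": "1", "school": "A"},
    {"client_id": "2", "school": "B"},
    {"client_id": "3", "school": "A"},
]
print(make_bundles(clients, bundle_size=2))                     # sequential
print(make_bundles(clients, bundle_size=2, group_by="school"))  # one group per school
```

With group_by set, a bundle never mixes clients from different groups, so the last bundle of each group may hold fewer than bundle_size clients.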
The pipeline has two lifecycle phases controlled under pipeline.*:
Before Run (pipeline.before_run):
- clear_output_directory: When true, removes all output except logs before starting a new run. Preserves the logs directory for audit trail. Set to true for clean re-runs; false to prompt before deleting.
After Run (pipeline.after_run):
- remove_artifacts: When true, removes the output/artifacts directory (QR codes, Typst files). Use this to reclaim disk space after successful compilation and validation.
- remove_unencrypted_pdfs: When true and either encryption OR batching is enabled, removes non-encrypted PDFs from output/pdf_individual/ after encryption/batching completes. When both encryption and batching are disabled, individual non-encrypted PDFs are assumed to be the final output and are preserved regardless of this setting.
- date_data_cutoff (ISO 8601 string) records when the source data was extracted. It renders in notices using the client's language via Babel so that readers see a localized calendar date. Change this only when regenerating notices from a fresher extract.
- date_notice_delivery (ISO 8601 string) fixes the reference point for age-based eligibility checks and QR payloads. Preprocessing uses this value to decide if a client is 16 or older, so adjust it cautiously and keep it aligned with the actual delivery or mailing date.
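
To see why date_notice_delivery needs care, here is a minimal sketch of an age check against a reference date. The function name age_on and the sample dates are assumptions for illustration only; the actual preprocessing logic may be implemented differently.

```python
from datetime import date


def age_on(date_of_birth: date, reference: date) -> int:
    """Whole years elapsed between date_of_birth and reference."""
    years = reference.year - date_of_birth.year
    # Subtract one if the birthday has not yet occurred in the reference year.
    if (reference.month, reference.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


date_notice_delivery = date.fromisoformat("2025-04-08")  # example value only
dob = date.fromisoformat("2009-04-09")                   # turns 16 one day later
print(age_on(dob, date_notice_delivery) >= 16)
```

Shifting the delivery date by a single day can move a client across the age-16 threshold, which is why the value should match the real mailing date.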
chart_diseases_header Configuration:
This list defines which diseases appear as columns in the immunization chart:
chart_diseases_header:
- Diphtheria
- Tetanus
- Pertussis
- Polio
- Hib
- Pneumococcal
- Rotavirus
- Measles
- Mumps
- Rubella
- Meningococcal
- Varicella
- Other

Disease Filtering and "Other" Category:
- During Preprocessing (preprocess.py):
  - Diseases from vaccine records are checked against chart_diseases_header
  - Diseases not in the list are collapsed into "Other"
  - This ensures only configured diseases appear as separate columns
- During Notice Generation (generate_notices.py):
  - Each disease name in chart_diseases_header is translated to the target language
  - Translations come from translations/{lang}_diseases_chart.json
  - Translated list is passed to Typst template
  - The template renders column headers using Python-translated names, not raw config values
Impact:
- Chart columns only show diseases in this list
- Unplanned/unexpected diseases are grouped under "Other"
- All column headers are properly localized before template rendering
- No runtime lookups needed in Typst; translations applied in Python
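
The two stages above can be sketched as follows. This is a simplified illustration, not the pipeline's code: the function names are hypothetical, the French labels are sample values, and falling back to the canonical name when a translation is missing is an assumption of this sketch.

```python
def collapse_to_chart(diseases, chart_diseases_header):
    """Replace any disease that is not a chart column with "Other"."""
    allowed = set(chart_diseases_header)
    return [d if d in allowed else "Other" for d in diseases]


def translate_headers(chart_diseases_header, translations):
    """Translate column headers; missing keys fall back to the canonical name (assumption)."""
    return [translations.get(name, name) for name in chart_diseases_header]


chart = ["Measles", "Polio", "Other"]
record_diseases = ["Measles", "HPV", "Polio"]
fr_chart_labels = {"Measles": "Rougeole", "Polio": "Polio", "Other": "Autre"}  # sample values

print(collapse_to_chart(record_diseases, chart))
print(translate_headers(chart, fr_chart_labels))
```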

vaccine_reference.json
Purpose: Maps vaccine codes to the diseases they protect against (canonical disease names)
Format:
{
  "VACCINE_CODE": ["Disease1", "Disease2", ...],
  ...
}
Usage:
- Loaded in orchestrator.py step 2 (preprocessing)
- Used in preprocess.py:
  - enrich_grouped_records() expands vaccine codes to disease names
  - Maps received vaccine records to canonical disease names
- All disease names MUST be canonical (English) forms
Example:
{
  "DTaP": ["Diphtheria", "Tetanus", "Pertussis"],
  "IPV": ["Polio"],
  "MMR": ["Measles", "Mumps", "Rubella"]
}
Canonical diseases (must match these exactly):
- Diphtheria
- HPV
- Hepatitis B
- Hib
- Measles
- Meningococcal
- Mumps
- Pertussis
- Pneumococcal
- Polio
- Rotavirus
- Rubella
- Tetanus
- Varicella
- Other
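
A minimal sketch of how a vaccine reference map might be applied and checked against the canonical list. The helper names expand_vaccines and noncanonical_entries are hypothetical, and ignoring unknown vaccine codes is an assumption of this example rather than documented pipeline behavior.

```python
CANONICAL_DISEASES = {
    "Diphtheria", "HPV", "Hepatitis B", "Hib", "Measles", "Meningococcal",
    "Mumps", "Pertussis", "Pneumococcal", "Polio", "Rotavirus", "Rubella",
    "Tetanus", "Varicella", "Other",
}

vaccine_reference = {
    "DTaP": ["Diphtheria", "Tetanus", "Pertussis"],
    "IPV": ["Polio"],
    "MMR": ["Measles", "Mumps", "Rubella"],
}


def expand_vaccines(codes, reference):
    """Expand vaccine codes to disease names, keeping first-seen order."""
    diseases = []
    for code in codes:
        for disease in reference.get(code, []):  # unknown codes add nothing (assumption)
            if disease not in diseases:
                diseases.append(disease)
    return diseases


def noncanonical_entries(reference):
    """Return {vaccine_code: [names not in the canonical list]} for a quick sanity check."""
    problems = {}
    for code, diseases in reference.items():
        bad = [d for d in diseases if d not in CANONICAL_DISEASES]
        if bad:
            problems[code] = bad
    return problems


print(expand_vaccines(["DTaP", "IPV"], vaccine_reference))
print(noncanonical_entries(vaccine_reference))
```

Because matching is exact, a lowercase entry such as "polio" would be reported by noncanonical_entries.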

disease_normalization.json
Purpose: Normalizes raw input disease strings to canonical disease names
Format:
{
  "raw_input_variant": "canonical_disease_name",
  ...
}
Usage:
- Loaded in pipeline/translation_helpers.py
- Used by normalize_disease() in preprocessing
- Handles input variants that differ from canonical names
- If a variant is not in this map, the input is returned unchanged (may still map via other mechanisms)
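
The lookup-with-fallback behavior described above can be sketched in a few lines. This is a simplified stand-in for normalize_disease(), not its actual implementation; in particular, the two-argument signature is an assumption made for this example.

```python
def normalize_disease(raw, normalization_map):
    """Return the canonical name for raw, or raw unchanged if no mapping exists."""
    return normalization_map.get(raw, raw)


normalization_map = {
    "Poliomyelitis": "Polio",
    "Human papilloma virus infection": "HPV",
}

print(normalize_disease("Poliomyelitis", normalization_map))  # mapped variant
print(normalize_disease("Measles", normalization_map))        # already canonical, unchanged
```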
Example:
{
  "Poliomyelitis": "Polio",
  "Human papilloma virus infection": "HPV",
  "Haemophilus influenzae infection, invasive": "Hib"
}

translations/*.json
Purpose: Stores language-specific translations of disease names for display
Structure:
translations/
├── en_diseases_overdue.json # English labels for overdue vaccines list
├── fr_diseases_overdue.json # French labels for overdue vaccines list
├── en_diseases_chart.json # English labels for immunization chart
└── fr_diseases_chart.json # French labels for immunization chart
Format (same for all translation files):
{
  "canonical_disease_name": "display_label",
  ...
}
Usage:
- Loaded in pipeline/translation_helpers.py
- Used by display_label() when rendering notices
- Two domains:
  - diseases_overdue: Labels for the "vaccines due" section
  - diseases_chart: Labels for the immunization history table
- Different labels possible per domain (e.g., "Polio" vs "Poliomyelitis" in chart)
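
To illustrate the per-domain lookup, the sketch below keeps labels in memory, keyed by language and domain. It is a stand-in for display_label() with an assumed signature; the in-memory LABELS table, the translation_filename helper, and the fallback to the canonical name are all assumptions of this example.

```python
# Hypothetical in-memory stand-in for the loaded translation files.
LABELS = {
    ("en", "diseases_overdue"): {"Polio": "Polio"},
    ("en", "diseases_chart"): {"Polio": "Poliomyelitis"},
}


def translation_filename(lang, domain):
    """Build the file path that follows the naming scheme shown above."""
    return f"translations/{lang}_{domain}.json"


def display_label(lang, domain, canonical_name):
    """Look up a display label; fall back to the canonical name (assumption)."""
    labels = LABELS.get((lang, domain), {})
    return labels.get(canonical_name, canonical_name)


print(translation_filename("fr", "diseases_chart"))
print(display_label("en", "diseases_overdue", "Polio"))
print(display_label("en", "diseases_chart", "Polio"))
```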
Example:
{
  "Polio": "Polio",
  "Measles": "Measles",
  "Diphtheria": "Diphtheria"
}

Both QR code payloads and PDF password generation use centralized template field validation through the TemplateField enum (see pipeline/enums.py). This ensures consistent, safe placeholder handling across all template rendering steps.
| Field | Format | Example | Notes |
|---|---|---|---|
| client_id | String | 12345 | Unique client identifier |
| first_name | String | John | Client's given name |
| last_name | String | Doe | Client's family name |
| name | String | John Doe | Full name (auto-combined) |
| date_of_birth | Localized date | Jan 1, 2020 or 1 janvier 2020 | Formatted per language |
| date_of_birth_iso | ISO 8601 | 2020-01-01 | YYYY-MM-DD format |
| date_of_birth_iso_compact | Compact ISO | 20200101 | YYYYMMDD format (no hyphens) |
| school | String | Lincoln School | School name |
| board | String | TDSB | School board name |
| street_address | String | 123 Main St | Full street address |
| city | String | Toronto | City/municipality |
| province | String | ON | Province/territory |
| postal_code | String | M5V 3A8 | Postal/ZIP code |
| language_code | String | en or fr | ISO 639-1 language code |
All template placeholders are validated at runtime:
- ✅ Placeholders must exist in the generated context
- ✅ Placeholders must be in the allowed field list (no typos like {client_ID})
- ✅ Invalid placeholders raise clear error messages with allowed fields listed
This prevents silent failures from configuration typos and ensures templates are correct before processing.
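
One way such validation could work is sketched below using the standard library's string.Formatter. This is an illustrative approximation, not the TemplateField implementation in pipeline/enums.py; the ALLOWED_FIELDS set and the render_template name are assumptions for this example.

```python
from string import Formatter

# Field names taken from the table above; the real list lives in the TemplateField enum.
ALLOWED_FIELDS = {
    "client_id", "first_name", "last_name", "name", "date_of_birth",
    "date_of_birth_iso", "date_of_birth_iso_compact", "school", "board",
    "street_address", "city", "province", "postal_code", "language_code",
}


def render_template(template, context):
    """Validate placeholders, then render (hypothetical helper)."""
    # Formatter().parse yields (literal, field_name, format_spec, conversion) tuples.
    fields = {name for _, name, _, _ in Formatter().parse(template) if name}
    unknown = fields - ALLOWED_FIELDS
    if unknown:
        raise ValueError(
            f"Unknown placeholder(s) {sorted(unknown)}; allowed: {sorted(ALLOWED_FIELDS)}"
        )
    missing = fields - set(context)
    if missing:
        raise ValueError(f"Placeholder(s) missing from context: {sorted(missing)}")
    return template.format(**context)


context = {"client_id": "12345", "language_code": "en"}
print(render_template("id={client_id}&lang={language_code}", context))

try:
    render_template("id={client_ID}", context)  # typo: wrong case
except ValueError as exc:
    print(exc)
```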
QR code generation can be enabled/disabled in config/parameters.yaml under the qr section. The payload supports flexible templating using client metadata as placeholders.
Refer to the Template Field Reference for the complete list of supported placeholders.
Example override in config/parameters.yaml:
qr:
  enabled: true
  payload_template: https://www.test-immunization.ca/update?client_id={client_id}&dob={date_of_birth_iso}&lang={language_code}
Tip:
- Use {date_of_birth_iso} or {date_of_birth_iso_compact} for predictable date formats
- The delivery date available to templates is date_notice_delivery
After updating the configuration, rerun the pipeline; the regenerated notices will reflect the new QR payload.
The PDF validation step runs after compilation to enforce basic quality rules and surface layout issues. Configuration lives under pdf_validation in config/parameters.yaml.
Supported severity levels per rule:
- disabled: skip the check
- warn: include in summary but do not halt pipeline
- error: fail the pipeline if any PDFs violate the rule
Current rules:
- envelope_window_1_125: Ensure contact area does not exceed 1.125 inches
- exactly_two_pages: Ensure each notice has exactly 2 pages (notice + immunization record)
- signature_overflow: Detect if the signature block spills onto page 2 (uses invisible Typst marker)
Example configuration:
pdf_validation:
  rules:
    envelope_window_1_125: error
    exactly_two_pages: warn
    signature_overflow: disabled
Behavior:
- The validation summary is always printed to the console.
- A JSON report is written to output/metadata/<lang>_validation_<run_id>.json with per-PDF results and aggregates.
- If any rule is set to error and fails, the pipeline stops with a clear error message listing failing rules and counts.
- The validation logic is implemented in pipeline/validate_pdfs.py and invoked by the orchestrator.
- The validation uses invisible markers embedded by the Typst templates to detect signature placement without affecting appearance.
PDF encryption can be customized in config/parameters.yaml under the encryption section. Passwords are built via the same placeholder templating used for QR payloads.
Refer to the Template Field Reference for the complete list of supported placeholders.
Common strategies:
- Simple: {date_of_birth_iso_compact} – DOB only
- Compound: {client_id}{date_of_birth_iso_compact} – ID + DOB
- Formatted: {client_id}-{date_of_birth_iso} – hyphenated
Sample configurations in config/parameters.yaml:
encryption:
  enabled: false
  password:
    template: "{date_of_birth_iso_compact}"
  # Or combine fields:
  # password:
  #   template: "{client_id}{date_of_birth_iso_compact}"
  # Or hyphenate:
  # password:
  #   template: "{client_id}-{date_of_birth_iso}"

All templates are validated at runtime to catch configuration errors early and provide clear, allowed-field guidance.
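
For reference, the sketch below renders the three password strategies with plain str.format and the sample values from the Template Field Reference. It only illustrates the resulting strings; the pipeline's own rendering and validation code is not shown here.

```python
# Sample values taken from the Template Field Reference table.
context = {
    "client_id": "12345",
    "date_of_birth_iso": "2020-01-01",
    "date_of_birth_iso_compact": "20200101",
}

templates = {
    "simple": "{date_of_birth_iso_compact}",
    "compound": "{client_id}{date_of_birth_iso_compact}",
    "formatted": "{client_id}-{date_of_birth_iso}",
}

passwords = {name: template.format(**context) for name, template in templates.items()}
for name, password in passwords.items():
    print(f"{name}: {password}")
```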

To add a new disease:
- Update vaccine_reference.json:
  - Add vaccine code mapping if needed
  - Ensure all diseases use canonical names
- Update all translation files (required):
  - translations/en_diseases_overdue.json
  - translations/fr_diseases_overdue.json
  - translations/en_diseases_chart.json
  - translations/fr_diseases_chart.json
- Update disease_normalization.json (if needed):
  - Add any input variants that map to this disease
- Test:
  uv run pytest tests/unit/test_translation_helpers.py::TestMultiLanguageSupport -v
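
Before running the tests, a quick consistency check along these lines can catch a disease that was added to one file but not the others. The helper missing_translations is hypothetical and not part of the pipeline; the sample file contents are illustrative.

```python
def missing_translations(diseases, translation_files):
    """Return {filename: [diseases without a label]} for files that have gaps.

    translation_files maps a file name to its parsed JSON content
    (canonical disease name -> display label).
    """
    gaps = {}
    for filename, labels in translation_files.items():
        missing = sorted(set(diseases) - set(labels))
        if missing:
            gaps[filename] = missing
    return gaps


translation_files = {
    "en_diseases_chart.json": {"Polio": "Polio", "HPV": "HPV"},
    "fr_diseases_chart.json": {"Polio": "Polio"},  # HPV not yet translated
}
print(missing_translations(["Polio", "HPV"], translation_files))
```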

To add a new language:
- Extend Language enum in pipeline/enums.py
- Create translation files:
  - translations/{lang}_diseases_overdue.json
  - translations/{lang}_diseases_chart.json
- Populate translations:
  - Copy English content
  - Translate all disease names to target language
- Test:
  uv run pytest -m "not e2e"