Test Suite

Unit tests for the jhcontext-crewai infrastructure layer. These tests verify the storage backends, local mode switching, and domain ontology validation — they do not run CrewAI agents or call LLMs.

Running Tests

# From the project root
.venv/bin/python -m pytest tests/ --tb=short

# With coverage
.venv/bin/python -m pytest tests/ --cov=api --cov=agent --tb=short

# Single file
.venv/bin/python -m pytest tests/test_sqlite_storage.py -v

Requirements: pip install -e ".[dev]" (installs pytest, pytest-cov, moto).

Test Files

test_sqlite_storage.py — Storage Backend (13 tests)

Tests the SQLite storage implementation that powers local development mode.

TestSQLiteStorage (8 tests):

Test	What it verifies
`test_save_and_get_envelope`	Round-trip: build envelope → save → retrieve → fields match
`test_get_nonexistent_envelope`	Returns `None` for unknown context_id (no crash)
`test_list_envelopes`	Filtering by `scope` returns correct subset
`test_list_envelopes_filter_by_risk`	Filtering by `risk_level` (high/low) works
`test_save_and_get_prov_graph`	PROV graph serialized as Turtle → saved → retrieved intact
`test_get_nonexistent_prov`	Returns `None` for unknown PROV graph
`test_save_and_get_decision`	Decision object with outcome JSON → save → retrieve → fields match
`test_save_and_get_artifact`	Binary content + metadata → save to filesystem → retrieve both
`test_envelope_overwrite`	`INSERT OR REPLACE` updates existing envelope (same context_id)

TestSQLitePIIVault (5 tests):

Test	What it verifies
`test_store_and_retrieve`	PII token round-trip (email → token → retrieve)
`test_retrieve_nonexistent`	Returns `None` for unknown token
`test_retrieve_by_context`	Retrieves all PII tokens for a given context_id
`test_purge_by_context`	Deletes all tokens for a context, leaves others intact
`test_purge_expired`	Deletes tokens older than a cutoff timestamp (GDPR retention)

Reading results: All tests use tmp_path fixtures — each test gets a fresh SQLite database in a temp directory. Failures here mean the storage layer has a regression. Since DynamoDB implements the same StorageBackend protocol with the same 9 methods, a SQLite failure likely indicates a bug that would also affect production.

test_app_local_mode.py — Mode Switching (3 tests)

Tests that the JHCONTEXT_LOCAL environment variable correctly switches between SQLite and DynamoDB backends in the Chalice app.

Test	What it verifies
`test_local_mode_uses_sqlite`	`JHCONTEXT_LOCAL=1` → `get_storage()` returns `SQLiteStorage`, `get_pii_vault()` returns `SQLitePIIVault`
`test_default_mode_is_not_local`	Without env var → local mode is `False` (DynamoDB would be used)
`test_full_roundtrip`	End-to-end: create envelope → save PROV graph → save decision → retrieve all → verify fields

Reading results: test_local_mode_uses_sqlite is skipped (s) if Chalice is not installed — this is expected in agent-only dev environments. The roundtrip test exercises the complete flow: EnvelopeBuilder → SQLiteStorage → PROVGraph → Decision, which is the same sequence the real flows execute.

test_offline_layer.py — Offline Queue + Sync Manager (4 tests)

Tests the offline protocol layer (agent/protocol/offline_queue.py + sync_manager.py) used by the offline healthcare scenarios. No CrewAI, no LLM, no API key required.

Test	What it verifies
`test_clean_drain_chain_verified`	Three envelopes enqueued offline with correct predecessor-hash chain → all drain cleanly when uplink returns
`test_tamper_detected`	Envelope whose stored `content_hash` disagrees with re-computed hash is marked `tampered`, not submitted upstream
`test_chain_broken_detected`	Envelope referencing a wrong predecessor hash is marked `chain_broken`, not submitted upstream
`test_late_flag_when_drain_window_is_days_later`	Envelope drained more than the threshold (default 6 h) after queueing is flagged `late`

test_offline_flow_e2e.py — End-to-End Offline Flow (2 tests)

Exercises the full OfflineContextMixin → OfflineQueue → SyncManager chain as a production offline-healthcare flow would, but without running any CrewAI agent.

Test	What it verifies
`test_three_handoff_chain_drains_cleanly`	Drives the mixin through three simulated handoffs (physio → triage → allocation) and verifies all three drain with intact predecessor-hash chain
`test_tamper_is_detected_on_drain`	After the flow enqueues, mutate the stored envelope_json directly in SQLite → drain correctly flags only that envelope as tampered

test_ontologies.py — Domain Ontology Validation (15 tests)

Tests the UserML semantic payload structure and domain-specific predicate vocabularies.

TestHealthcareOntology (4 tests):

Test	What it verifies
`test_predicates_defined`	Healthcare predicates exist in all layers (observation, interpretation, situation)
`test_sample_healthcare_is_valid`	Sample payload passes `validate_semantic_payload()` with zero violations
`test_healthcare_observations`	Helper builds observation triples with correct predicates (demographic, lab_result, imaging_finding)
`test_healthcare_payload_structure`	Full UserML payload has `@model: "UserML"` + `layers` dict

TestEducationOntology (3 tests):

Test	What it verifies
`test_predicates_defined`	Education predicates (word_count, argument_quality, grade_assigned)
`test_sample_education_is_valid`	Sample payload validates cleanly
`test_education_interpretations`	Helper builds interpretation triples with default confidence

TestRecommendationOntology (2 tests):

Test	What it verifies
`test_predicates_defined`	Recommendation predicates (browse_event, category_affinity, active_shopper)
`test_sample_recommendation_is_valid`	Sample payload validates cleanly

TestValidator (6 tests):

Test	What it verifies
`test_valid_payload`	Known-good payload → `(True, [])`
`test_missing_model`	Missing `@model` key → violation reported
`test_invalid_predicate`	Unknown predicate → violation with predicate name in message
`test_missing_predicate_key`	Triple without `predicate` key → "missing 'predicate'" violation
`test_non_dict_payload`	String input → invalid (type check)
`test_missing_layers`	Missing `layers` key → violation reported

Reading results: Ontology test failures mean either:

A predicate was renamed/removed in agent/ontologies/*.py without updating the sample
The validator logic changed (e.g., new required field)
The UserML schema structure changed in the jhcontext SDK

These tests are the compile-time equivalent for the semantic_conformance check that runs at validation time. If ontology tests pass but semantic_conformance fails in a run, the problem is the LLM output format, not the ontology definitions.

Test Output Guide

tests/test_app_local_mode.py s..                    [  8%]
tests/test_ontologies.py ...............            [ 52%]
tests/test_sqlite_storage.py ................      [100%]

======================== 33 passed, 1 skipped ========================

Symbol	Meaning
`.`	Test passed
`s`	Test skipped (missing optional dependency like Chalice)
`F`	Test failed — assertion error (see traceback)
`E`	Test errored — unexpected exception (import error, missing fixture)

A healthy run shows 33 passed, 1 skipped for the infrastructure suite, plus 6 passed for the offline layer (test_offline_layer.py + test_offline_flow_e2e.py). The skip is test_local_mode_uses_sqlite when Chalice is not installed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test Suite

Running Tests

Test Files

test_sqlite_storage.py — Storage Backend (13 tests)

test_app_local_mode.py — Mode Switching (3 tests)

test_offline_layer.py — Offline Queue + Sync Manager (4 tests)

test_offline_flow_e2e.py — End-to-End Offline Flow (2 tests)

test_ontologies.py — Domain Ontology Validation (15 tests)

Test Output Guide

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

Test Suite

Running Tests

Test Files

test_sqlite_storage.py — Storage Backend (13 tests)

test_app_local_mode.py — Mode Switching (3 tests)

test_offline_layer.py — Offline Queue + Sync Manager (4 tests)

test_offline_flow_e2e.py — End-to-End Offline Flow (2 tests)

test_ontologies.py — Domain Ontology Validation (15 tests)

Test Output Guide