Test Coverage Gap Analysis
Module: Microck/traccia
Task: test-gap
Severity: Medium — 4,130 lines of untested source code across 12 modules
Summary
Only 5 of 17 source modules (29%) have dedicated test files. The remaining 12 modules, totaling 4,130 lines of code, have none, although some of them (cli.py, for example) are exercised indirectly by integration tests. The existing test suite (62 tests across 8 files) covers parsing, pipeline orchestration, models, source detection, and the CLI init commands well, but leaves critical business logic without direct tests.
Coverage Map
| Source Module | Lines | Has Tests | Public Symbols | Risk |
| --- | --- | --- | --- | --- |
| rendering.py | 930 | ✗ | render_project, export_obsidian, ascii_tree, mermaid_tree | HIGH: largest untested module, output correctness is critical |
| storage.py | 517 | ✗ | Storage class with 20+ methods | HIGH: data persistence layer, all CRUD untested |
| llm.py | 506 | ✗ | extract_evidence, canonicalize, score_skill, healthcheck | HIGH: core AI pipeline, no backend contract tests |
| family_normalizer.py | 449 | ✗ | normalize_family_content, rendered_text | MEDIUM: HTML normalization logic |
| cli.py | 432 | ✗ | 26 CLI commands (init, lint, doctor, add, ingest, etc.) | MEDIUM: partially tested via test_init.py integration tests |
| bootstrap.py | 390 | ✗ | RepoInitializer, initialize | MEDIUM: project setup logic |
| document_normalizer.py | 287 | ✗ | normalize_document | MEDIUM: document processing |
| extraction.py | 192 | ✗ | extract_evidence | MEDIUM: evidence extraction |
| pipeline_support.py | 163 | ✗ | support_score, should_create_node, build_skill_node | MEDIUM: skill graph construction |
| taxonomy.py | 123 | ✗ | DomainDefinition, SkillDefinition, match_skill_names | LOW: schema definitions |
| config.py | 96 | ✗ | TracciaPaths, ThresholdConfig, load_config | LOW: config loading |
| utils.py | 45 | ✗ | slugify, short_hash, file_sha256, skill_id | LOW: utility functions |
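A map like the one above can be regenerated with a short script. The sketch below assumes that a module counts as having a "dedicated test file" only when `tests/test_<module>.py` exists. That naming convention, the function name `find_untested_modules`, and the directory layout are assumptions made for illustration, not details confirmed by the report.

```python
from pathlib import Path


def find_untested_modules(src_dir: Path, tests_dir: Path) -> list[tuple[str, int]]:
    """Return (module filename, line count) for modules without tests/test_<name>.py."""
    gaps = []
    for module in sorted(src_dir.glob("*.py")):
        if module.name == "__init__.py":
            continue
        # Assumed convention: a module is covered only if test_<module>.py exists.
        if not (tests_dir / f"test_{module.name}").exists():
            with module.open(encoding="utf-8") as handle:
                line_count = sum(1 for _ in handle)
            gaps.append((module.name, line_count))
    # Largest untested modules first, mirroring the ordering of the table.
    return sorted(gaps, key=lambda item: item[1], reverse=True)
```

Calling it with the package directory and the tests directory returns the untested modules in descending size order. A name-based check like this misses indirect coverage (cli.py exercised through test_init.py, for instance), so a line-level tool such as coverage.py would give a more precise picture.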
Existing Test Coverage (62 tests, 8 files)
| Test File | Tests | What It Covers |
| --- | --- | --- |
| test_pipeline.py | 20 | End-to-end pipeline: ingest, reingest, rendering, graph, manifests |
| test_parsers.py | 13 | Document parsing: chat, Instagram, Twitter, Reddit, DOCX, PDF |
| test_init.py | 9 | CLI init/doctor commands, OpenAI backend contract |
| test_signal_handling.py | 7 | Export classification, AI trace detection |
| test_source_detection.py | 6 | Family detection from paths and archives |
| test_npm_wrapper.py | 4 | NPM wrapper subprocess handling |
| test_models.py | 2 | Config schema and domain model validation |
| test_fixtures.py | 1 | Golden fixture loading |
Priority Recommendations
- rendering.py (930 lines, HIGH risk): Add tests for render_project, export_obsidian, ascii_tree, mermaid_tree. These produce the user-facing output and any bug here is immediately visible. Test with various graph structures (empty, single-node, deep tree, cycle).
- storage.py (517 lines, HIGH risk): Add integration tests for the Storage class. Test CRUD operations, query patterns, and error handling for missing/corrupt data. Use an in-memory SQLite fixture or mock the database layer.
- llm.py (506 lines, HIGH risk): Add contract tests for extract_evidence, canonicalize, score_skill. Mock the HTTP backend to test parsing of valid/invalid/malformed LLM responses. The existing test_init.py tests some OpenAI backend handling but doesn't cover the full extraction→canonicalization→scoring pipeline.
- pipeline_support.py (163 lines, MEDIUM risk): Unit test support_score, should_create_node, build_skill_node. These are pure functions with deterministic logic that are easy to test in isolation.
- utils.py (45 lines, LOW risk but easy win): Test slugify, short_hash, file_sha256, skill_id with edge cases (empty strings, unicode, special characters). Quick to implement, high confidence gain.
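As an illustration of the utils.py recommendation, the sketch below shows what table-driven edge-case tests could look like. The `slugify` and `short_hash` bodies here are hypothetical stand-ins written only so the example runs on its own. The real traccia helpers may use different signatures and produce different output, in which case the expected values would need to be adjusted after importing the actual functions.

```python
import hashlib
import re
import unicodedata


# Hypothetical stand-ins for illustration only; NOT the traccia.utils implementations.
def slugify(text: str) -> str:
    normalized = unicodedata.normalize("NFKD", text)
    ascii_text = normalized.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def short_hash(text: str, length: int = 8) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


# Edge cases named in the recommendation: empty strings, unicode, special characters.
SLUG_CASES = [
    ("", ""),
    ("Machine Learning", "machine-learning"),
    ("  C++ / Rust!  ", "c-rust"),
    ("Café Société", "cafe-societe"),
    ("---", ""),
]


def test_slugify_edge_cases() -> None:
    for raw, expected in SLUG_CASES:
        assert slugify(raw) == expected, (raw, slugify(raw))


def test_slugify_is_idempotent() -> None:
    # Slugifying an already-slugified value should change nothing.
    for raw, _ in SLUG_CASES:
        once = slugify(raw)
        assert slugify(once) == once


def test_short_hash_is_deterministic_and_fixed_length() -> None:
    assert short_hash("python") == short_hash("python")
    assert short_hash("python") != short_hash("Python")
    assert len(short_hash("")) == 8
```

The functions follow the `test_*` naming that pytest collects automatically, and they also run as plain function calls. Keeping the cases in a single list makes it cheap to add a new input whenever a bug is found.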
Estimated Effort
| Priority | Module | Estimated Tests | Effort |
| --- | --- | --- | --- |
| P0 | utils.py | 8-10 | 1 hour |
| P0 | pipeline_support.py | 10-12 | 2 hours |
| P1 | rendering.py | 15-20 | 4 hours |
| P1 | storage.py | 12-15 | 4 hours |
| P2 | llm.py | 12-15 | 3 hours |
| P2 | family_normalizer.py | 8-10 | 2 hours |
| P3 | config.py, taxonomy.py, extraction.py | 10-12 | 2 hours |
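The report does not state a total, but one follows directly from the rows above. The snippet below adds up the estimates. The tuple layout and the `summarize` helper are illustrative choices, and the numbers are copied from the table.

```python
# (module group, min tests, max tests, hours), copied from the effort table.
ESTIMATES = [
    ("utils.py", 8, 10, 1),
    ("pipeline_support.py", 10, 12, 2),
    ("rendering.py", 15, 20, 4),
    ("storage.py", 12, 15, 4),
    ("llm.py", 12, 15, 3),
    ("family_normalizer.py", 8, 10, 2),
    ("config.py, taxonomy.py, extraction.py", 10, 12, 2),
]


def summarize(estimates):
    """Return (min tests, max tests, total hours) across all rows."""
    low = sum(row[1] for row in estimates)
    high = sum(row[2] for row in estimates)
    hours = sum(row[3] for row in estimates)
    return low, high, hours


low, high, hours = summarize(ESTIMATES)
print(f"{low}-{high} tests, about {hours} hours")  # 75-94 tests, about 18 hours
```

Delivering the full plan would therefore roughly double the current suite of 62 tests, for about 18 hours of estimated work.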
Generated by nightshift — autonomous code quality bot.