[nightshift] test-gap: 4,130 lines across 12 modules lack test coverage #21

## Test Coverage Gap Analysis

**Module:** Microck/traccia
**Task:** test-gap
**Severity:** Medium — 4,130 lines of untested source code across 12 modules

---

### Summary

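Only 5 of 17 source modules (29%) have dedicated test files. The remaining 12 modules, totaling 4,130 lines of code, have **no dedicated tests**, although some are exercised indirectly (for example, `cli.py` through the `test_init.py` integration tests and rendering through the end-to-end cases in `test_pipeline.py`). The existing test suite (62 tests across 8 files) covers parsing, pipeline orchestration, models, source detection, and the CLI init commands well, but leaves critical business logic without direct unit tests.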

### Coverage Map

| Source Module | Lines | Has Tests | Public Symbols | Risk |
|---|---|---|---|---|
| `rendering.py` | 930 | ✗ | `render_project`, `export_obsidian`, `ascii_tree`, `mermaid_tree` | **HIGH** — Largest untested module, output correctness is critical |
| `storage.py` | 517 | ✗ | `Storage` class with 20+ methods | **HIGH** — Data persistence layer, all CRUD untested |
| `llm.py` | 506 | ✗ | `extract_evidence`, `canonicalize`, `score_skill`, `healthcheck` | **HIGH** — Core AI pipeline, no backend contract tests |
| `family_normalizer.py` | 449 | ✗ | `normalize_family_content`, `rendered_text` | **MEDIUM** — HTML normalization logic |
| `cli.py` | 432 | ✗ | 26 CLI commands (`init`, `lint`, `doctor`, `add`, `ingest`, etc.) | **MEDIUM** — Partially tested via `test_init.py` integration tests |
| `bootstrap.py` | 390 | ✗ | `RepoInitializer`, `initialize` | **MEDIUM** — Project setup logic |
| `document_normalizer.py` | 287 | ✗ | `normalize_document` | **MEDIUM** — Document processing |
| `extraction.py` | 192 | ✗ | `extract_evidence` | **MEDIUM** — Evidence extraction |
| `pipeline_support.py` | 163 | ✗ | `support_score`, `should_create_node`, `build_skill_node` | **MEDIUM** — Skill graph construction |
| `taxonomy.py` | 123 | ✗ | `DomainDefinition`, `SkillDefinition`, `match_skill_names` | **LOW** — Schema definitions |
| `config.py` | 96 | ✗ | `TracciaPaths`, `ThresholdConfig`, `load_config` | **LOW** — Config loading |
| `utils.py` | 45 | ✗ | `slugify`, `short_hash`, `file_sha256`, `skill_id` | **LOW** — Utility functions |
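
A map like the one above can be approximated by pairing each source file with a `test_<name>.py` file and counting lines for the unmatched ones. The sketch below is one possible way to do that, not necessarily how nightshift computes it; the function name `find_untested_modules` and the directory layout are assumptions made for illustration.

```python
from pathlib import Path


def find_untested_modules(src_dir: Path, tests_dir: Path) -> dict[str, int]:
    """Map each source module lacking a test_<name>.py file to its line count."""
    # "test_models.py" -> "models.py", so names can be compared directly.
    tested = {p.name.removeprefix("test_") for p in tests_dir.glob("test_*.py")}
    gaps: dict[str, int] = {}
    for module in sorted(src_dir.glob("*.py")):
        if module.name == "__init__.py" or module.name in tested:
            continue
        with module.open(encoding="utf-8") as handle:
            gaps[module.name] = sum(1 for _ in handle)
    return gaps
```

Calling `find_untested_modules(Path("src/traccia"), Path("tests"))` (both paths hypothetical) would return a dictionary of unmatched modules and their sizes. Name matching is only a heuristic: it says nothing about modules that are covered indirectly by integration tests, so a line-level tool such as coverage.py is still needed for an exact picture.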

### Existing Test Coverage (62 tests, 8 files)

| Test File | Tests | What It Covers |
|---|---|---|
| `test_pipeline.py` | 20 | End-to-end pipeline: ingest, reingest, rendering, graph, manifests |
| `test_parsers.py` | 13 | Document parsing: chat, Instagram, Twitter, Reddit, DOCX, PDF |
| `test_init.py` | 9 | CLI init/doctor commands, OpenAI backend contract |
| `test_signal_handling.py` | 7 | Export classification, AI trace detection |
| `test_source_detection.py` | 6 | Family detection from paths and archives |
| `test_npm_wrapper.py` | 4 | NPM wrapper subprocess handling |
| `test_models.py` | 2 | Config schema and domain model validation |
| `test_fixtures.py` | 1 | Golden fixture loading |

### Priority Recommendations

1. **`rendering.py`** (930 lines, HIGH risk) — Add tests for `render_project`, `export_obsidian`, `ascii_tree`, `mermaid_tree`. These produce the user-facing output and any bug here is immediately visible. Test with various graph structures (empty, single-node, deep tree, cycle).

2. **`storage.py`** (517 lines, HIGH risk) — Add integration tests for the `Storage` class. Test CRUD operations, query patterns, and error handling for missing/corrupt data. Use an in-memory SQLite fixture or mock the database layer.

3. **`llm.py`** (506 lines, HIGH risk) — Add contract tests for `extract_evidence`, `canonicalize`, `score_skill`. Mock the HTTP backend to test parsing of valid/invalid/malformed LLM responses. The existing `test_init.py` tests some OpenAI backend handling but doesn't cover the full extraction→canonicalization→scoring pipeline.

4. **`pipeline_support.py`** (163 lines, MEDIUM risk) — Unit test `support_score`, `should_create_node`, `build_skill_node`. These are pure functions with deterministic logic that are easy to test in isolation.

5. **`utils.py`** (45 lines, LOW risk but easy win) — Test `slugify`, `short_hash`, `file_sha256`, `skill_id` with edge cases (empty strings, unicode, special characters). Quick to implement, high confidence gain.
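
Recommendations 2 and 3 can be sketched as follows. This issue does not show the real `Storage` API or the LLM client interface, so `InMemoryStorage`, `parse_evidence_response`, `fetch_evidence`, and `client.complete` are hypothetical stand-ins; only the testing pattern (an in-memory SQLite connection and a mocked backend fed valid and malformed responses) is meant to carry over.

```python
import json
import sqlite3
from unittest import mock


class InMemoryStorage:
    """Hypothetical stand-in for Storage; the real schema is not shown in this issue."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS skills (id TEXT PRIMARY KEY, name TEXT NOT NULL)"
        )

    def upsert_skill(self, skill_id: str, name: str) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO skills (id, name) VALUES (?, ?)", (skill_id, name)
        )

    def get_skill(self, skill_id: str):
        row = self.conn.execute(
            "SELECT name FROM skills WHERE id = ?", (skill_id,)
        ).fetchone()
        return row[0] if row else None


def parse_evidence_response(raw: str) -> list:
    """Hypothetical parser: return the 'evidence' list, or [] for malformed output."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    evidence = payload.get("evidence")
    return evidence if isinstance(evidence, list) else []


def fetch_evidence(client, text: str) -> list:
    """Hypothetical wrapper; client.complete stands for the call that reaches the backend."""
    return parse_evidence_response(client.complete(text))


def test_storage_round_trip_and_missing_row():
    # ":memory:" gives each test a fresh database with no files to clean up.
    storage = InMemoryStorage(sqlite3.connect(":memory:"))
    storage.upsert_skill("python", "Python")
    storage.upsert_skill("python", "Python 3")  # second write replaces the first
    assert storage.get_skill("python") == "Python 3"
    assert storage.get_skill("missing") is None


def test_fetch_evidence_handles_valid_and_malformed_responses():
    client = mock.Mock()  # no network traffic; responses are scripted below
    client.complete.return_value = '{"evidence": [{"skill": "sql"}]}'
    assert fetch_evidence(client, "doc") == [{"skill": "sql"}]
    for bad in ("not json", "[]", '{"evidence": "oops"}'):
        client.complete.return_value = bad
        assert fetch_evidence(client, "doc") == []
```

The test functions use plain `assert` statements and `test_` prefixes, so pytest would collect them as written. In the real suite the stand-ins would be replaced by imports from `storage.py` and `llm.py`, with the mock attached at whatever boundary the backend call actually crosses.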

### Estimated Effort

| Priority | Module | Estimated Tests | Effort |
|---|---|---|---|
| P0 | `utils.py` | 8-10 | 1 hour |
| P0 | `pipeline_support.py` | 10-12 | 2 hours |
| P1 | `rendering.py` | 15-20 | 4 hours |
| P1 | `storage.py` | 12-15 | 4 hours |
| P2 | `llm.py` | 12-15 | 3 hours |
| P2 | `family_normalizer.py` | 8-10 | 2 hours |
| P3 | `config.py`, `taxonomy.py`, `extraction.py` | 10-12 | 2 hours |
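
For planning purposes, the rows above can be totaled with a few lines of Python. The figures are copied from the table; the tuple layout and the `summarize` helper are illustrative only. Note that the table estimates nine of the twelve untested modules: `cli.py`, `bootstrap.py`, and `document_normalizer.py` have no row.

```python
# (module group, low test estimate, high test estimate, hours) from the effort table
ESTIMATES = [
    ("utils.py", 8, 10, 1),
    ("pipeline_support.py", 10, 12, 2),
    ("rendering.py", 15, 20, 4),
    ("storage.py", 12, 15, 4),
    ("llm.py", 12, 15, 3),
    ("family_normalizer.py", 8, 10, 2),
    ("config.py, taxonomy.py, extraction.py", 10, 12, 2),
]


def summarize(rows):
    """Return (total low estimate, total high estimate, total hours)."""
    low = sum(row[1] for row in rows)
    high = sum(row[2] for row in rows)
    hours = sum(row[3] for row in rows)
    return low, high, hours


print(summarize(ESTIMATES))  # (75, 94, 18)
```

That is roughly 75 to 94 new tests in about 18 hours for the modules that have an estimate.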

---

*Generated by [nightshift](https://github.com/marcus/nightshift) — autonomous code quality bot.*

