nightshift: tech-debt-classify analysis #28

@nightshift-micr

nightshift: Tech Debt Classification

Repository: Microck/traccia
Task: tech-debt-classify
Category: options
Date: 2026-04-23


Summary

Classified tech debt across 15 source modules (~260KB source). Traccia is a Python 3.12+ local-first skill graph compiler using Pydantic v2, Typer CLI, and SQLite storage. The codebase is well-structured but has several areas of accumulated debt.

Debt Classification

🔴 HIGH — Model surface area (src/traccia/models.py, 296 lines)

Issue: models.py contains 20 enum types and 12 model classes in a single file. This is the most imported module in the codebase.

Risk: Because most of the codebase imports this module, any change to an enum or model can ripple into many call sites. The file also has no clear internal organization.

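Recommendation: Split into src/traccia/enums.py (all 20 StrEnum types) and src/traccia/models.py (data models only). Re-export the enums from models.py (and from __init__.py, if they are exposed there) so that existing `from traccia.models import ...` statements keep working.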

Estimated effort: 1-2 hours.

🔴 HIGH — God module: pipeline.py (1251 lines, 51KB)

Issue: The pipeline module handles discovery, ingestion, extraction, canonicalization, scoring, rendering orchestration, and export — all in one file.

Risk: Hard to test individual stages in isolation. High cognitive load for any change.

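Recommendation: Split into stage modules: pipeline/discover.py, pipeline/extract.py, pipeline/canonicalize.py, pipeline/score.py, pipeline/render.py, pipeline/export.py. Keep the orchestration logic as a thin layer on top. Note that a pipeline/ package and a pipeline.py module cannot both be importable as traccia.pipeline, so the orchestrator would move into pipeline/__init__.py (or a module inside the package), with the current public names re-exported from there.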

Estimated effort: 4-8 hours.

🟡 MEDIUM — LLM backend has 19 raise statements with inconsistent error types

Issue: src/traccia/llm.py (538 lines) raises BackendError, TimeoutError, subprocess.SubprocessError, json.JSONDecodeError, and a private _HttpResponseError. No clean hierarchy.

Risk: CLI error handling catches broad Exception because there's no clean error hierarchy to match on.

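Recommendation: Make BackendError the root of a small hierarchy (BackendConnectionError, BackendResponseError, BackendAPIError, BackendConfigError) and translate the other exception types into these subclasses at the point where they occur.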
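A sketch of the proposed hierarchy, assuming BackendError becomes the common base class. The subclass names come from the recommendation above; parse_completion and describe_failure are hypothetical helpers used only to show how a low-level exception gets translated and how a caller can then match on the base class instead of Exception.

```python
import json


class BackendError(Exception):
    """Base class for every failure raised by the LLM backend."""


class BackendConnectionError(BackendError):
    """Network failure or timeout while reaching the backend."""


class BackendResponseError(BackendError):
    """The backend answered, but the payload could not be used."""


class BackendAPIError(BackendError):
    """The backend returned an explicit API-level error."""


class BackendConfigError(BackendError):
    """Missing or invalid backend configuration."""


def parse_completion(raw: str) -> dict:
    """Hypothetical helper: decode a response body, translating JSON errors."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as err:
        # Chain the original exception so the traceback keeps the root cause.
        raise BackendResponseError(f"backend returned invalid JSON: {err}") from err


def describe_failure(raw: str) -> str:
    """Hypothetical CLI-side handler: one except clause covers all backend errors."""
    try:
        parse_completion(raw)
    except BackendError as err:
        return type(err).__name__
    return "ok"
```

Callers that need finer control can still catch a specific subclass, for example retrying only on BackendConnectionError.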

Estimated effort: 3-4 hours.

🟡 MEDIUM — parsers.py (700+ lines) handles 13 source types in one module

Issue: All parsers live in a single file with deeply nested if/elif chains.

Risk: Adding a new source type requires modifying the same large file. Test isolation is difficult.

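Recommendation: Use a registry pattern with @register_parser(SourceType.MARKDOWN) decorators, so that each parser registers itself and dispatch becomes a dictionary lookup instead of an if/elif chain.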

Estimated effort: 4-6 hours.

🟡 MEDIUM — Bare error propagation in storage.py

Issue: Many storage methods let the underlying exception propagate unchanged, with no added context about which table, record, or operation failed.

Recommendation: Wrap errors with operation context: raise StorageError(f"failed to upsert skill node {node.skill_id}: {err}") from err
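A sketch of that wrapping using the standard sqlite3 module. The StorageError class matches the name used in the recommendation; the skill_nodes table, its columns, and the upsert_skill_node signature are hypothetical.

```python
import sqlite3


class StorageError(Exception):
    """Raised when a storage operation fails; carries operation context."""


def upsert_skill_node(conn: sqlite3.Connection, skill_id: str, name: str) -> None:
    """Insert or replace one skill node, wrapping any SQLite failure."""
    try:
        conn.execute(
            "INSERT OR REPLACE INTO skill_nodes (skill_id, name) VALUES (?, ?)",
            (skill_id, name),
        )
    except sqlite3.Error as err:
        # Keep the original exception chained so the root cause stays visible.
        raise StorageError(
            f"failed to upsert skill node {skill_id}: {err}"
        ) from err


# Usage against an in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE skill_nodes (skill_id TEXT PRIMARY KEY, name TEXT)")
upsert_skill_node(conn, "py-1", "Python")
upsert_skill_node(conn, "py-1", "Python 3")  # same key: replaces the row
rows = conn.execute("SELECT skill_id, name FROM skill_nodes").fetchall()
```

Because of `from err`, the traceback still shows the original sqlite3 exception underneath the StorageError message.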

Estimated effort: 2-3 hours.

🟢 LOW — Ruff config only enables F (Pyflakes) and I (isort) rules

Issue: pyproject.toml line 54: select = ["F", "I"]. This misses flake8-bugbear (B), flake8-comprehensions (C4), flake8-simplify (SIM), pyupgrade (UP), etc.

Recommendation: Enable ["F", "I", "B", "SIM", "C4", "UP"] and fix violations incrementally.
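To illustrate the kind of issue the extra rule sets surface, here are two generic patterns, not taken from the traccia codebase: a mutable default argument (reported by flake8-bugbear as B006) and a generator wrapped in list() (reported by flake8-comprehensions as C400). The function names are made up for the example.

```python
# B006: the default list is created once, at definition time, and is then
# shared by every call that omits the argument.
def add_tag_buggy(tag, tags=[]):
    tags.append(tag)
    return tags


# Fixed version: use None as the sentinel and build a fresh list per call.
def add_tag(tag, tags=None):
    tags = [] if tags is None else tags
    tags.append(tag)
    return tags


# C400: list(<generator>) can be written as a list comprehension.
names_flagged = list(n.upper() for n in ["a", "b"])
names_fixed = [n.upper() for n in ["a", "b"]]
```

The first pattern is a real bug source, while the second is purely stylistic, which is one reason fixing violations incrementally, rule set by rule set, tends to be manageable.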

Estimated effort: 1-2 hours.

🟢 LOW — No type checking in CI

Issue: No mypy/pyright config. CI doesn't run type checking. With Pydantic v2 models, type coverage would catch schema drift early.

Recommendation: Add pyright or mypy to dev dependencies and CI.

Estimated effort: 2-3 hours.

Debt Summary

| Priority | Count | Estimated total effort |
|---|---|---|
| 🔴 HIGH | 2 | 5-10 hours |
| 🟡 MEDIUM | 3 | 9-13 hours |
| 🟢 LOW | 2 | 3-5 hours |
| Total | 7 items | 17-28 hours |

Recommended Order

  1. Split models.py enums (low risk, high impact, quick win)
  2. Extend ruff rules (quick win)
  3. Split pipeline.py into stage modules (highest value)
  4. Improve error hierarchy in llm.py
  5. Refactor parsers.py to registry pattern
  6. Add type checking to CI
