Review comment signal-to-noise ratio across codebase #97

@flipbit03

Description

Context

PR #88 surfaced an issue: an LLM-assisted contributor's tooling auto-removed comments it deemed "self-descriptive code." Some of those comments were contextually important — they document avenues explored during development, design rationale, or non-obvious behavior that isn't self-evident from the code alone.

This is likely to become a recurring pattern as more LLM-assisted contributions come in. LLM coding harnesses tend to strip comments aggressively under a "clean code" heuristic, without understanding which comments carry architectural or historical context.

Problem

There's no established guideline for what comments should exist in the codebase. This makes it:

  • Hard for contributors (human or LLM-assisted) to know what to keep vs remove
  • Easy for drive-by "cleanup" PRs to silently drop important context
  • Difficult to review comment removals without re-deriving the original reasoning

Proposal

  1. Audit existing comments in tests (crates/lineark/tests/) and core modules — which ones document non-obvious behavior, design decisions, or API quirks vs which are truly redundant?
  2. Establish a comment policy for the project, e.g.:
    • Keep comments that explain why, not what
    • Keep comments that document Linear API quirks or workarounds
    • Keep comments on test cases that explain what scenario is being validated and why
    • Remove comments that merely restate the code
  3. Add the policy to CLAUDE.md so both human contributors and LLM harnesses respect it
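To make the policy concrete, here is a hypothetical Rust sketch (the function and comments are illustrative, not taken from the lineark codebase) showing the kind of comment the policy would keep versus the kind it would remove:

```rust
/// Parses an issue identifier like "ENG-123" into its team key and number.
fn parse_identifier(id: &str) -> Option<(&str, u32)> {
    // KEEP ("why"): identifiers are assumed to be "<TEAM>-<number>" with a
    // single hyphen; returning Option instead of panicking lets callers
    // handle malformed input from external sources gracefully.
    let (team, num) = id.split_once('-')?;
    let num: u32 = num.parse().ok()?;
    // REMOVE ("what"): a comment like `// return the tuple` here would
    // merely restate the code and should be dropped under the policy.
    Some((team, num))
}

fn main() {
    assert_eq!(parse_identifier("ENG-123"), Some(("ENG", 123)));
    assert_eq!(parse_identifier("not-a-number"), None);
}
```

The first comment documents a design decision (graceful handling of malformed input) that isn't evident from the code alone, so a cleanup pass should preserve it; the second kind adds no information and is safe to remove.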
