Symbol usage index: find-usages, callers-of, impact analysis (#78)

## Goal

Build a **symbol usage index** — every public class, interface, method, and configuration key in the project gets a queryable record of where it's used. Exposed via CLI and as the `framework__find_usages` MCP tool. Refactoring without this is scary; with it, refactoring becomes routine.

## Why

Agents struggle with refactors because they can't confidently answer "what depends on X?" Grep produces noise (matches in comments, docblocks, unrelated identifiers); manual reading scales badly. A real index — built from PHP-Parser's AST + the framework's spec/config awareness — gives precise, structured answers in milliseconds.

Specifically, the index knows:

- Class usage: `new X(...)`, `extends X`, `implements X`, `X::class`, return/param/property type hints, attribute classes
- Interface implementations and extenders
- Method usage: `$obj->method()`, `Class::method()`, callable references `[$obj, 'method']` and `$obj->method(...)`
- Property reads/writes
- Constant references
- Configuration keys referenced from env / spec / code

Crucially, it **also knows about the framework's higher-level constructs**:

- Which spec files declare endpoints that resolve to which Action classes
- Which routes match which middleware
- Which entities are referenced from which specs
- Which events have which listeners

So the agent can ask: "if I rename `User`, what do I need to change?" and get a complete answer.

## Index format

A single SQLite database at `.altair/index.db`. Schema:

```sql
CREATE TABLE symbols (
    id INTEGER PRIMARY KEY,
    fqn TEXT NOT NULL UNIQUE,           -- e.g. "App\User\User", "App\User\User::register"
    kind TEXT NOT NULL,                  -- 'class' | 'interface' | 'trait' | 'enum' | 'method' | 'property' | 'constant'
    file TEXT NOT NULL,
    line INTEGER NOT NULL,
    visibility TEXT,                     -- 'public' | 'protected' | 'private' | null for classes
    is_readonly INTEGER DEFAULT 0,
    is_static INTEGER DEFAULT 0
);

CREATE TABLE usages (
    id INTEGER PRIMARY KEY,
    symbol_id INTEGER REFERENCES symbols(id),
    used_in_file TEXT NOT NULL,
    used_in_line INTEGER NOT NULL,
    usage_kind TEXT NOT NULL,            -- 'new' | 'extends' | 'implements' | 'type_hint' | 'call' | 'property_read' | 'property_write' | 'attribute' | 'spec_endpoint' | 'spec_entity' | 'route_middleware'
    context TEXT                         -- optional context (e.g. enclosing method)
);

CREATE INDEX usages_by_symbol ON usages(symbol_id);
CREATE INDEX symbols_by_kind ON symbols(kind);

CREATE TABLE meta (
    key TEXT PRIMARY KEY,
    value TEXT
);
-- meta entries: last_built_at, framework_version, file_hashes (for incremental rebuild)
```

SQLite chosen because:
- Zero setup (no daemon)
- Fast queries even with 100k+ symbols
- Easy to ship in `.altair/` (gitignored — it's a derived artifact)
- Inspectable by humans with `sqlite3` CLI
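
To make the schema concrete, here is a minimal sketch of the lookup that `find-usages` could run. It is written in Python against an in-memory database purely for illustration; the real implementation would presumably live in `SqliteIndexStorage.php`. The `find_usages` helper, the trimmed schema, and the sample rows are assumptions, not part of the spec.

```python
import sqlite3

# Trimmed copy of the schema above (flag columns and the meta table omitted).
SCHEMA = """
CREATE TABLE symbols (
    id INTEGER PRIMARY KEY,
    fqn TEXT NOT NULL UNIQUE,
    kind TEXT NOT NULL,
    file TEXT NOT NULL,
    line INTEGER NOT NULL
);
CREATE TABLE usages (
    id INTEGER PRIMARY KEY,
    symbol_id INTEGER REFERENCES symbols(id),
    used_in_file TEXT NOT NULL,
    used_in_line INTEGER NOT NULL,
    usage_kind TEXT NOT NULL,
    context TEXT
);
CREATE INDEX usages_by_symbol ON usages(symbol_id);
"""

USER = r"App\User\User"  # raw string so the backslashes stay literal


def find_usages(db, fqn, kind=None):
    """Return usage rows for one symbol, optionally filtered by usage_kind."""
    sql = (
        "SELECT u.used_in_file, u.used_in_line, u.usage_kind, u.context "
        "FROM usages u JOIN symbols s ON s.id = u.symbol_id "
        "WHERE s.fqn = ?"
    )
    params = [fqn]
    if kind is not None:
        sql += " AND u.usage_kind = ?"
        params.append(kind)
    sql += " ORDER BY u.used_in_file, u.used_in_line"
    return [
        {"file": f, "line": n, "usage_kind": k, "context": c}
        for f, n, k, c in db.execute(sql, params)
    ]


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.execute(
    "INSERT INTO symbols (id, fqn, kind, file, line) VALUES (1, ?, 'class', ?, 9)",
    (USER, "app/User/User.php"),
)
# Hypothetical sample usages of the User class.
db.executemany(
    "INSERT INTO usages (symbol_id, used_in_file, used_in_line, usage_kind, context) "
    "VALUES (1, ?, ?, ?, ?)",
    [
        ("app/User/CreateUser.php", 21, "new", "CreateUser::__invoke"),
        ("app/User/UserRepository.php", 14, "type_hint", "UserRepository::save"),
        ("api/users/create.yaml", 3, "spec_entity", None),
    ],
)

for row in find_usages(db, USER):
    print(row)
```

The returned dictionaries use the same `{file, line, usage_kind, context}` shape that `framework__find_usages` is specified to return, so the CLI and the MCP tool could share one query.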

## CLI surface

```bash
bin/altair index build                                    # full rebuild
bin/altair index build --incremental                      # only changed files
bin/altair index find-usages "App\\User\\User"           # all usages
bin/altair index find-usages "App\\User\\User::register"  # method usages only
bin/altair index implements "Altair\\Http\\Contracts\\MiddlewareInterface"
bin/altair index extends "App\\Base\\Entity"
bin/altair index callers-of "App\\User\\CreateUser::__invoke"
bin/altair index unused                                   # symbols with zero usages (dead code candidates)
bin/altair index orphans                                  # specs without routes, routes without specs, etc.
```

Passing `--format=json` to any of these commands returns structured output for the MCP tools.
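
As an illustration of how `index unused` might be answered from the schema, the sketch below selects symbols that have no matching row in `usages`. It is Python against an in-memory database for brevity; the `unused_symbols` helper, the reduced two-table schema, and the `LegacyImporter` class are hypothetical.

```python
import sqlite3


def unused_symbols(db):
    """Return FQNs of symbols with no recorded usage (dead code candidates)."""
    rows = db.execute(
        "SELECT s.fqn FROM symbols s "
        "LEFT JOIN usages u ON u.symbol_id = s.id "
        "WHERE u.id IS NULL "
        "ORDER BY s.fqn"
    )
    return [fqn for (fqn,) in rows]


db = sqlite3.connect(":memory:")
# Reduced schema: only the columns this query touches.
db.executescript(
    """
    CREATE TABLE symbols (id INTEGER PRIMARY KEY, fqn TEXT NOT NULL UNIQUE, kind TEXT NOT NULL);
    CREATE TABLE usages (id INTEGER PRIMARY KEY, symbol_id INTEGER REFERENCES symbols(id),
                         usage_kind TEXT NOT NULL);
    """
)
db.executemany(
    "INSERT INTO symbols (id, fqn, kind) VALUES (?, ?, ?)",
    [
        (1, r"App\User\User", "class"),
        (2, r"App\User\LegacyImporter", "class"),  # hypothetical, never referenced
    ],
)
db.execute("INSERT INTO usages (symbol_id, usage_kind) VALUES (1, 'new')")

dead = unused_symbols(db)
print(dead)
```

The results are candidates rather than verdicts: a symbol reached only through reflection or a string-based container lookup leaves no row in `usages`, so it would appear here even though it is live.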

## MCP tools

| Tool | Inputs | Returns |
|---|---|---|
| `framework__find_usages` | `symbol: string, kind?: string` | List of `{file, line, usage_kind, context}` |
| `framework__implementers` | `interface: string` | List of classes implementing the interface |
| `framework__callers` | `method: string` | List of code locations calling the method |
| `framework__dead_code` | — | List of symbols with zero recorded usages |
| `framework__impact` | `symbols: string[]` | Aggregate impact across tests, specs, and other files |

`framework__impact` is the key one for refactoring confidence. Given a set of symbols the agent plans to change, it returns:

```json
{
  "symbols": ["App\\User\\User"],
  "impact": {
    "files": 23,
    "tests": 8,
    "specs": 3,
    "estimated_test_runtime_ms": 1240
  },
  "by_file": [...],
  "tests_to_run": ["tests/Http/Actions/CreateUserActionTest.php", "..."],
  "specs_affected": ["api/users/create.yaml", "api/posts/list.yaml"]
}
```

The agent treats `tests_to_run` as an optimisation: it runs only those tests before declaring success.
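
One way the aggregate could be derived from usage rows is sketched below in Python. The classification rules are assumptions: a file counts as a test if it sits under `tests/`, and as a spec if it is a YAML file under `api/`. The `summarise_impact` name is hypothetical, and `estimated_test_runtime_ms` is left out because the spec does not say where that figure comes from.

```python
def summarise_impact(symbols, usages):
    """Collapse usage rows into the counts and file lists of an impact report."""
    files = sorted({u["file"] for u in usages})
    # Assumed conventions: tests live under tests/, specs are YAML files under api/.
    tests = [f for f in files if f.startswith("tests/")]
    specs = [f for f in files if f.startswith("api/") and f.endswith((".yaml", ".yml"))]
    return {
        "symbols": list(symbols),
        "impact": {"files": len(files), "tests": len(tests), "specs": len(specs)},
        "tests_to_run": tests,
        "specs_affected": specs,
    }


# Hypothetical usage rows, shaped like the output of framework__find_usages.
rows = [
    {"file": "app/User/CreateUser.php", "line": 21, "usage_kind": "new"},
    {"file": "app/User/CreateUser.php", "line": 40, "usage_kind": "type_hint"},
    {"file": "tests/Http/Actions/CreateUserActionTest.php", "line": 12, "usage_kind": "new"},
    {"file": "api/users/create.yaml", "line": 3, "usage_kind": "spec_entity"},
]

report = summarise_impact([r"App\User\User"], rows)
print(report["impact"])
```

In this sketch `files` counts every distinct file, including tests and specs; whether the real report should count them separately is a decision the spec leaves open.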

## Index building

PHP-Parser walks every PHP file under `src/`, `app/`, and `tests/`. The framework-aware augmentations:

- Parse YAML specs under `api/` to add `spec_endpoint` and `spec_entity` usages
- Parse `config/routes.php` (or whatever routes manifest the project uses) for middleware references
- Parse `config/configurations.php` for container bindings

The index gets stale fast during a vibe-coding session. **Incremental rebuild** is essential:

- Track file content hashes in `meta.file_hashes`
- On `index build --incremental`, only reparse files whose hashes changed
- Re-link usages for affected symbols

For 1000-file projects, an incremental rebuild should take under 500ms and a full rebuild under 5 seconds.
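
The hash comparison behind an incremental rebuild can be sketched as follows, in Python for illustration. The `snapshot` and `plan_incremental` helpers, the choice of SHA-256, and the sample files are assumptions; the stored map stands in for `meta.file_hashes`.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root):
    """Map each PHP file under root (as a relative path) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.php"))
    }


def plan_incremental(previous, current):
    """Compare two {path: hash} maps; return (files to reparse, files to drop)."""
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    removed = sorted(p for p in previous if p not in current)
    return changed, removed


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    src.mkdir()
    (src / "User.php").write_text("<?php class User {}")
    (src / "Post.php").write_text("<?php class Post {}")
    (src / "Old.php").write_text("<?php class Old {}")
    before = snapshot(tmp)  # what a previous build would have stored

    (src / "User.php").write_text("<?php final class User {}")  # edited
    (src / "Old.php").unlink()                                   # deleted
    (src / "Tag.php").write_text("<?php class Tag {}")           # added
    changed, removed = plan_incremental(before, snapshot(tmp))

print("reparse:", changed)
print("drop:", removed)
```

New files show up in `changed` because they have no previous hash. A rebuild along these lines would delete the `symbols` and `usages` rows recorded for every path in either list, then reparse only the `changed` ones.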

## Auto-rebuild

The MCP server can opt to auto-trigger an incremental rebuild before every `find_usages` / `impact` call, at the cost of latency. Default: trigger if `last_built_at` is older than 60s OR an mtime scan detects file changes.
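
The default trigger can be read as a small predicate. The Python sketch below is one possible interpretation; the `needs_rebuild` name and its parameters are hypothetical, and it assumes `last_built_at` is stored as a Unix timestamp.

```python
import os
import tempfile
import time


def needs_rebuild(last_built_at, paths, now=None, max_age_s=60.0):
    """True if the index is older than max_age_s or any file changed after the build."""
    now = time.time() if now is None else now
    if now - last_built_at > max_age_s:
        return True
    # mtime scan: any file modified after the last build makes the index stale.
    return any(os.path.getmtime(p) > last_built_at for p in paths)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "User.php")
    with open(path, "w") as fh:
        fh.write("<?php class User {}")

    os.utime(path, (1000, 1000))  # file last modified before the build at t=2000
    fresh = needs_rebuild(2000, [path], now=2010)
    aged = needs_rebuild(2000, [path], now=2100)

    os.utime(path, (2005, 2005))  # file modified after the build
    touched = needs_rebuild(2000, [path], now=2010)

print(fresh, aged, touched)
```

An mtime scan is cheap but only detects changes to files that still exist, so a deleted file goes unnoticed here; the hash-based incremental build would pick up the deletion once it runs.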

Alternative: a long-lived watcher process (`bin/altair index watch`) that rebuilds on file change. This is optional and left as a follow-up; v1 ships with on-demand rebuilds only.

## Shape

```
src/Altair/Index/
├── Cli/
│   ├── BuildCommand.php
│   ├── FindUsagesCommand.php
│   ├── ImplementsCommand.php
│   ├── ExtendsCommand.php
│   ├── CallersOfCommand.php
│   ├── UnusedCommand.php
│   └── OrphansCommand.php
├── Mcp/
│   ├── FindUsagesTool.php
│   ├── ImplementersTool.php
│   ├── CallersTool.php
│   ├── DeadCodeTool.php
│   └── ImpactTool.php
├── Parser/
│   ├── PhpFileWalker.php           # uses nikic/php-parser
│   ├── YamlSpecWalker.php
│   └── ConfigWalker.php
├── Storage/
│   └── SqliteIndexStorage.php
├── Query/
│   ├── UsageQuery.php
│   └── ImpactQuery.php
└── composer.json
```

## Acceptance criteria

- [ ] Full index build of the framework's own monorepo completes in <5s
- [ ] Incremental rebuild after one-file change completes in <500ms
- [ ] `find-usages` returns accurate results for every `usage_kind` listed in the schema
- [ ] `framework__impact` correctly enumerates the test files that depend on a symbol — verified against the framework's own test suite
- [ ] `framework__dead_code` finds at least one true positive in a project with intentional dead code (test fixture)
- [ ] Spec-aware usages: `find-usages App\\User\\User` returns `spec_endpoint` / `spec_entity` rows for YAML specs that reference it
- [ ] Index survives a `bin/altair spec scaffold` operation (auto-rebuilds or invalidates affected files)
- [ ] `.altair/index.db` is gitignored by the skeleton
- [ ] Tests:
  - `PhpFileWalker`: golden tests for each usage_kind
  - `ImpactQuery`: known-state fixture project, assert aggregate counts
  - End-to-end: build index against framework monorepo, sample queries return expected results
  - Incremental: build, modify one file, rebuild, assert only that file's symbols re-indexed

## Out of scope

- Cross-language indexing (the emitted SDKs are owned by their language toolchain)
- Type inference beyond what PHPStan provides (defer to PHPStan for that)
- Documentation lookups (manifests own that)
- Full call-graph analysis (`callers-of` is shallow; deep transitive analysis is too expensive for the value)

## Dependencies

- **#17 (`univeros/cli`)** — required
- **#19 (`univeros/scaffold`)** — required for spec-aware usages
- **#69 (`univeros/mcp`)** — required for tool exposure
- **#68 (`bin/altair doctor`)** — recommended (doctor check: `index_stale` warns when the index is older than the source)

New composer deps:

- `nikic/php-parser: ^5.0` (already pulled in by #19)
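- No new Composer package for SQLite: uses PDO with the `pdo_sqlite` driver, which is bundled with PHP and enabled by default in standard builds (some distributions package it separately)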

## Why this matters

Refactoring is the activity that separates a real codebase from a demo. Without confidence in "what depends on this," agents either over-refactor (changing too much) or under-refactor (avoiding necessary changes). With a real usage index, agents can make precise, surgical changes — the way a senior engineer with IDE support does today.

