feat(refs): Tree-sitter identifier-aware symbol references#49
feat(refs): Tree-sitter identifier-aware symbol references#49PatrickSys merged 2 commits intomasterfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2aa0831510
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Greptile SummaryThis PR enhances Major changes:
Code quality:
Confidence Score: 5/5
|
| Filename | Overview |
|---|---|
| src/core/symbol-references.ts | Refactored to use Tree-sitter for identifier matching when available, with proper fallback to regex. Clean implementation with good error handling. |
| src/utils/tree-sitter.ts | Added findIdentifierOccurrences function with proper AST traversal and comment/string filtering. Includes proper resource cleanup and error handling. |
| tests/get-symbol-references.test.ts | Added comprehensive test validating that comments and strings are excluded from symbol references when Tree-sitter is available. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
Start([findSymbolReferences called]) --> LoadIndex[Load keyword index]
LoadIndex --> Prefilter[Prefilter chunks with regex]
Prefilter --> GroupFiles[Group chunks by file path]
GroupFiles --> IterateFiles{For each file}
IterateFiles --> CheckFile{File exists<br/>on disk?}
CheckFile -->|No| FallbackRegex[Use chunk-regex fallback]
CheckFile -->|Yes| ReadFile[Read file content]
ReadFile --> DetectLang[Detect language]
DetectLang --> CheckTreeSitter{Tree-sitter<br/>supported?}
CheckTreeSitter -->|No| FallbackRegex
CheckTreeSitter -->|Yes| ParseAST[Parse with Tree-sitter]
ParseAST --> CheckError{Parse<br/>error?}
CheckError -->|Yes| FallbackRegex
CheckError -->|No| TraverseAST[Traverse AST for identifiers]
TraverseAST --> FilterContext[Filter out comments/strings]
FilterContext --> Dedupe[Deduplicate occurrences]
Dedupe --> CountMatches[Add to usageCount]
CountMatches --> NextFile[Continue to next file]
FallbackRegex --> RegexScan[Regex scan in chunks]
RegexScan --> CountRegex[Add matches to usageCount]
CountRegex --> NextFile
NextFile --> IterateFiles
IterateFiles -->|Done| Return[Return results with usages]
Return --> End([End])
Last reviewed commit: 2aa0831
tests/get-symbol-references.test.ts
Outdated
| ); | ||
|
|
||
| const { server } = await import('../src/index.js'); | ||
| const handler = (server as any)._requestHandlers.get('tools/call'); |
There was a problem hiding this comment.
Violates AGENTS.md guideline: "Avoid using any Type AT ALL COSTS." Consider using a typed interface or unknown with proper type guards to access the handler.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
|
Addressed review feedback:\n- Security: avoid reading index-provided paths outside CODEBASE_ROOT. We now only read files when the resolved path stays under root; otherwise we fall back to chunk-regex. (commit 1735e3c)\n- Tests: removed newly-introduced �s any access by adding a typed tools/call handler helper. (commit 1735e3c) |
Improve refs --symbol precision by using Tree-sitter identifier occurrences when the file exists and a curated grammar is available; fall back to legacy chunk-regex otherwise. Adds coverage for excluding comment/string-only matches. No docs changes.