Remove caching of raw_segments_with_ancestors to prevent memory leaks #2430
Open
The `raw_segments_with_ancestors` method cached its result in a `OnceCell` inside `NodeData`. Each cached `PathStep` stored an `Rc` clone of its ancestor segment, creating a reference cycle (node -> cache -> PathStep -> node) that prevented the entire syntax tree from being deallocated. In long-running processes like the LSP server, this caused unbounded memory growth on every lint operation.
Fix by removing the `OnceCell` cache and computing the result fresh on each call. The `PathStep`s are now temporary and properly dropped after use, breaking the cycle.
https://claude.ai/code/session_01GhEcspeHFuBffJnphrDhLJ
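The cycle described in this commit message can be reproduced in isolation. The following is a minimal sketch, not the project's code: `Node`, its `cache` field, and `parent_leaks` are hypothetical stand-ins, and `std::cell::OnceCell` is assumed here although the real `NodeData` may use a different `OnceCell` type. A `Weak` handle is used to observe whether the parent is actually freed.

```rust
use std::cell::OnceCell;
use std::rc::{Rc, Weak};

// Hypothetical stand-in for a tree node; not the real NodeData.
struct Node {
    children: Vec<Rc<Node>>,
    // Mirrors the removed cache: it stores strong handles back to the node
    // that owns the cache (node -> cache -> step -> node).
    cache: OnceCell<Vec<Rc<Node>>>,
}

/// Builds a parent with one child, optionally fills the cache, drops the
/// last external handle, and reports whether the parent is still alive.
fn parent_leaks(use_cache: bool) -> bool {
    let child = Rc::new(Node { children: vec![], cache: OnceCell::new() });
    let parent = Rc::new(Node { children: vec![child], cache: OnceCell::new() });

    // One "path step" per child, each holding a strong clone of the parent.
    let steps: Vec<Rc<Node>> = parent
        .children
        .iter()
        .map(|_| Rc::clone(&parent))
        .collect();

    if use_cache {
        // Storing the steps inside the parent closes the cycle.
        let _ = parent.cache.set(steps);
    } else {
        // Temporary steps are dropped here, so no cycle survives.
        drop(steps);
    }

    let observer: Weak<Node> = Rc::downgrade(&parent);
    drop(parent);
    // `upgrade` succeeds only if some strong reference still exists.
    observer.upgrade().is_some()
}
```

In this sketch `parent_leaks(true)` reports that the parent survives after its last external handle is gone, while `parent_leaks(false)` reports that it is freed.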
Benchmark for d750708
The test builds a small segment tree, calls `raw_segments_with_ancestors()`, and asserts that `Rc::strong_count` on the root is exactly 1 afterward. On the old code (with `OnceCell` caching), this test fails with strong_count=3: the two cached `PathStep`s each hold an `Rc` clone back to the parent, creating a reference cycle that prevents deallocation.
https://claude.ai/code/session_01GhEcspeHFuBffJnphrDhLJ
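The shape of such a test can be sketched as follows. This is an illustration under stated assumptions rather than the test added in the PR: `Segment`, `PathStep`, `WithAncestors`, `leaf`, and `root_counts` are simplified, hypothetical versions of the real types and helpers, and the traversal only approximates what `raw_segments_with_ancestors()` does.

```rust
use std::rc::Rc;

// Hypothetical, simplified segment: a node without children stands in for a
// raw segment.
struct Segment {
    children: Vec<Rc<Segment>>,
}

// Hypothetical path step holding a strong handle to an ancestor.
#[derive(Clone)]
struct PathStep {
    segment: Rc<Segment>,
}

type WithAncestors = Vec<(Rc<Segment>, Vec<PathStep>)>;

// Computes the result fresh on every call; nothing is stored on the tree.
fn raw_segments_with_ancestors(root: &Rc<Segment>) -> WithAncestors {
    fn walk(node: &Rc<Segment>, path: &mut Vec<PathStep>, out: &mut WithAncestors) {
        if node.children.is_empty() {
            out.push((Rc::clone(node), path.clone()));
            return;
        }
        for child in &node.children {
            path.push(PathStep { segment: Rc::clone(node) });
            walk(child, path, out);
            path.pop();
        }
    }
    let mut out = Vec::new();
    walk(root, &mut Vec::new(), &mut out);
    out
}

fn leaf() -> Rc<Segment> {
    Rc::new(Segment { children: vec![] })
}

/// Returns the root's strong count while the result is alive and after it
/// has been dropped.
fn root_counts() -> (usize, usize) {
    let root = Rc::new(Segment { children: vec![leaf(), leaf()] });
    let result = raw_segments_with_ancestors(&root);
    let during = Rc::strong_count(&root);
    drop(result);
    (during, Rc::strong_count(&root))
}
```

While the result is alive, the two path steps each hold a clone of the root, so the count is 3. Once the result is dropped the count returns to 1, which is the condition the test asserts. With a cache on the node, the result would never be dropped and the count would stay at 3.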
Benchmark for 3c95ad5
Summary
This PR removes the caching of `raw_segments_with_ancestors` from the `NodeData` struct and converts the method to compute the result on demand instead. This change prevents reference cycles that were causing unbounded memory growth in long-running processes like the LSP server.
Key Changes
- Removed the `OnceCell` field: deleted the `raw_segments_with_ancestors: OnceCell<Vec<(ErasedSegment, Vec<PathStep>)>>` field from the `NodeData` struct.
- Changed the `raw_segments_with_ancestors()` method to return `Vec<(ErasedSegment, Vec<PathStep>)>` instead of `&[(ErasedSegment, Vec<PathStep>)]` to support on-demand computation.
- Replaced `get_or_init()` calls with direct computation of the result.
- Updated `DepthMap::from_parent()` to store the computed vector before iterating over it.
- Updated `SegmentBuilder` and `ErasedSegment::clone()` to match.
Implementation Details
The issue was that `PathStep` stores an `Rc` clone of its ancestor segment, creating reference cycles when cached inside `NodeData` (which is itself behind `Rc`). These cycles prevented the entire syntax tree from being deallocated, causing memory to grow unboundedly in long-running processes.
By computing `raw_segments_with_ancestors` on demand rather than caching it, we avoid creating persistent reference cycles while maintaining the same functionality. The performance impact is acceptable since this method is not called in hot paths.
Added detailed comments explaining why caching is intentionally avoided to prevent future regressions.
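The change in return type, and its effect on callers, can be illustrated with a generic sketch. Everything here is hypothetical (`CachedNode`, `OnDemandNode`, `doubled`, and the `sum_*` callers are invented for illustration, with plain integers in place of segments), and `std::cell::OnceCell` is assumed. It only shows the pattern of moving from a cached, borrowed slice to an owned vector computed on each call.

```rust
use std::cell::OnceCell;

// Before: the result lives in a cache on the struct, so the method can hand
// out a borrowed slice.
struct CachedNode {
    values: Vec<u32>,
    cache: OnceCell<Vec<u32>>,
}

impl CachedNode {
    fn doubled(&self) -> &[u32] {
        self.cache
            .get_or_init(|| self.values.iter().map(|v| v * 2).collect())
    }
}

// After: nothing is stored, so the method returns an owned Vec that the
// caller drops when it is done.
struct OnDemandNode {
    values: Vec<u32>,
}

impl OnDemandNode {
    fn doubled(&self) -> Vec<u32> {
        self.values.iter().map(|v| v * 2).collect()
    }
}

fn sum_cached(node: &CachedNode) -> u32 {
    node.doubled().iter().sum()
}

fn sum_on_demand(node: &OnDemandNode) -> u32 {
    // The caller binds the computed vector first, then iterates over it.
    let doubled = node.doubled();
    doubled.iter().sum()
}
```

Both callers produce the same result. The on-demand version recomputes the vector on each call, which is the cost the PR description judges acceptable outside hot paths.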
https://claude.ai/code/session_01GhEcspeHFuBffJnphrDhLJ