Eliminate redundant pack file decompression during clone operations #16

Copilot · 2025-11-06T18:20:28Z

Profiling showed 99% of clone time spent in GeneratePackIndexV2, with 93% in FindCompressedLength. The code was decompressing the entire pack file twice: once to generate the index, then again to extract objects.

Changes

New direct parsing path

Added PackFile.ParseAllObjectsDirectly() that parses objects in a single pass without generating an index
Handles regular objects and deltas (OfsDelta, RefDelta) with correct dependency resolution
Maintains O(1) lookups for RefDelta base object resolution using dual dictionaries (by-offset and by-SHA1)
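
As a rough illustration of the dual-dictionary idea above, the sketch below resolves deltas against bases that are indexed both by pack offset (for OfsDelta) and by SHA1 (for RefDelta). It is not the code from this PR: `RawEntry`, `DeltaResolutionSketch`, `ResolveAll`, and the injected `applyDelta` and `computeSha1` functions are hypothetical names, and object types are left out for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified model of one pack entry after decompression.
// BaseOffset is set for an OfsDelta, BaseSha1 for a RefDelta, neither for a regular object.
public sealed record RawEntry(long Offset, long? BaseOffset, string? BaseSha1, byte[] Data);

public static class DeltaResolutionSketch
{
    // Returns all resolved objects keyed by SHA1.
    // applyDelta(baseData, deltaData) and computeSha1(data) are supplied by the caller (assumptions).
    public static Dictionary<string, byte[]> ResolveAll(
        IReadOnlyList<RawEntry> entries,
        Func<byte[], byte[], byte[]> applyDelta,
        Func<byte[], string> computeSha1)
    {
        var byOffset = new Dictionary<long, byte[]>();   // O(1) base lookup for OfsDelta
        var bySha1 = new Dictionary<string, byte[]>();   // O(1) base lookup for RefDelta
        var pending = new List<RawEntry>(entries);

        while (pending.Count > 0)
        {
            var stillPending = new List<RawEntry>();

            foreach (var entry in pending)
            {
                byte[]? resolved = null;

                if (entry.BaseOffset is null && entry.BaseSha1 is null)
                    resolved = entry.Data;
                else if (entry.BaseOffset is long baseOffset
                         && byOffset.TryGetValue(baseOffset, out var baseByOffset))
                    resolved = applyDelta(baseByOffset, entry.Data);
                else if (entry.BaseSha1 is string baseSha1
                         && bySha1.TryGetValue(baseSha1, out var baseBySha1))
                    resolved = applyDelta(baseBySha1, entry.Data);

                if (resolved is null)
                {
                    // Base not available yet: retry in the next round.
                    stillPending.Add(entry);
                    continue;
                }

                byOffset[entry.Offset] = resolved;
                bySha1[computeSha1(resolved)] = resolved;
            }

            // No progress in a full round means a base is missing from the pack.
            if (stillPending.Count == pending.Count)
                throw new InvalidOperationException("Unresolvable delta base.");

            pending = stillPending;
        }

        return bySha1;
    }
}
```

Deferring entries whose base is not yet known keeps the sketch independent of entry order; a real implementation would also need to carry the object type from the base to the resolved delta.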

Updated clone operations

ParsePackFileAndGetCommit() now calls ParseAllObjectsDirectly() instead of GeneratePackIndexV2() → ParsePackIndexV2() → ParseAllObjects()
FetchBloblessCloneAsync() uses direct parsing
Blob fetching in LoadSubdirectoryContentsWithBloblessCloneAsync() uses direct parsing

Optimization details

Reduced allocations in SHA1 calculation using stackalloc for object headers
GeneratePackIndexV2() remains unchanged and available when actual .idx/.rev files are needed
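
The stackalloc point above could look roughly like the following sketch, which hashes a Git object as `"<type> <length>\0"` followed by the content while building the header in a stack buffer. `GitObjectHash` and `ComputeObjectSha1` are hypothetical names used for illustration only; the actual helper in this PR may differ.

```csharp
using System;
using System.Buffers.Text;
using System.Security.Cryptography;
using System.Text;

public static class GitObjectHash
{
    // Hypothetical helper: computes the SHA1 of a Git object without
    // allocating a heap array for the "<type> <length>\0" header.
    public static byte[] ComputeObjectSha1(string objectType, ReadOnlySpan<byte> content)
    {
        // Assumes a short ASCII type name ("blob", "tree", "commit", "tag");
        // 32 bytes is enough for the name, a space, up to 10 digits, and the NUL.
        Span<byte> header = stackalloc byte[32];

        int written = Encoding.ASCII.GetBytes(objectType, header);
        header[written++] = (byte)' ';

        if (!Utf8Formatter.TryFormat(content.Length, header[written..], out int digits))
            throw new InvalidOperationException("Header buffer too small.");

        written += digits;
        header[written++] = 0;

        // Feed header and content to the hash incrementally, so they are never concatenated.
        using var hasher = IncrementalHash.CreateHash(HashAlgorithmName.SHA1);
        hasher.AppendData(header[..written]);
        hasher.AppendData(content);
        return hasher.GetHashAndReset();
    }
}
```

For an empty blob, `Convert.ToHexString(GitObjectHash.ComputeObjectSha1("blob", ReadOnlySpan<byte>.Empty))` should give `E69DE29BB2D1D6434B8B29AE775AD8C2E48C5391`, the well-known empty-blob hash.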

Impact

Before:

var indexResult = PackIndex.GeneratePackIndexV2(packFileData);  // Pass 1: decompress all objects
var indexEntries = PackIndex.ParsePackIndexV2(indexResult.IndexData);
var objects = PackFile.ParseAllObjects(packFileData, indexEntries);  // Pass 2: decompress all objects again

After:

var objects = PackFile.ParseAllObjectsDirectly(packFileData);  // Single pass: decompress once

Expected performance: ~50% reduction in pack file parsing time for clone operations.

Original prompt

We have a performance problem to fix:
When cloning from a large repo, the profiler says we spent 99 percent of the time, more than one minute, in GeneratePackIndexV2.
Also, about 93 percent of that time was spent in FindCompressedLength.

Do we even need GeneratePackIndexV2 on clone? Shouldn't we be able to just parse all objects out of the pack more directly? Relatedly, can we consolidate duplicate code in cloning methods that retrieves objects to add to the repo?

Can we improve efficiency on the index derivation, for the cases where it's still needed?



Viir · 2025-11-06T18:47:07Z

@copilot Describe this image

Initial plan

b1f0b9d

Copilot AI assigned Copilot and Viir Nov 6, 2025

Copilot started work on behalf of Viir November 6, 2025 18:22 View session

Copilot AI and others added 3 commits November 6, 2025 18:28

Add ParseAllObjectsDirectly method and eliminate redundant index gene…

1d9e6ff

…ration Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>

Add test to verify ParseAllObjectsDirectly produces same results as i…

08da19d

…ndex-based parsing Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>

Address code review feedback: optimize RefDelta lookup and reduce all…

0a24dd3

…ocations Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix performance issues in cloning from large repositories~~ Eliminate redundant pack file decompression during clone operations Nov 6, 2025

Copilot AI requested a review from Viir November 6, 2025 18:36

Copilot finished work on behalf of Viir November 6, 2025 18:36

Copilot finished work on behalf of Viir November 6, 2025 18:52

Viir closed this Nov 6, 2025

Viir deleted the copilot/improve-clone-performance branch November 6, 2025 19:36
