
fix: Skip pre-compaction rollback metadata reads in getValidInstantTimestamps#18544

Open
yihua wants to merge 2 commits into apache:master from yihua:fix-skip-pre-compaction-rollbacks

Conversation

@yihua
Contributor

@yihua yihua commented Apr 22, 2026

Describe the issue this Pull Request addresses

This PR addresses a performance issue in building metadata table-based file system view and constructing metadata table log file reader.

Summary and Changelog

This PR optimizes getValidInstantTimestamps in HoodieTableMetadataUtil to skip reading rollback metadata for rollbacks older than the latest MDT compaction instant. After compaction, rolled-back log blocks have already been merged into base files, so pre-compaction rollback timestamps are no longer needed for log block filtering. Skipping them avoids sequential storage reads (GCS/S3) for old rollback instants. These reads are costly because they happen while the ConcurrentHashMap.computeIfAbsent lock is held during metadata reader opening, which blocks other threads and can throttle the Spark driver CPU, for example when a timeline server runs on the Spark driver with a metadata table-based file system view, since the FSV must be refreshed before the timeline server can serve requests. Without this fix, when there are 100+ rollbacks on the active timeline, metadata reading for file listing can take tens or even hundreds of seconds, causing severe performance degradation. With the fix, the expected latency is sub-second when only a few rollbacks must be read, and effectively zero when all rollbacks are filtered out.

  • HoodieTableMetadataUtil.java: Compute rollbackFilterThreshold as the later of the earliest valid instant and the latest MDT compaction time. Only read rollback metadata for rollbacks newer than this threshold.
  • TestHoodieTableMetadataUtil.java: Two new tests:
    • testGetValidInstantTimestampsSkipsPreCompactionRollbacks — verifies pre-compaction rollbacks are skipped and post-compaction rollbacks are still read
    • testGetValidInstantTimestampsReadsAllRollbacksWithNoCompaction — verifies fallback behavior when no MDT compaction exists (all rollbacks read)
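The threshold computation described above can be sketched in isolation as follows. This is a minimal, self-contained illustration, not the actual Hudi code: the class name, method name, and sentinel value are hypothetical stand-ins.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Optional;

// Minimal sketch (hypothetical names) of the threshold clamping: rollback
// metadata is read only for rollbacks newer than the later of the earliest
// valid instant and the latest MDT compaction time.
public class RollbackThresholdSketch {
  // Stand-in for the SOLO_COMMIT_TIMESTAMP sentinel used as a floor value.
  static final String SOLO_COMMIT_TIMESTAMP = "0000000000000";

  static String rollbackFilterThreshold(Collection<String> validInstantTimestamps,
                                        Optional<String> latestMdtCompactionTime) {
    String earliest = validInstantTimestamps.isEmpty()
        ? SOLO_COMMIT_TIMESTAMP
        : Collections.min(validInstantTimestamps);
    // No MDT compaction: fall back so that all rollbacks are still read,
    // preserving the original behavior.
    String compactionTime = latestMdtCompactionTime.orElse(SOLO_COMMIT_TIMESTAMP);
    // Instant times are zero-padded timestamps, so lexicographic order
    // matches chronological order.
    return compactionTime.compareTo(earliest) > 0 ? compactionTime : earliest;
  }
}
```

Only rollbacks whose timestamps sort after this threshold would then have their rollback metadata read from storage.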

Impact

Significantly improves MDT read performance with log files when there are many rollbacks on the data table timeline.

Risk Level

Low

Documentation Update

N/A

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

Contributor

@hudi-agent hudi-agent left a comment


🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the contribution! This PR avoids unnecessary reads of pre-MDT-compaction rollback metadata in getValidInstantTimestamps by clamping the rollback filter threshold to max(earliestInstantTime, latestMdtCompactionTime), since rolled-back log blocks before the latest MDT compaction are already merged into base files. The fallback to SOLO_COMMIT_TIMESTAMP correctly preserves original behavior when no MDT compaction exists. No correctness issues found. A few style/readability suggestions in the inline comments. Please take a look, and this should be ready for a Hudi committer or PMC member to take it from here. One truncated comment in the test — the rest of the change looks clean.

cc @yihua

@github-actions github-actions Bot added the size:M PR with lines of changes in (100, 300] label Apr 23, 2026
@yihua yihua force-pushed the fix-skip-pre-compaction-rollbacks branch from e32a618 to 67eac7a on May 9, 2026 02:16
@hudi-bot
Collaborator

hudi-bot commented May 9, 2026

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

Contributor

@hudi-agent hudi-agent left a comment


🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the contribution! This PR avoids unnecessary sequential reads of pre-MDT-compaction rollback metadata in getValidInstantTimestamps by clamping the rollback filter threshold to max(earliestInstantTime, latestMdtCompactionTime). The fallback to SOLO_COMMIT_TIMESTAMP when no MDT compaction exists correctly preserves the original behavior. One question on the inline comment about consistency with the existing getLatestCompactionTime() helper. Please take a look, and this should be ready for a Hudi committer or PMC member to take it from here. A couple of small readability issues in the new test — one truncated comment and one variable name that implies the wrong shape of data.

// reads for old rollback instants that can cause long latency during metadata table reading.
final String earliestInstantTime = validInstantTimestamps.isEmpty() ? SOLO_COMMIT_TIMESTAMP : Collections.min(validInstantTimestamps);
final String latestMdtCompactionTime = metadataMetaClient.getActiveTimeline()
.getCommitTimeline()
Contributor


🤖 The existing HoodieBackedTableMetadata.getLatestCompactionTime() (line 808) uses getCommitAndReplaceTimeline() which also includes REPLACE_COMMIT_ACTION / CLUSTERING_ACTION, while this new code uses only getCommitTimeline() (just COMMIT_ACTION). Was this intentional? It's safe today since MDT only emits compaction commits as COMMIT_ACTION, but the inconsistency is a small future-proofing risk if MDT ever gains clustering/replace semantics — and reusing/sharing the existing helper would also avoid the duplicated lookup logic.

- AI-generated; verify before applying. React 👍/👎 to flag quality.

Member

@voonhous voonhous May 15, 2026


Building on top of this: if anything else ever writes a COMMIT_ACTION to the MDT, this code would silently treat that timestamp as a "compaction." It's worth being defensive and filtering explicitly on the compaction action.

IIRC, COMMIT_ACTION writes to the MDT are exclusively generated by compaction, so this is safe for now. The only problem that may arise in the future is if the API contract changes and this becomes a regression.

As of now, I don't think this should be a blocker; I just want to highlight it.

Feel free to create an issue to track this, or clean it up now if it's worth the effort for new users to pick up.
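The defensive check suggested here could look roughly like the following. This is a hypothetical sketch: the map of instant time to operation type and the "compact" marker are simplified stand-ins for Hudi's HoodieInstant and commit metadata, not real API calls.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of filtering explicitly on the compaction operation
// rather than treating every COMMIT_ACTION on the MDT timeline as a
// compaction. Each entry maps an instant time to the operation type
// recorded in its commit metadata (a simplified stand-in model).
public class CompactionFilterSketch {
  static Optional<String> latestCompactionTime(Map<String, String> instantToOperationType) {
    return instantToOperationType.entrySet().stream()
        // Keep only instants explicitly marked as compactions; a future
        // writer emitting plain COMMIT_ACTIONs would no longer slip through.
        .filter(e -> "compact".equals(e.getValue()))
        .map(Map.Entry::getKey)
        .max(String::compareTo);
  }
}
```

With such a filter, a new non-compaction COMMIT_ACTION writer to the MDT would be ignored instead of silently shifting the rollback threshold.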

mdtTestTable.addDeltaCommit("20260101020101000");

metaClient = HoodieTableMetaClient.reload(metaClient);
mdtMetaClient = HoodieTableMetaClient.reload(mdtMetaClient);
Contributor


🤖 nit: the comment is cut off mid-sentence — "rolled-back commits appear" appears to be missing the end of the thought. Could you complete it, e.g. "rolled-back commits appear in the valid timestamps"?


assertFalse(validTimestamps.contains(commit3), "commit3 should NOT be in valid timestamps (pre-compaction rollback skipped)");
}

private void addCompletedRollback(HoodieTestTable testTable, String rollbackTime, String rolledBackCommit) throws Exception {
Contributor


🤖 nit: emptyPartitionFiles reads as though the map is empty, but it actually contains a partition1 entry. Something like partitionFiles or partitionToFiles might be less surprising.


@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 67.30%. Comparing base (47bf4e4) to head (67eac7a).

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18544      +/-   ##
============================================
- Coverage     68.14%   67.30%   -0.84%     
+ Complexity    29077    28665     -412     
============================================
  Files          2522     2522              
  Lines        141177   141185       +8     
  Branches      17514    17515       +1     
============================================
- Hits          96208    95028    -1180     
- Misses        37061    38148    +1087     
- Partials       7908     8009     +101     
Flag Coverage Δ
common-and-other-modules 44.42% <100.00%> (+<0.01%) ⬆️
hadoop-mr-java-client 45.01% <100.00%> (+<0.01%) ⬆️
spark-client-hadoop-common 21.36% <0.00%> (-26.99%) ⬇️
spark-java-tests 49.00% <100.00%> (+<0.01%) ⬆️
spark-scala-tests 44.91% <100.00%> (+<0.01%) ⬆️
utilities 37.65% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
.../apache/hudi/metadata/HoodieTableMetadataUtil.java 82.08% <100.00%> (-0.28%) ⬇️

... and 165 files with indirect coverage changes


@yihua yihua added this to the release-1.2.0 milestone May 13, 2026
Member

@voonhous voonhous left a comment


LGTM with a non-blocking concern.

Let's fix the CI errors before we merge this in.

