fix(clean): address review comments on empty clean support (#18337) #18587

Merged
yihua merged 5 commits into apache:master from yihua:address-empty-clean-review-comments
Apr 25, 2026

Contributor

@yihua yihua commented Apr 24, 2026

Describe the issue this Pull Request addresses

Addresses review comments on #18337 (empty clean support). Renames config constants, builder methods, and getter to follow consistent naming, fixes a typo in the config key, and improves ECTR (earliestCommitToRetain) handling when cleaner configuration changes.

Summary and Changelog

  • Renamed MAX_INTERVAL_TO_CREATE_EMPTY_CLEAN_HOURS to INTERVAL_TO_CREATE_EMPTY_CLEAN_HOURS
  • Fixed config key typo: hoodie.write.empty.clean.internval.hours -> hoodie.write.empty.clean.interval.hours
  • Renamed builder method to withIntervalToCreateEmptyCleanHours
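  • Renamed getter to getIntervalToCreateEmptyCleanHours (initially getIntervalHoursToCreateEmptyClean; renamed again in a follow-up commit to match the builder method)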
  • Renamed local variable maxDurationHours to emptyCleanIntervalHours
  • Changed ECTR backwards-check behavior: instead of skipping empty clean creation when ECTR would go backwards (due to config changes), the code now adjusts the ECTR to the previous value and proceeds with the empty clean. This ensures empty cleans are still created to avoid full table scans, while preventing ECTR regression.
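
The ECTR change in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code in CleanPlanActionExecutor: the class name `EctrAdjustmentSketch` and the method `resolveEctr` are hypothetical, and it assumes instants are fixed-width timestamp strings that sort lexicographically.

```java
import java.util.Optional;

// Hypothetical sketch of the "adjust instead of skip" behavior; not Hudi's actual code.
final class EctrAdjustmentSketch {

  /**
   * Picks the ECTR to record in an empty clean.
   * Assumption: instants are fixed-width timestamp strings, so String order equals time order.
   */
  static String resolveEctr(Optional<String> previousEctr, String candidateEctr) {
    if (previousEctr.isPresent() && candidateEctr.compareTo(previousEctr.get()) < 0) {
      // The candidate would move ECTR backwards (for example after a config change):
      // pin it to the previous value and still proceed with the empty clean.
      return previousEctr.get();
    }
    // No previous ECTR, or the candidate does not regress: use it as is.
    return candidateEctr;
  }

  public static void main(String[] args) {
    Optional<String> previous = Optional.of("20260424120000000");
    System.out.println(resolveEctr(previous, "20260422000000000")); // regressing candidate
    System.out.println(resolveEctr(previous, "20260425000000000")); // advancing candidate
    System.out.println(resolveEctr(Optional.empty(), "20260422000000000")); // no previous clean
  }
}
```

The point of the design is that the empty clean is always written, so the next clean planning run has a recent starting point, while the recorded ECTR never moves earlier than the last one.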

Impact

No public API changes. Config key spelling fix (internval -> interval) is safe since the original PR (#18337) has not been released yet.

Risk Level

none

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

yihua added 2 commits April 24, 2026 12:38
- Rename config constant to INTERVAL_TO_CREATE_EMPTY_CLEAN_HOURS
- Fix config key typo: internval -> interval
- Rename builder/getter/parameter for consistency
- Adjust ECTR to previous value instead of skipping empty clean when
  it would go backwards due to config changes

Rename getIntervalHoursToCreateEmptyClean to
getIntervalToCreateEmptyCleanHours to match the builder method
withIntervalToCreateEmptyCleanHours and codebase convention.
Contributor

@hudi-agent hudi-agent left a comment

🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the follow-up! This PR renames the empty-clean config constants/methods for consistency, fixes a config key typo (internval -> interval), and changes the ECTR-regression path to adjust rather than skip the empty clean. No correctness issues found. A few style/readability suggestions are in the inline comments, including one small nit on the validation error message; otherwise the renaming and typo fix look clean. Please take a look, and this should be ready for a Hudi committer or PMC member to take it from here.

cc @yihua

The error now states that -1 (disabled) is also a valid value,
avoiding confusion for users who see only "must be >= 1".
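
A validation along these lines might look like the sketch below. The class and method names are hypothetical and the exact message wording is an assumption; only the accepted values (-1 for disabled, or any value >= 1) and the config key come from this PR.

```java
// Hypothetical sketch of the interval validation; the message text is illustrative only.
final class EmptyCleanIntervalValidation {

  static final long DISABLED = -1L;

  /** Returns the value unchanged if it is -1 (disabled) or >= 1, otherwise throws. */
  static long validateIntervalHours(long hours) {
    if (hours == DISABLED || hours >= 1) {
      return hours;
    }
    // Mention both valid forms so users do not conclude that -1 is rejected.
    throw new IllegalArgumentException(
        "hoodie.write.empty.clean.interval.hours must be -1 (disabled) or >= 1, but was " + hours);
  }

  public static void main(String[] args) {
    System.out.println(validateIntervalHours(-1L));
    System.out.println(validateIntervalHours(24L));
    try {
      validateIntervalHours(0L);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```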
@github-actions github-actions Bot added the size:S PR with lines of changes in (10, 100] label Apr 24, 2026
yihua and others added 2 commits April 24, 2026 12:54
The test was going from 24h->12h retention (decreasing), which makes
ECTR move forward. The backwards check never triggered; the test
passed because the null ECTR short-circuited the empty clean path.

Fix: go from 12h->72h retention (increasing), which makes ECTR move
backwards and actually exercises the adjustment logic. Assert that
the second empty clean IS created with ECTR pinned to the previous
value.
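
To see why the direction of the retention change matters, here is a simplified model, assuming an hours-based retention where ECTR is roughly "now minus the retention window". The real cleaner derives ECTR from the timeline, so `EctrDirectionSketch` and its `ectr` helper are purely illustrative.

```java
import java.time.Duration;
import java.time.Instant;

// Simplified, hypothetical model: ECTR approximated as (now - retention window).
final class EctrDirectionSketch {

  static Instant ectr(Instant now, long retentionHours) {
    return now.minus(Duration.ofHours(retentionHours));
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2026-04-24T12:00:00Z");

    Instant ectr24 = ectr(now, 24);
    Instant ectr12 = ectr(now, 12);
    Instant ectr72 = ectr(now, 72);

    // Old test, 24h -> 12h: the window shrinks, so ECTR moves forward (no regression).
    System.out.println("24h -> 12h moves forward: " + ectr12.isAfter(ectr24));

    // Fixed test, 12h -> 72h: the window grows, so ECTR would move backwards,
    // which is the case the adjustment logic has to handle.
    System.out.println("12h -> 72h moves backwards: " + ectr72.isBefore(ectr12));
  }
}
```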
@hudi-bot
Collaborator

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@codecov-commenter

Codecov Report

❌ Patch coverage is 81.81818% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.89%. Comparing base (2059c11) to head (f455f68).
⚠️ Report is 1 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...java/org/apache/hudi/config/HoodieCleanConfig.java | 66.66% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18587      +/-   ##
============================================
- Coverage     68.89%   68.89%   -0.01%     
+ Complexity    28549    28536      -13     
============================================
  Files          2480     2480              
  Lines        136904   136904              
  Branches      16673    16676       +3     
============================================
- Hits          94324    94321       -3     
+ Misses        34994    34993       -1     
- Partials       7586     7590       +4     

| Flag | Coverage Δ |
|---|---|
| common-and-other-modules | 44.42% <36.36%> (-0.01%) ⬇️ |
| hadoop-mr-java-client | 44.74% <36.36%> (+<0.01%) ⬆️ |
| spark-client-hadoop-common | 48.52% <81.81%> (+<0.01%) ⬆️ |
| spark-java-tests | 49.48% <36.36%> (-0.01%) ⬇️ |
| spark-scala-tests | 45.28% <36.36%> (+<0.01%) ⬆️ |
| utilities | 37.98% <36.36%> (-0.02%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| ...java/org/apache/hudi/config/HoodieWriteConfig.java | 89.97% <100.00%> (ø) |
| ...di/table/action/clean/CleanPlanActionExecutor.java | 87.68% <100.00%> (+2.17%) ⬆️ |
| ...java/org/apache/hudi/config/HoodieCleanConfig.java | 92.92% <66.66%> (ø) |

... and 10 files with indirect coverage changes

@yihua yihua merged commit c1569db into apache:master Apr 25, 2026
105 of 109 checks passed
