[HUDI-808] Support cleaning bootstrap source data #35

Open
zhedoubushishi wants to merge 1 commit into bvaradar:cr_bootstrap_core from zhedoubushishi:cr_bootstrap_core_5

@zhedoubushishi

What is the purpose of the pull request

This is an important requirement from a GDPR perspective. When deleting data from a metadata-only bootstrapped partition, users should be able to tell the cleaner to also remove the original data from the source location, because under the new bootstrapping mechanism the original data serves as the data of the original commit for Hudi.

Brief change log

Add an option named hoodie.cleaner.bootstrap.source.file. When it is set to true, the cleaner is allowed to delete the original source data. The default is false.
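As a rough illustration of how a boolean option like this is typically read, here is a minimal sketch using plain java.util.Properties. Only the key name comes from this PR description; the class BootstrapCleanerConfigSketch and the helper shouldCleanBootstrapSource are hypothetical and are not the actual Hudi config API.

```java
import java.util.Properties;

public class BootstrapCleanerConfigSketch {
    // Option name as stated in this PR description.
    static final String CLEAN_BOOTSTRAP_SOURCE_KEY = "hoodie.cleaner.bootstrap.source.file";

    // Hypothetical helper (not Hudi API): returns true only when the option
    // is explicitly set to "true"; an unset option falls back to "false".
    static boolean shouldCleanBootstrapSource(Properties props) {
        return Boolean.parseBoolean(props.getProperty(CLEAN_BOOTSTRAP_SOURCE_KEY, "false"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Option unset: the default applies and source files are kept.
        System.out.println(shouldCleanBootstrapSource(props));

        // Opt in to cleaning the bootstrap source data.
        props.setProperty(CLEAN_BOOTSTRAP_SOURCE_KEY, "true");
        System.out.println(shouldCleanBootstrapSource(props));
    }
}
```

Defaulting to false keeps the cleaner from touching the source location unless the user opts in, which matters because the source data may be shared with systems outside Hudi.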

Also added corresponding unit tests for this option in TestCleaner.java.

Verify this pull request

This change added tests and can be verified as follows:

  • Added tests in TestCleaner to verify the change.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
