[ENG-27964] all blocking code change for hudi 1.1#13
Closed
Davis-Zhang-Onehouse wants to merge 3226 commits into
…e#12781) Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
1. fix generating file id with wrong bucket index Signed-off-by: TheR1sing3un <chaoyang@apache.org>
…g conditional writes (apache#12924)
…ier (apache#12695) Co-authored-by: Vova Kolmakov <kolmakov.vladimir@huawei.com> Co-authored-by: Vova Kolmakov <wombatukun@apache.org>
…ng bloom filter (apache#12919) * [HUDI-8768] Support bloom filter options when creating expr index using bloom filter * add index options validation in test * Refactoring and address more comments improve test * fix checkstyle * Update hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/feature/index/TestExpressionIndex.scala --------- Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
…pRecordBuffer (apache#12925) * feat: pass compaction/merge related props to HoodieBaseFileGroupRecordBuffer 1. pass compaction/merge related props to HoodieBaseFileGroupRecordBuffer Signed-off-by: TheR1sing3un <chaoyang@apache.org> * fix: resolve multiple precombine-related configuration conflicts 1. resolve multiple precombine-related configuration conflicts 2. assume that precombine is based on table config Signed-off-by: TheR1sing3un <chaoyang@apache.org> * style: simplify lambda expression 1. simplify lambda expression Signed-off-by: TheR1sing3un <chaoyang@apache.org> * fix: fix the `record_key` and `_hoodie_record_key` are not mapped when the record is created in SparkDatasetTestUtils 1. fix the `record_key` and `_hoodie_record_key` are not mapped when the record is created in SparkDatasetTestUtils Signed-off-by: TheR1sing3un <chaoyang@apache.org> * rerun * feat: Remove the default value for PAYLOAD_ORDERING_FIELD_PROP_KEY to avoid taking the default value from props as a valid configuration 1. Remove the default value for PAYLOAD_ORDERING_FIELD_PROP_KEY to avoid taking the default value from props as a valid configuration. Optimize the ordering-field lookup logic and remove the default value configuration. Signed-off-by: TheR1sing3un <chaoyang@apache.org> * Update hudi-common/src/main/java/org/apache/hudi/common/util/collection/ExternalSpillableMap.java --------- Signed-off-by: TheR1sing3un <chaoyang@apache.org> Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
…ns (apache#12929) * Avoid empty string rowkey to avoid failure of SimpleKeyGenerator initialization * Address comments * Address comments * Address comments * Add test for update * Fix issues --------- Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
… schema (apache#12949) 1. Introduce JVM level caching for avro schema to reduce the cost of schema comparison. NOTE: Use cache to cache references to the schema on key links where the schema may be created repeatedly. This ensures that only one variable instance of the same Schema will be used during a JVM lifetime, thus reducing the overhead of schema comparison on important io paths. For most of the cases, we only need to compare whether it is the same reference, there is no need to call the `Schema::equals` method. 2. Cache the frequently reused Schema on the IO code path. --------- Signed-off-by: TheR1sing3un <chaoyang@apache.org>
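The caching idea in this commit message can be illustrated with a small interner. This is a minimal sketch, not the code from apache#12949: the class name `InstanceCache` and its `intern` method are hypothetical, and it works on any type with a sound `equals`/`hashCode` rather than on Avro's `Schema` specifically. It also holds strong references, which a real implementation may well avoid.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical interner for illustration only; not a Hudi class. */
final class InstanceCache<T> {
    // Maps each distinct value to the first instance seen for it.
    private final ConcurrentMap<T, T> cache = new ConcurrentHashMap<>();

    /** Returns the canonical instance that is equal to {@code value}. */
    T intern(T value) {
        // equals/hashCode are paid once here, at interning time.
        T existing = cache.putIfAbsent(value, value);
        return existing != null ? existing : value;
    }
}
```

Once two values have gone through `intern`, callers on a hot path can compare them with `==` instead of a deep `equals` call, which is the saving the commit message describes.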
* feat: introduce schema pruning for delete record NOTE: For the record we need to delete, we only need to read the `hoodie_meta_fields`, `record_keys` and the columns involved in the delete condition from the table, which can greatly reduce the amount of read data when deleting. --------- Signed-off-by: TheR1sing3un <chaoyang@apache.org>
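The pruning rule in the note above amounts to a column-selection step. The following is a rough sketch under stated assumptions, not the code from this commit: the class `DeleteSchemaPruning`, the method `requiredColumns`, and the use of plain column-name lists in place of real schema objects are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration only; not a Hudi class. */
final class DeleteSchemaPruning {
    /**
     * Keeps only the table columns needed to delete records: the meta
     * fields, the record key fields, and the columns referenced by the
     * delete condition. Table column order is preserved.
     */
    static List<String> requiredColumns(List<String> tableColumns,
                                        List<String> metaFields,
                                        List<String> recordKeys,
                                        List<String> conditionColumns) {
        Set<String> needed = new LinkedHashSet<>();
        needed.addAll(metaFields);
        needed.addAll(recordKeys);
        needed.addAll(conditionColumns);

        List<String> pruned = new ArrayList<>();
        for (String column : tableColumns) {
            if (needed.contains(column)) {
                pruned.add(column);
            }
        }
        return pruned;
    }
}
```

For a wide table, every column outside those three groups is skipped at read time, which is where the reduction in read volume comes from.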
…ileFormat constructor (apache#12981) * don't use tablestate for filegroup reader * revert change for multiple base format --------- Co-authored-by: Jonathan Vexler <=>
…pache#12668) Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
* introduce key filters to FG reader for full key/prefix key look up; * replace MDT reader path with FG reader. --------- Co-authored-by: danny0405 <yuzhao.cyz@gmail.com>
* Optimizing metadata getter for metadata table * Minor code cleanup --------- Co-authored-by: vinoth chandar <vinoth@apache.org>
…pty file…" (apache#13379) This reverts commit 2206326.
…es (apache#13347) * fix conflict handling for compaction given completion time changes * consolidate tests * split handling into two methods for ease of reading and debugging * extract common parts of the code
…link FileGroup reader (apache#13378)
Co-authored-by: Y Ethan Guo <ethan.guoyihua@gmail.com>
…table and Enabling Non Blocking Concurrency Control with Metadata (apache#13292) - Adding write config to support streaming writes to metadata table. Config is named "hoodie.metadata.streaming.write.enabled". - Enabling Non Blocking Concurrency Control with Metadata when streaming writes are enabled
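As a sketch of how the config named above might be set, the snippet below puts the key into a plain `java.util.Properties` object. The key string comes from the commit message; the value `"true"` (assuming a boolean flag), the class `StreamingMetadataWriteConfig`, and the way the properties would be passed to a writer are assumptions for illustration.

```java
import java.util.Properties;

/** Hypothetical helper for illustration only; not a Hudi class. */
final class StreamingMetadataWriteConfig {
    // Key as named in the commit message for apache#13292.
    static final String STREAMING_WRITE_ENABLED =
        "hoodie.metadata.streaming.write.enabled";

    /** Builds properties with streaming metadata writes switched on. */
    static Properties enabled() {
        Properties props = new Properties();
        // Assumption: the option is a boolean flag set with "true".
        props.setProperty(STREAMING_WRITE_ENABLED, "true");
        return props;
    }
}
```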
… Java and Spark engines (apache#13361)
apache#13387) * [MINOR] Renaming TransactionManager methods to begin/end x StateChange - begin/end Transaction is confusing. - Naming aligns with how these methods are called, whenever action state changes * Log message cleanup
…tween RowData and Avro Record (apache#13390)
…3305) * add isMetadataTable flag in WriteStatus; * fixing WriteStats to accommodate metadata table as well. --------- Co-authored-by: sivabalan <n.siva.b@gmail.com> Co-authored-by: danny0405 <yuzhao.cyz@gmail.com>
…rt (apache#13360) * refactor: Unify all the code paths of bulk insert operations --------- Signed-off-by: TheR1sing3un <chaoyang@apache.org>
…sion based on pom (apache#13396)
473fe1b to a7ca043
Change Logs
Describe context and summary for this change. Highlight if any code was copied.
Impact
Describe any public API or user-facing feature change or any performance impact.
Risk level: none | low | medium | high
Choose one. If medium or high, explain what verification was done to mitigate the risks.
Contributor's checklist