load only required partitions for the table when partition filter is …#76

Open: vamsikarnika wants to merge 1 commit into master from hudi-scope-fsv-to-pruned-partitions

Description

Scope Hudi file system view loading to the set of pruned partitions instead of always loading every partition. When the Hudi metadata table is enabled, HudiSnapshotDirectoryLister previously called HoodieTableFileSystemView.loadAllPartitions() once per query, regardless of how many partitions actually matched the query predicate. For selective queries on tables with many partitions, that single call dominates split-generation time.

This change introduces an opt-in session property scope_fsv_to_pruned_partitions (and matching config hudi.metadata.scope-fsv-to-pruned-partitions, default false) that switches the call to loadPartitions(prunedPaths), where prunedPaths is the set of partitions that survived partition pruning in HudiBackgroundSplitLoader. Behavior is unchanged when the flag is off.

Mechanically:

  • HudiBackgroundSplitLoader.generateSplits now publishes the pruned relative partition paths to the directory lister immediately after getPartitionInfos(), before any worker thread is submitted.
  • HudiSnapshotDirectoryLister reads the session flag at construction and, if both the flag is enabled and pruned paths have been set when the lazy file system view initializes, calls loadPartitions(...) instead of loadAllPartitions(). If the paths have not been set for any reason, it falls back to the existing loadAllPartitions() behavior.
  • HudiDirectoryLister gains a default no-op setPrunedPartitionPaths(List<String>) method, so the split loader can publish the paths without an instanceof check and without forcing other implementations to implement the hint.
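The three steps above can be sketched in isolation as follows. This is a simplified illustration, not the code in this PR: DirectoryListerSketch, SnapshotListerSketch, and RecordingFileSystemView are hypothetical stand-ins for HudiDirectoryLister, HudiSnapshotDirectoryLister, and the Hudi file system view, and the recording view only notes which load method was called.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HudiDirectoryLister.
interface DirectoryListerSketch {
    // Default no-op: implementations that cannot use the hint ignore it,
    // so callers need no instanceof check.
    default void setPrunedPartitionPaths(List<String> prunedPaths) {}
}

// Hypothetical stand-in for the file system view; it only records the load call made.
final class RecordingFileSystemView {
    final List<String> calls = new ArrayList<>();

    void loadAllPartitions() {
        calls.add("loadAllPartitions");
    }

    void loadPartitions(List<String> partitionPaths) {
        calls.add("loadPartitions" + partitionPaths);
    }
}

// Hypothetical stand-in for HudiSnapshotDirectoryLister.
final class SnapshotListerSketch implements DirectoryListerSketch {
    private final boolean scopeToPrunedPartitions; // read once at construction
    private final RecordingFileSystemView view;
    private volatile List<String> prunedPaths;     // null until the split loader publishes
    private boolean loaded;

    SnapshotListerSketch(boolean scopeToPrunedPartitions, RecordingFileSystemView view) {
        this.scopeToPrunedPartitions = scopeToPrunedPartitions;
        this.view = view;
    }

    @Override
    public void setPrunedPartitionPaths(List<String> prunedPaths) {
        this.prunedPaths = List.copyOf(prunedPaths);
    }

    // Lazy initialization: the load call is chosen on first access and runs once.
    synchronized RecordingFileSystemView fileSystemView() {
        if (!loaded) {
            List<String> paths = prunedPaths;
            if (scopeToPrunedPartitions && paths != null) {
                view.loadPartitions(paths);
            } else {
                // Flag off, or paths never published: keep the existing behavior.
                view.loadAllPartitions();
            }
            loaded = true;
        }
        return view;
    }
}
```

In this sketch, the ordering requirement from the first bullet shows up as a plain happens-before constraint: setPrunedPartitionPaths must be called before the first fileSystemView() access, otherwise the lister takes the loadAllPartitions() fallback.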

The "Created file system view of table" log line now includes the pruned partition count when scoped loading is in effect, so the new path is easy to identify in the coordinator logs.

Additional context and related issues

In benchmarks against a Hudi 1.0 table with 365 partitions where the query predicate matched only 31 partitions, loadAllPartitions() accounted for roughly 85–90% of split-generation time (~5s on a 500k-row table, ~9s on a 1M-row table). The cost scales with total partition count and total file count in the metadata table files partition, not with the size of the pruned set, because the call eagerly materializes every partition's file slices through the metadata-backed file system view.

HoodieTableFileSystemView.loadPartitions(List<String>) is part of the public TableFileSystemView interface and is implemented by both AbstractTableFileSystemView and RemoteHoodieTableFileSystemView, so this change is portable across the file system view backends Hudi exposes.

The flag is intentionally off by default so existing deployments see no change. Once it has been validated in production-like workloads it can be promoted to the default. The non-metadata-table path (hudi.metadata-enabled=false) is untouched.

Other follow-ups that came out of the same investigation but are not part of this PR:

  • Caching the file system view across queries keyed by (table, latestCommitTime) so repeat queries skip FSV creation entirely.
  • Replacing HoodieLocalEngineContext with a parallel engine context so Hudi's internal engineContext.parallelize(...) calls in loadPartitions can actually parallelize HFile reads from the metadata table.
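For the first follow-up, one possible shape of a cache keyed by (table, latestCommitTime) is sketched below. Nothing here is implemented or proposed in this PR: ViewCacheKey and FileSystemViewCache are hypothetical names, the cached value type is left generic, and eviction is omitted. The sketch uses a record, so it assumes Java 16 or later.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical cache key: table identifier plus the latest commit time on its timeline.
record ViewCacheKey(String tableName, String latestCommitTime) {}

// Hypothetical cache; V stands in for whatever file system view type is cached.
final class FileSystemViewCache<V> {
    private final Map<ViewCacheKey, V> views = new ConcurrentHashMap<>();

    // Returns the cached view for this (table, commit) pair, building it at most once.
    // A new commit produces a new key, so views for older commits are never reused.
    V get(String tableName, String latestCommitTime, Supplier<V> loader) {
        return views.computeIfAbsent(
                new ViewCacheKey(tableName, latestCommitTime),
                key -> loader.get());
    }

    int size() {
        return views.size();
    }
}
```

A real implementation would also need bounded size or expiry, and would have to account for scoped loading: a view loaded with only one query's pruned partitions is not a complete view for a later query that touches other partitions.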

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Hudi connector
* Add `hudi.metadata.scope-fsv-to-pruned-partitions` configuration property and matching `scope_fsv_to_pruned_partitions` session property to load only pruned partitions into the Hudi file system view during split generation. Disabled by default. When enabled, this can substantially reduce split-generation latency for selective queries on tables with many partitions.
