feat: Add hudi-azure-bundle #18472

Open
linliu-code wants to merge 2 commits into apache:master from linliu-code:add_azure_bundle

Collaborator

@linliu-code linliu-code commented Apr 6, 2026

Describe the issue this Pull Request addresses

#18471

When running Hudi Spark jobs on Azure (ADLS Gen2), the Azure Storage SDK's Netty and Reactor dependencies conflict with Spark's bundled Netty, causing runtime NoSuchMethodError and StacklessClosedChannelException during lock acquisition. Specifically, reactor-netty-http calls HttpClientCodec.<init>(HttpDecoderConfig, boolean, boolean) — a constructor that only exists in Netty 4.1.94+ — but Spark's older Netty HttpClientCodec is loaded instead. This makes the Azure-based StorageBasedLockProvider (added in #17951) unusable in Spark environments.

Additionally, there is no pre-built bundle for Azure dependencies analogous to hudi-aws-bundle and hudi-gcp-bundle, forcing users to manually manage Azure SDK, Reactor, and Netty jars on the classpath.

Summary and Changelog

  • Extends the hudi-azure-bundle module (packaging/hudi-azure-bundle) into a shaded fat jar that packages all Azure-specific dependencies (Azure SDK, Azure identity deps, Reactor + reactor-netty, Netty, Reactive Streams) into a single self-contained artifact, following the same pattern as hudi-aws-bundle and hudi-gcp-bundle.
  • Added shading relocations for Netty/Reactor isolation: io.netty.*, io.projectreactor.*, reactor.*, and org.reactivestreams.* are relocated under org.apache.hudi.* to eliminate classpath conflicts with Spark's bundled Netty.
  • Added Azure identity dependencies (com.nimbusds:*, net.minidev:*) so DefaultAzureCredential and related auth providers work out-of-the-box.
  • Added hbase-webapps/** filter exclude, src/test/resources resource directory, and an Avro compile dependency.
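
A minimal sketch of how the relocations in the second bullet might look in the maven-shade-plugin configuration of packaging/hudi-azure-bundle/pom.xml. The four source patterns come from the description above; the exact shaded prefixes (`org.apache.hudi.` followed by the original package) are an assumption, not copied from this PR's pom.

```xml
<!-- Sketch only: the shaded prefixes below are assumed, not taken from the PR's pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Keep the bundle's Netty separate from the Netty that Spark ships. -->
          <relocation>
            <pattern>io.netty.</pattern>
            <shadedPattern>org.apache.hudi.io.netty.</shadedPattern>
          </relocation>
          <relocation>
            <pattern>io.projectreactor.</pattern>
            <shadedPattern>org.apache.hudi.io.projectreactor.</shadedPattern>
          </relocation>
          <relocation>
            <pattern>reactor.</pattern>
            <shadedPattern>org.apache.hudi.reactor.</shadedPattern>
          </relocation>
          <relocation>
            <pattern>org.reactivestreams.</pattern>
            <shadedPattern>org.apache.hudi.org.reactivestreams.</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With relocations of this shape, the shade plugin rewrites both the bundled classes and the bytecode references to them, so reactor-netty-http inside the bundle should resolve HttpClientCodec under the relocated package rather than the older class on Spark's classpath.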

Note: This PR was rebased onto current master. The original AzureStorageLockClient commits were dropped because master already has an implementation of that class via #17951; this PR now contains only the bundle module additions on top of master's existing hudi-azure-bundle skeleton.

Impact

  • Enables Hudi's StorageBasedLockProvider to work reliably on Azure/ADLS Gen2 in Spark environments.
  • Eliminates the need to manually place reactor-netty, reactor-core, reactive-streams, and netty-resolver-dns jars on the Spark classpath — everything is self-contained in the bundle.
  • No impact on existing AWS or GCP bundles — reactor-netty is only included in the Azure bundle.
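
For builds that pull the bundle in through Maven, the dependency might be declared roughly as follows. The `org.apache.hudi` groupId matches the other Hudi artifacts; the `hudi.version` property is a placeholder, and this assumes a release that actually publishes the bundle.

```xml
<!-- Sketch only: hudi.version is a placeholder property, not a value from this PR. -->
<dependency>
  <groupId>org.apache.hudi</groupId>
  <artifactId>hudi-azure-bundle</artifactId>
  <version>${hudi.version}</version>
</dependency>
```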

Risk Level

Low

  • This PR is purely additive — it only changes packaging/hudi-azure-bundle/pom.xml (the module skeleton master already added) to include and relocate the extra dependencies.
  • Shading Netty with relocation is a well-established pattern (used by HBase, gRPC, Snowflake, DataHub in this same codebase).
  • One area to monitor: Netty native transports (epoll/kqueue) reference class names via JNI — however, Azure SDK's HTTP client uses reactor-netty which works with NIO transport, so relocated Netty classes function correctly.

Documentation Update

none

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

@linliu-code linliu-code changed the title feat: Add azure bundle feat: Add hudi-azure-bundle Apr 6, 2026
@linliu-code linliu-code marked this pull request as ready for review April 6, 2026 17:21
@github-actions github-actions Bot added the size:L PR with lines of changes in (300, 1000] label Apr 6, 2026
Contributor

@yihua yihua left a comment

🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Style & Readability Review — One code reuse issue: URI parsing and container validation logic is duplicated between readObject() and writeObject() methods.

```java
    } else {
      logger.error("Error reading JSON config file: {}", filePath, e);
    }
    return Option.empty();
```
Contributor

🤖 nit: this URI parsing and container validation (lines 288–297) is duplicated from readObject(). Could you extract into a private helper method?

Contributor

@yihua yihua left a comment

🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for contributing! The overall structure of AzureStorageLockClient is clean and follows the S3/GCS pattern well. There's one functional bug worth addressing in the ETag handling before merging: the quote-stripping in readCurrentLockFile creates an inconsistency with the ETag format expected by BlobRequestConditions.setIfMatch(), which will break the "take over expired lock" scenario. Details in the inline comment.

```java
String eTag = response.getHeaders().getValue("ETag");
if (eTag != null) {
  // Azure returns ETags wrapped in quotes, remove them
  eTag = eTag.replaceAll("^\"|\"$", "");
```
Contributor

🤖 This quote-stripping creates an ETag format inconsistency that breaks conditional writes for the expired-lock takeover path. BlockBlobItem.getETag() (used in createOrUpdateLockFileInternal) returns the ETag WITH surrounding double-quotes (e.g. "0x8D4A"), which is exactly what BlobRequestConditions.setIfMatch() expects — it passes the value directly to the If-Match header, so the quotes must be present for a valid HTTP conditional request. By stripping them here, any setIfMatch call using an ETag sourced from readCurrentLockFile sends If-Match: 0x8D4A (unquoted), while a call using an ETag from a write sends If-Match: "0x8D4A" (correctly quoted). Removing the replaceAll and keeping the raw header value would align both paths and avoid this failure mode.

@github-actions github-actions Bot added size:S PR with lines of changes in (10, 100] and removed size:L PR with lines of changes in (300, 1000] labels May 15, 2026
Contributor

@hudi-agent hudi-agent left a comment

🤖 This review was generated by an AI agent and may contain mistakes. Please verify any suggestions before applying.

Thanks for the contribution! This PR turns hudi-azure-bundle into a proper shaded fat jar with Netty/Reactor relocation, mirroring the AWS and GCP bundles. There is one question on a build-config divergence from the sibling bundles that is worth a second look, plus a couple of spots where a short explanatory comment would save future readers some head-scratching. Please take a look at the inline comments; after that, this should be ready for a Hudi committer or PMC member to take it from here.

```xml
<resource>
  <directory>src/main/resources</directory>
</resource>
<resource>
```
Contributor

🤖 Is there a reason to add src/test/resources to the main <resources> block? Maven's standard layout already exposes it for the test phase, and adding it here will package any test resources into the production shaded jar. The sibling hudi-aws-bundle and hudi-gcp-bundle only declare src/main/resources. Could you drop this entry unless something specifically requires it?

- AI-generated; verify before applying. React 👍/👎 to flag quality.
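
As a rough sketch of the suggestion above, the `<resources>` block would shrink to the single standard entry, assuming nothing in the bundle actually needs files from src/test/resources at runtime (that assumption is not confirmed anywhere in this thread).

```xml
<!-- Sketch only: assumes no runtime file in the bundle lives under src/test/resources. -->
<build>
  <resources>
    <!-- Main resources only; Maven exposes src/test/resources to the test phase on its own. -->
    <resource>
      <directory>src/main/resources</directory>
    </resource>
  </resources>
</build>
```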

```xml
<exclude>META-INF/*.RSA</exclude>
<exclude>META-INF/services/javax.*</exclude>
<exclude>**/*.proto</exclude>
<exclude>hbase-webapps/**</exclude>
```
Contributor

🤖 nit: it might be worth adding a brief inline comment here noting which transitive dependency pulls in hbase-webapps/ — without it, a future reader will wonder why HBase UI resources are being excluded from an Azure bundle.

- AI-generated; verify before applying. React 👍/👎 to flag quality.
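
One possible shape for the requested comment, sketched inside a maven-shade-plugin `<filters>` block. The `*:*` artifact selector and the comment wording are assumptions; the dependency that drags in hbase-webapps/ is not identified in this thread, so it is left as a TODO rather than guessed.

```xml
<!-- Sketch only: the artifact selector and comment text are illustrative. -->
<filters>
  <filter>
    <artifact>*:*</artifact>
    <excludes>
      <exclude>META-INF/*.RSA</exclude>
      <exclude>META-INF/services/javax.*</exclude>
      <exclude>**/*.proto</exclude>
      <!-- HBase web UI resources arrive transitively (TODO: name the dependency);
           they are not needed in an Azure bundle, so keep them out of the jar. -->
      <exclude>hbase-webapps/**</exclude>
    </excludes>
  </filter>
</filters>
```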

```xml
  <directory>src/main/resources</directory>
</resource>
<resource>
  <directory>src/test/resources</directory>
```
Contributor

🤖 nit: including src/test/resources in the main build resources is unusual — could you add a comment explaining why it's needed here (e.g. which runtime config files live there that the bundle requires)?

- AI-generated; verify before applying. React 👍/👎 to flag quality.

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.13%. Comparing base (4035f70) to head (e2d8a0a).

Additional details and impacted files
```diff
@@             Coverage Diff              @@
##             master   #18472      +/-   ##
============================================
- Coverage     68.14%   68.13%   -0.01%
+ Complexity    29105    29098       -7
============================================
  Files          2518     2518
  Lines        141221   141221
  Branches      17534    17534
============================================
- Hits          96235    96226       -9
- Misses        37070    37075       +5
- Partials       7916     7920       +4
```
| Flag | Coverage Δ | |
|---|---|---|
| common-and-other-modules | 44.40% <ø> (+<0.01%) | ⬆️ |
| hadoop-mr-java-client | 45.00% <ø> (-0.01%) | ⬇️ |
| spark-client-hadoop-common | 48.32% <ø> (+<0.01%) | ⬆️ |
| spark-java-tests | 48.97% <ø> (+0.05%) | ⬆️ |
| spark-scala-tests | 44.89% <ø> (-0.01%) | ⬇️ |
| utilities | 37.62% <ø> (+0.01%) | ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.
see 12 files with indirect coverage changes

@hudi-bot
Collaborator

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build
