Conversation
📝 WalkthroughWalkthroughThis PR introduces Gluten/Velox native execution support to Hudi test infrastructure, adds comprehensive merge-on-read (MOR) test suites with complex data types, documents three failing MOR test cases, disables unsupported test scenarios (custom merge modes, custom payloads), and updates Maven configuration to pull specialized Spark dependencies. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Greptile SummaryThis PR introduces Gluten/Velox native-execution support into Hudi's functional test suite (MOR read path), driven by an investigation into three failing sub-tests ( However, the PR contains several OSS-blocking artifacts from the investigation process:
Confidence Score: 1/5Not safe to merge — the PR contains private build infrastructure that breaks the OSS build, a production debug log statement, and hardcoded developer paths that will fail tests on any other machine. Score reflects three blocking P1 issues: (1) the internal Onehouse Spark version and private CodeArtifact Maven repository in root pom.xml make the project unbuildable for all external contributors, (2) the logWarning in production HoodieSparkUtils.scala prints the full query plan at WARN level on every MOR read, and (3) two new test classes write to hardcoded absolute paths that only exist on one developer's machine. pom.xml (private repo + internal Spark version), HoodieSparkUtils.scala (debug log), TestMORFileSliceLayouts.scala and TestMergeModeCommitTimeOrdering.scala (hardcoded paths), RCA_Gluten_Velox_MOR_Failures.md (internal document) Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Test JVM starts] --> B{gluten.bundle.jar\nsystem property set?}
B -- No --> C[Standard Spark execution]
B -- Yes --> D[GlutenTestUtils.applyGlutenConf]
D --> E{QuantonPlugin\non classpath?}
E -- Yes --> F[pluginClass =\nai.onehouse.quanton.QuantonPlugin]
E -- No --> G[pluginClass =\norg.apache.gluten.GlutenPlugin]
F --> H[Apply spark.plugins +\noffheap + shuffle manager]
G --> H
H --> I[SparkSession with\nGluten/Velox native execution]
C --> J[SparkSession with\nstandard JVM execution]
I --> K[MOR Read Path\nHoodieSparkUtils.createRdd]
J --> K
K --> L[df.queryExecution.toRdd\n.mapPartitions]
L --> M{Gluten active?}
M -- Yes --> N[WholeStageCodegenPartitionEvaluator\nrows.isEmpty triggers processNext]
M -- No --> O[Standard iterator\nrows.isEmpty safe]
N --> P[RCA: UnsafeRow sizeInBytes\nmust be multiple of 8]
|
| // Additionally, we have to explicitly wrap around resulting [[RDD]] into the one | ||
| // injecting [[SQLConf]], which by default isn't propagated by Spark to the executor(s). | ||
| // [[SQLConf]] is required by [[AvroSerializer]] | ||
| logWarning(s"createRdd executedPlan:\n${df.queryExecution.executedPlan.treeString}") |
There was a problem hiding this comment.
Debug warning log left in production code
This logWarning call prints the entire executed query plan (as a tree string) at WARN level every time createRdd is called. This is clearly a debug statement added during the Gluten/Velox investigation and must not reach production:
- It will spam the logs for every MOR read in every Hudi job.
treeStringis non-trivial to compute and allocates for every partition scan.- WARN level means it will appear in most production logging configurations.
This line should be removed entirely before merging.
| @@ -170,7 +170,7 @@ | |||
| <rocksdbjni.version>7.5.3</rocksdbjni.version> | |||
There was a problem hiding this comment.
Internal Onehouse Spark build version in OSS project
The Spark 3.5 version has been changed from the public 3.5.5 to 3.5.2-INTERNAL-0.16, an Onehouse-internal build that is only resolvable via the private CodeArtifact repository added below (line ~1966). This will break the build for every developer and CI environment that does not have access to that repository.
This must be reverted to an official Apache Spark release before merging to the OSS project.
| <id>confluent</id> | ||
| <url>https://packages.confluent.io/maven/</url> | ||
| </repository> | ||
| <repository> | ||
| <id>codeartifact</id> | ||
| <name>codeartifact</name> |
There was a problem hiding this comment.
Private internal Maven repository in OSS project
The repository at https://onehouse-194159489498.d.codeartifact.us-west-2.amazonaws.com/maven/onehouse-internal/ is a private AWS CodeArtifact instance owned by Onehouse. Adding it to the root pom.xml means every Maven build of the project will attempt to resolve artifacts from a private, unauthenticated endpoint — this will fail for all external contributors and CI systems outside of Onehouse infrastructure.
Both this repository entry and the internal Spark version (3.5.2-INTERNAL-0.16) that depends on it must be removed.
| val hadoopPath = new Path(path) | ||
| val fs = hadoopPath.getFileSystem(spark.sessionState.newHadoopConf()) | ||
| if (fs.exists(hadoopPath)) { | ||
| fs.delete(hadoopPath, true) |
There was a problem hiding this comment.
Hardcoded developer-local path used as base output directory
val BASE_OUTPUT_PATH = "/home/ubuntu/ws3/hudi-rs/tables/fdss" is a hardcoded absolute path on a specific developer's machine. All four tests in this file write their tables to sub-directories of this path. These tests will fail (or worse, silently write data to an unexpected location) on any other machine.
This should be replaced with a temporary directory, consistent with how other test classes in the project handle isolation, e.g.:
| fs.delete(hadoopPath, true) | |
| val BASE_OUTPUT_PATH = createTempDir().getAbsolutePath |
Or use the withTempDir pattern already used throughout HoodieSparkSqlTestBase.
| ) { | ||
| withRecordType()(withTempDir { tmp => | ||
| withRecordType()({ | ||
| val tableName = generateTableName | ||
| val currTime = System.currentTimeMillis() |
There was a problem hiding this comment.
Hardcoded developer-local path and global mutable state
Two related problems here:
-
Hardcoded path:
s"/home/ubuntu/ws3/hudi-rs/tables/table${TestMergeModeCommitTimeOrdering.cnt}_$currTime"is a hardcoded path on a specific developer's machine. Tests will fail on any other CI or developer environment. -
Global mutable
cntin companion object: The companion objectTestMergeModeCommitTimeOrdering.cntis a JVM-singletonvarincremented across test invocations. This is a form of shared mutable state that makes test execution order-dependent and could cause path collisions or unexpected behavior if tests are run in parallel.
Both issues stem from the same root cause — the code was modified to write tables to a fixed developer path for manual inspection with hudi-rs. This logic should be reverted to use withTempDir { tmp => ... } as in the original code.
The same pattern appears in the second test block in this file (~line 255).
| // First Operation: | ||
| // Producing parquet files to three default partitions. | ||
| // SNAPSHOT view on MOR table with parquet files only. |
There was a problem hiding this comment.
Leftover commented-out developer debug path
These two commented-out lines are clearly debugging artifacts from manual testing with hudi-rs:
// val BASE_OUTPUT_PATH = "/home/ubuntu/ws3/hudi-rs/tables/testPrunedFiltered"
// val tablePath = s"$BASE_OUTPUT_PATH/table_log_compaction_${System.currentTimeMillis()}"They should be removed entirely from the file before merging.
| # RCA: Gluten/Velox MOR Test Failures | ||
|
|
||
| Three sub-tests fail when Gluten/Velox native execution is enabled: | ||
| `testMORWithMap`, `testMORWithDecimal`, `testMORWithArray`. | ||
|
|
There was a problem hiding this comment.
Internal investigation document should not be committed to the repository
RCA_Gluten_Velox_MOR_Failures.md is a root-cause analysis document for a specific Gluten/Velox testing investigation. It references internal stack traces, log evidence, and Onehouse-specific infrastructure. It is not part of the project's documentation and will confuse future readers of the repository.
This file should not be committed to the OSS repository. If it needs to be preserved for reference, it belongs in an internal wiki or issue tracker rather than the source tree.
There was a problem hiding this comment.
Actionable comments posted: 15
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestPartialUpdateForMergeInto.scala (1)
683-730:⚠️ Potential issue | 🟠 MajorDon't ignore the regression test for unsupported CUSTOM MERGE INTO.
This was the assertion that
CUSTOMpayload +CUSTOMmerge mode is rejected for MERGE INTO. Converting it toignoreremoves that contract entirely. If the skip is only needed for the Gluten/RS path, make it conditional and keep the default suite validating the failure.hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeEventTimeOrdering.scala (1)
34-39:⚠️ Potential issue | 🟠 MajorDon't remove the v6/v8 compatibility matrix from the shared suite.
The test body still has version-specific expectations for table versions 6 and 8, but those cases are now unreachable. That turns legacy merge-mode behavior into dead, unverified code rather than isolating the Gluten-specific failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeEventTimeOrdering.scala` around lines 34 - 39, The parametrized Seq in TestMergeModeEventTimeOrdering currently omits the legacy v6/v8 table-version cases, making the version-specific expectations in the test body unreachable; update the Seq(...) used where args is iterated to reintroduce the v6 and v8 compatibility rows (so the test runs the legacy table-version permutations) ensuring the strings/tuples you add match the shape consumed by the test loop (the same format used by the other entries) so the existing version-specific assertions in the test body execute for v6 and v8.hudi-common/src/test/java/org/apache/hudi/common/table/read/TestFileGroupReaderSchemaHandler.java (1)
168-175:⚠️ Potential issue | 🟡 MinorTest coverage significantly reduced.
The
@CsvSourcehas been trimmed to remove:
- All
SIXtable version test cases (backwards compatibility)CUSTOMmerge mode scenarios- Several
populateMetaFields=falsecombinationsThis reduces coverage for pre-v9 table compatibility. Consider documenting which scenarios are intentionally excluded and why, or tracking re-enablement in a follow-up issue.
Would you like me to open an issue to track re-enabling these test cases once the underlying features are supported?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hudi-common/src/test/java/org/apache/hudi/common/table/read/TestFileGroupReaderSchemaHandler.java` around lines 168 - 175, The CSV-driven test in TestFileGroupReaderSchemaHandler lost coverage: restore the removed `@CsvSource` rows that include table version SIX, merge mode CUSTOM, and the populateMetaFields=false permutations so pre-v9/backwards-compat scenarios are tested again; if those cases must remain excluded temporarily, add a concise TODO comment next to the `@CsvSource` explaining why they are omitted and add/mention a tracking issue ID (create one if none) so re-enabling (or supporting the underlying features) is tracked.hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scala (1)
107-122:⚠️ Potential issue | 🟠 MajorKeep the reduced compatibility matrix behind a Gluten-only gate.
These edits permanently drop older-version/custom-merge cases from the default MOR suite, even though native execution is only turned on when the external bundle is present. That shrinks baseline Spark coverage instead of isolating the Gluten-specific limitations to the Gluten profile.
Also applies to: 568-570, 1142-1143, 1531-1534
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scala` around lines 107 - 122, The test suite in TestMORDataSource.scala has had many CSV rows removed from the `@CsvSource` for the parameterized test, which unintentionally shrinks baseline Spark coverage; instead, restore the full compatibility matrix in the `@CsvSource` and run the reduced Gluten-only matrix conditionally by gating it behind the Gluten profile/flag (e.g., detect the Gluten/external bundle flag used in test runtime or use an `@EnabledIf/`@Tag that checks isGlutenEnabled). Specifically, revert the commented-out rows in the `@CsvSource` inside the parameterized test and add logic around test execution (or a second CsvSource variant) so that only when the Gluten flag is true you use the reduced set; reference the parameterized test in TestMORDataSource.scala and ensure the same gating is applied to the other affected CsvSource occurrences noted (around the same test blocks at the other locations).hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderOnSpark.scala (1)
76-106:⚠️ Potential issue | 🟠 MajorRestore
supportedFileFormatsinteardown().This symbol comes from
TestHoodieFileGroupReaderBaseand is shared mutable state. Forcing it toPARQUETinsetup()without restoring it makes later tests/classes order-dependent.Suggested isolation fix
class TestHoodieFileGroupReaderOnSpark extends TestHoodieFileGroupReaderBase[InternalRow] with SparkAdapterSupport { var spark: SparkSession = _ + private var previousSupportedFileFormats: java.util.List[HoodieFileFormat] = _ `@BeforeEach` def setup() { + previousSupportedFileFormats = supportedFileFormats val sparkConf = new SparkConf sparkConf.set("spark.app.name", getClass.getName) ... spark = SparkSession.builder.config(sparkConf).getOrCreate supportedFileFormats = util.Arrays.asList(HoodieFileFormat.PARQUET) } `@AfterEach` def teardown() { - if (spark != null) { - spark.stop() - } + try { + if (spark != null) { + spark.stop() + } + } finally { + supportedFileFormats = previousSupportedFileFormats + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderOnSpark.scala` around lines 76 - 106, The test mutates the shared mutable symbol supportedFileFormats in setup(), making tests order-dependent; capture the original value before overwriting (e.g., val previousSupported = supportedFileFormats) inside setup() and then restore it in teardown() (supportedFileFormats = previousSupported) so the shared state from TestHoodieFileGroupReaderBase is returned to its prior value; implement these changes in the setup and teardown methods that currently set supportedFileFormats = util.Arrays.asList(HoodieFileFormat.PARQUET).
🧹 Nitpick comments (5)
hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/table/functional/TestHoodieSparkMergeOnReadTableRollback.java (1)
110-110: Parameterization is now effectively single-version and should be simplified.At Line 110 and Line 179, the
@CsvSourcenow passes only table version9, but the test methods still carry atableVersionparameter and version-dependent setup paths. Consider either restoring a true matrix or collapsing these tests to a fixed v9 path to avoid misleading maintenance surface.♻️ Suggested simplification (apply similarly to both methods)
- `@ParameterizedTest` - `@CsvSource`({ "true,9", "false,9"}) - void testCOWToMORConvertedTableRollback(boolean rollbackUsingMarkers, int tableVersion) throws Exception { + `@ParameterizedTest` + `@ValueSource`(booleans = {true, false}) + void testCOWToMORConvertedTableRollback(boolean rollbackUsingMarkers) throws Exception { + final int tableVersion = 9;Also applies to: 179-179
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/table/functional/TestHoodieSparkMergeOnReadTableRollback.java` at line 110, The tests in TestHoodieSparkMergeOnReadTableRollback are still written to accept a tableVersion parameter even though the CsvSource only supplies v9; change the parameterization to only vary the boolean and collapse version handling to a fixed v9: update the two test annotations (the ones currently using "@CsvSource({ \"true,9\", \"false,9\"})") to supply just the boolean, remove the tableVersion parameter from the test method signatures, and replace any use of tableVersion (including conditional path/build logic) with the fixed v9 value (or a single v9 constant) so the tests no longer carry dead version branches.hudi-tests-common/src/main/resources/log4j2-surefire-quiet.properties (1)
27-31: Consider setting log level towarnfor consistency with quiet profile.This is the "quiet" surefire config where the root logger is
warnand other loggers likehbaseareerror. Settingorg.apache.glutenandai.onehouse.quantontoinfomay produce more verbose output than expected in quiet mode.Suggested change for quiet profile
logger.gluten.name = org.apache.gluten -logger.gluten.level = info +logger.gluten.level = warn logger.quanton.name = ai.onehouse.quanton -logger.quanton.level = info +logger.quanton.level = warn🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hudi-tests-common/src/main/resources/log4j2-surefire-quiet.properties` around lines 27 - 31, Change the log levels for the quiet surefire profile: set logger.gluten.level and logger.quanton.level from info to warn so org.apache.gluten and ai.onehouse.quanton follow the quiet profile (root warn) and avoid producing verbose test output; update the entries for logger.gluten.name and logger.quanton.name accordingly.hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeCommitTimeOrdering.scala (2)
38-39: COW test variants commented out.All COW test cases are commented out, leaving only MOR variants. If this is intentional for Gluten/Velox compatibility, consider adding a comment explaining why, or use
@Disabledannotations with reasons instead of commenting out code.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeCommitTimeOrdering.scala` around lines 38 - 39, The COW variants in TestMergeModeCommitTimeOrdering have been commented out (the parameter list containing "cow,current,..." is removed) leaving only MOR cases; either restore the COW entries in the args list inside TestMergeModeCommitTimeOrdering so the tests run, or instead mark the COW cases explicitly disabled with a clear reason: add `@Disabled`("reason: Gluten/Velox incompatibility") on the specific test or branch and include an inline comment explaining why COW is excluded; locate the parameter list passed to args.foreach and the test method in class TestMergeModeCommitTimeOrdering to apply the change.
31-33: Remove mutable state from companion object.Using a mutable
varin a companion object for test uniqueness is fragile. Consider usingUUID.randomUUID()or the test framework's temp directory facilities instead.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeCommitTimeOrdering.scala` around lines 31 - 33, The companion object TestMergeModeCommitTimeOrdering currently defines a mutable var cnt used for uniqueness; remove this mutable state and replace all uses of TestMergeModeCommitTimeOrdering.cnt with a deterministic, thread-safe alternative such as UUID.randomUUID().toString (or the test framework's temp directory / unique test name helper) wherever unique names/paths are constructed (search for references to TestMergeModeCommitTimeOrdering and cnt in the tests); update any setup/teardown logic to not rely on incrementing cnt and ensure uniqueness is produced inline at call sites (e.g., generate a UUID or use temporaryFolder/newTempDir) instead.hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java (1)
359-362: CUSTOM merge mode tests commented out.Consider using
@Disabledannotations with reasons instead of commenting out test parameters. This provides better documentation and makes it easier to track what needs re-enabling.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java` around lines 359 - 362, Uncomment the two parameter lines that use RecordMergeMode.CUSTOM (the commented arguments(...) calls) and instead of leaving them commented, disable the associated parameterized test using `@Disabled` with a short reason; locate the parameter provider or the test method in TestHoodieFileGroupReaderBase that consumes these arguments (the method that returns or uses arguments(RecordMergeMode..., HoodieFileFormat..., ...)) and add `@Disabled`("CUSTOM merge mode tests temporarily disabled - <reason>") to that test or its provider so the CUSTOM cases are recorded as intentionally disabled rather than commented out.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/HoodieSparkUtils.scala`:
- Line 106: The current logWarning in HoodieSparkUtils.createRdd eagerly
materializes df.queryExecution.executedPlan.treeString on a hot write path;
change it to avoid WARN-level plan dumps by either moving the message to
logDebug or gating it behind a diagnostic flag. Locate the logWarning call
referencing df.queryExecution.executedPlan.treeString in createRdd and replace
it so the plan string is only generated when logDebug is enabled or when a new
boolean diagnostic flag (e.g., enablePlanDump) is true; ensure you check
logger.isDebugEnabled (or the flag) before calling treeString to prevent
unnecessary work.
In
`@hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/GlutenTestUtils.java`:
- Around line 42-54: The applyGlutenConf method currently treats the
gluten.bundle.jar string as a flag; update it to fail-fast by validating the
path and plugin class before mutating SparkConf: 1) check that the
glutenBundleJar string is non-empty and points to an existing readable file
(e.g., new File(glutenBundleJar).isFile()) and if not, throw an
IllegalArgumentException or return early so SparkConf is not modified; 2)
resolve the plugin class by first trying
Class.forName("ai.onehouse.quanton.QuantonPlugin") and if that fails set
pluginClass = "org.apache.gluten.GlutenPlugin", then also verify the fallback
exists via Class.forName(pluginClass) and return/throw if it does not; 3) only
after both the jar file and the chosen pluginClass are validated, set the
SparkConf entries (the code that writes to sparkConf) so misconfigured paths or
missing classes are detected immediately in applyGlutenConf.
In
`@hudi-common/src/test/java/org/apache/hudi/common/table/read/buffer/TestKeyBasedFileGroupRecordBuffer.java`:
- Around line 181-183: The `@Disabled` reason on the test method
readWithCommitTimeOrdering is misleading; update the disable annotation or the
test name so the intent is clear: either change the `@Disabled` message to
explicitly state "Uses custom delete payload (not supported)" or rename the
method (e.g., to readWithCommitTimeOrderingAndCustomDeletePayload) to reflect
that the failure is due to custom delete payloads, not commit-time ordering.
Locate the test method readWithCommitTimeOrdering and apply one of these changes
so the disable reason and test identifier align.
In
`@hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java`:
- Around line 451-452: The test parameterization was narrowed to only the "avro"
log data block format, removing "parquet" coverage; update the parameter list
used by the test (in TestHoodieFileGroupReaderBase, the method supplying
arguments for logFileOnlyCases) to include both "avro" and "parquet" again
(e.g., add corresponding arguments(RecordMergeMode.COMMIT_TIME_ORDERING,
"parquet") and arguments(RecordMergeMode.EVENT_TIME_ORDERING, "parquet") so the
test runs for both formats).
In
`@hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/io/TestMergeHandle.java`:
- Around line 280-283: The test removed CUSTOM/CUSTOM_MERGER branch coverage;
restore explicit coverage by including "CUSTOM" and "CUSTOM_MERGER" in the
`@ValueSource` for testFGReaderBasedMergeHandleInsertUpsertDelete or, if those
modes are genuinely unsupported under the Gluten environment, add a guard inside
testFGReaderBasedMergeHandleInsertUpsertDelete (calling
testFGReaderBasedMergeHandleInsertUpsertDeleteInternal) that detects the Gluten
run (e.g., an existing isGluten()/environment check) and asserts the unsupported
behavior for CUSTOM/CUSTOM_MERGER instead of silently skipping them; update
references to testFGReaderBasedMergeHandleInsertUpsertDeleteInternal, the
ValueSource annotation, and any Gluten-run detection logic accordingly.
In
`@hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/table/functional/TestHoodieSparkMergeOnReadTableRollback.java`:
- Line 1161: Replace the hard-disabled annotations so the rollback failure-path
tests run conditionally: in TestHoodieSparkMergeOnReadTableRollback.java and
TestHoodieSparkCopyOnWriteTableRollbackTableVersionSix.java, change each method
annotated with `@Disabled`("Table version 6 only") to `@Tag`("table-version-6") (the
four rollback test methods around the existing markers). Then add/extend a Maven
profile (e.g., functional-tests-v6) in pom.xml to include the table-version-6
tag via the surefire/failsafe groups configuration, and ensure the Azure CI
pipeline triggers that profile in the appropriate validation job so the
table-version-6 tests execute in CI.
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderOnSpark.scala`:
- Around line 677-678: The test is permanently disabled with `@Disabled`("Custom
delete payload not supported"), removing custom-delete coverage for all runs;
instead remove the `@Disabled` and add a conditional skip that only disables the
test when native/Gluten execution is active (e.g., use JUnit Assumptions inside
the test: call Assumptions.assumeFalse(isGlutenNativeExecutionActive()) or
Assumptions.assumeTrue(!isGlutenNativeExecutionActive()), where
isGlutenNativeExecutionActive() checks the presence of the gluten bundle
(gluten.bundle.jar) or existing project helper), so the test in
TestHoodieFileGroupReaderOnSpark runs normally under standard Spark profiles but
is skipped only when native execution is enabled.
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scala`:
- Around line 1490-1491: The test testRecordTypeCompatibilityWithParquetLog was
changed to force HoodieStorageConfig.LOGFILE_DATA_BLOCK_FORMAT to "avro" (via
readOpts and writeOpts), so it no longer validates parquet log-block behavior;
revert or adjust these assignments so the test actually uses "parquet" (e.g.,
set readOpts and writeOpts to
Map(HoodieStorageConfig.LOGFILE_DATA_BLOCK_FORMAT.key -> "parquet") or remove
the override), or alternatively split/rename the test into separate cases (one
explicitly setting "avro" and another keeping "parquet") so the parquet-log
reader remains covered.
- Around line 2269-2310: The test testGlutenVeloxNativeExecution currently only
checks rows.length > 0 so it passes without Gluten/Velox; update it to first
assumeTrue that the Gluten native bundle is available (skip otherwise) and then
assert that df.queryExecution.executedPlan.toString contains the expected native
operator identifiers (e.g., "Velox" or "Gluten" or the specific operator names
used by your runtime) after collecting/printing the plan; use the existing df
and df.queryExecution.executedPlan references and keep the existing plan
printing but replace or augment the final assertTrue(rows.length > 0) with an
assumeTrue(...) then an assertion that the plan string contains the native
operator marker.
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeCommitTimeOrdering.scala`:
- Around line 260-262: Remove the hardcoded absolute path and debug println: in
the TestMergeModeCommitTimeOrdering test where tablePath is constructed using
TestMergeModeCommitTimeOrdering.cnt and System.currentTimeMillis(), replace the
absolute "/home/ubuntu/..." path with a temp or relative test directory (or use
the test framework's temporary folder utility) and delete the println(s"fdss:
${tablePath}") call; ensure tablePath is built from a configurable base (or
tempDir) so the test is not environment-specific.
- Around line 107-109: The test currently uses a hardcoded absolute path via
tablePath built from TestMergeModeCommitTimeOrdering.cnt and currTime and prints
it with println (a debug artifact); replace this by using the test framework's
temporary directory helper (e.g., withTempDir) to create tablePath inside the
temp dir and remove the println line; locate the usage where
currTime/tablePath/Println are set in TestMergeModeCommitTimeOrdering and
restore the original withTempDir pattern so the path is portable and no debug
output remains.
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMORFileSliceLayouts.scala`:
- Around line 48-55: Replace the hard-coded BASE_OUTPUT_PATH and the unsafe
cleanPath usage with a test-scoped temporary directory: change BASE_OUTPUT_PATH
to be created at runtime (e.g., use java.nio.file.Files.createTempDirectory or
the test harness/tempDir utility) and ensure cleanPath operates on that temp dir
only (or remove cleanPath and rely on the temp directory lifecycle). Update
references to BASE_OUTPUT_PATH and the cleanPath helper so all tests use this
generated path, and ensure the temp directory is deleted/cleaned in an
afterEach/afterAll teardown to avoid leaving artifacts.
- Around line 62-68: getQueryFileSlices currently obtains resources from
getMetaClientAndFileSystemView (which returns a HoodieTableFileSystemView) and
iterates log readers but does not guarantee closure on exception paths; update
getQueryFileSlices to ensure the HoodieTableFileSystemView and any
HoodieLogFileReader instances are closed in all exit paths (use try/finally or a
loan/withResource pattern) so that fsView.close() is always called and each log
reader is closed even if an assertion or read fails; locate
getMetaClientAndFileSystemView and any code that constructs HoodieLogFileReader
(also in the 111-131 region) and wrap their usage in try/finally blocks or a
managed-resource helper to release resources reliably.
In `@pom.xml`:
- Line 173: The pom currently sets the Maven property spark35.version to the
internal coordinate "3.5.2-INTERNAL-0.16" which makes the active-by-default
profile spark3.5 depend on private Onehouse CodeArtifact repos; revert the
default <spark35.version> to the public Spark artifact (e.g., a public 3.5.x
coordinate) and move the internal "3.5.2-INTERNAL-0.16" value and the Onehouse
repository declaration into a new opt-in Maven profile (e.g., profile id
"onehouse-internal" or similar); update references to spark35.version to remain
unchanged so external builds use the public coordinate unless the opt-in profile
is activated.
In `@RCA_Gluten_Velox_MOR_Failures.md`:
- Around line 14-22: Several fenced code blocks in the document are missing
language specifiers; update each triple-backtick fence to include an appropriate
language token (e.g., ```java for the Java stacktrace starting with
java.lang.AssertionError and entries like UnsafeRow.setTotalSize, ```scala for
Scala snippets referencing HoodieSparkUtils.createRdd, or ```text/```cpp as
appropriate) so markdownlint stops flagging them; apply this change to the
reported ranges (including the stacktrace block and the other blocks at 32-36,
82-85, 184-203, 206-210, 245-251, 254-264, 320-327, 347-351) ensuring each fence
uses the correct language tag that matches the contained code/sample.
---
Outside diff comments:
In
`@hudi-common/src/test/java/org/apache/hudi/common/table/read/TestFileGroupReaderSchemaHandler.java`:
- Around line 168-175: The CSV-driven test in TestFileGroupReaderSchemaHandler
lost coverage: restore the removed `@CsvSource` rows that include table version
SIX, merge mode CUSTOM, and the populateMetaFields=false permutations so
pre-v9/backwards-compat scenarios are tested again; if those cases must remain
excluded temporarily, add a concise TODO comment next to the `@CsvSource`
explaining why they are omitted and add/mention a tracking issue ID (create one
if none) so re-enabling (or supporting the underlying features) is tracked.
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderOnSpark.scala`:
- Around line 76-106: The test mutates the shared mutable symbol
supportedFileFormats in setup(), making tests order-dependent; capture the
original value before overwriting (e.g., val previousSupported =
supportedFileFormats) inside setup() and then restore it in teardown()
(supportedFileFormats = previousSupported) so the shared state from
TestHoodieFileGroupReaderBase is returned to its prior value; implement these
changes in the setup and teardown methods that currently set
supportedFileFormats = util.Arrays.asList(HoodieFileFormat.PARQUET).
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scala`:
- Around line 107-122: The test suite in TestMORDataSource.scala has had many
CSV rows removed from the `@CsvSource` for the parameterized test, which
unintentionally shrinks baseline Spark coverage; instead, restore the full
compatibility matrix in the `@CsvSource` and run the reduced Gluten-only matrix
conditionally by gating it behind the Gluten profile/flag (e.g., detect the
Gluten/external bundle flag used in test runtime or use an `@EnabledIf/`@Tag that
checks isGlutenEnabled). Specifically, revert the commented-out rows in the
`@CsvSource` inside the parameterized test and add logic around test execution (or
a second CsvSource variant) so that only when the Gluten flag is true you use
the reduced set; reference the parameterized test in TestMORDataSource.scala and
ensure the same gating is applied to the other affected CsvSource occurrences
noted (around the same test blocks at the other locations).
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeEventTimeOrdering.scala`:
- Around line 34-39: The parametrized Seq in TestMergeModeEventTimeOrdering
currently omits the legacy v6/v8 table-version cases, making the
version-specific expectations in the test body unreachable; update the Seq(...)
used where args is iterated to reintroduce the v6 and v8 compatibility rows (so
the test runs the legacy table-version permutations) ensuring the strings/tuples
you add match the shape consumed by the test loop (the same format used by the
other entries) so the existing version-specific assertions in the test body
execute for v6 and v8.
---
Nitpick comments:
In
`@hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java`:
- Around line 359-362: Uncomment the two parameter lines that use
RecordMergeMode.CUSTOM (the commented arguments(...) calls) and instead of
leaving them commented, disable the associated parameterized test using
`@Disabled` with a short reason; locate the parameter provider or the test method
in TestHoodieFileGroupReaderBase that consumes these arguments (the method that
returns or uses arguments(RecordMergeMode..., HoodieFileFormat..., ...)) and add
`@Disabled`("CUSTOM merge mode tests temporarily disabled - <reason>") to that
test or its provider so the CUSTOM cases are recorded as intentionally disabled
rather than commented out.
In
`@hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/table/functional/TestHoodieSparkMergeOnReadTableRollback.java`:
- Line 110: The tests in TestHoodieSparkMergeOnReadTableRollback are still
written to accept a tableVersion parameter even though the CsvSource only
supplies v9; change the parameterization to only vary the boolean and collapse
version handling to a fixed v9: update the two test annotations (the ones
currently using "@CsvSource({ \"true,9\", \"false,9\"})") to supply just the
boolean, remove the tableVersion parameter from the test method signatures, and
replace any use of tableVersion (including conditional path/build logic) with
the fixed v9 value (or a single v9 constant) so the tests no longer carry dead
version branches.
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeCommitTimeOrdering.scala`:
- Around line 38-39: The COW variants in TestMergeModeCommitTimeOrdering have
been commented out (the parameter list containing "cow,current,..." is removed)
leaving only MOR cases; either restore the COW entries in the args list inside
TestMergeModeCommitTimeOrdering so the tests run, or instead mark the COW cases
explicitly disabled with a clear reason: add `@Disabled`("reason: Gluten/Velox
incompatibility") on the specific test or branch and include an inline comment
explaining why COW is excluded; locate the parameter list passed to args.foreach
and the test method in class TestMergeModeCommitTimeOrdering to apply the
change.
- Around line 31-33: The companion object TestMergeModeCommitTimeOrdering
currently defines a mutable var cnt used for uniqueness; remove this mutable
state and replace all uses of TestMergeModeCommitTimeOrdering.cnt with a
deterministic, thread-safe alternative such as UUID.randomUUID().toString (or
the test framework's temp directory / unique test name helper) wherever unique
names/paths are constructed (search for references to
TestMergeModeCommitTimeOrdering and cnt in the tests); update any setup/teardown
logic to not rely on incrementing cnt and ensure uniqueness is produced inline
at call sites (e.g., generate a UUID or use temporaryFolder/newTempDir) instead.
In `@hudi-tests-common/src/main/resources/log4j2-surefire-quiet.properties`:
- Around line 27-31: Change the log levels for the quiet surefire profile: set
logger.gluten.level and logger.quanton.level from info to warn so
org.apache.gluten and ai.onehouse.quanton follow the quiet profile (root warn)
and avoid producing verbose test output; update the entries for
logger.gluten.name and logger.quanton.name accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e1ffad95-377b-4b59-ae9e-c5404df9642b
📒 Files selected for processing (23)
RCA_Gluten_Velox_MOR_Failures.mdhudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/HoodieSparkUtils.scalahudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/GlutenTestUtils.javahudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/HoodieClientTestUtils.javahudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/providers/SparkProvider.javahudi-common/src/test/java/org/apache/hudi/common/table/read/TestFileGroupReaderSchemaHandler.javahudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.javahudi-common/src/test/java/org/apache/hudi/common/table/read/buffer/TestKeyBasedFileGroupRecordBuffer.javahudi-spark-datasource/hudi-spark/pom.xmlhudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/TestPositionBasedFileGroupRecordBuffer.javahudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/io/TestMergeHandle.javahudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/table/functional/TestHoodieSparkCopyOnWriteTableRollbackTableVersionSix.javahudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/table/functional/TestHoodieSparkMergeOnReadTableRollback.javahudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderOnSpark.scalahudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scalahudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/HoodieSparkSqlTestBase.scalahudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMORFileSliceLayouts.scalahudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeCommitTimeOrdering.scalahudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeEventTimeOrdering.scalahudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestPartialUpdateForMergeInto.scalahudi-tests-common/src/main/resources/log4j2-surefire-quiet.propertieshudi-tests-common/src/main/resources/log4j2-surefire.propertiespom.xml
| // Additionally, we have to explicitly wrap around resulting [[RDD]] into the one | ||
| // injecting [[SQLConf]], which by default isn't propagated by Spark to the executor(s). | ||
| // [[SQLConf]] is required by [[AvroSerializer]] | ||
| logWarning(s"createRdd executedPlan:\n${df.queryExecution.executedPlan.treeString}") |
There was a problem hiding this comment.
Avoid warning-level plan dumps on the createRdd hot path.
createRdd is on the write path, and executedPlan.treeString eagerly materializes the full plan string. Logging that at WARN for every call will add a lot of noise and overhead in production. Please keep this behind logDebug or a dedicated diagnostic flag instead.
Suggested change
- logWarning(s"createRdd executedPlan:\n${df.queryExecution.executedPlan.treeString}")
+ logDebug(s"createRdd executedPlan:\n${df.queryExecution.executedPlan.treeString}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| logWarning(s"createRdd executedPlan:\n${df.queryExecution.executedPlan.treeString}") | |
| logDebug(s"createRdd executedPlan:\n${df.queryExecution.executedPlan.treeString}") |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/HoodieSparkUtils.scala`
at line 106, The current logWarning in HoodieSparkUtils.createRdd eagerly
materializes df.queryExecution.executedPlan.treeString on a hot write path;
change it to avoid WARN-level plan dumps by either moving the message to
logDebug or gating it behind a diagnostic flag. Locate the logWarning call
referencing df.queryExecution.executedPlan.treeString in createRdd and replace
it so the plan string is only generated when logDebug is enabled or when a new
boolean diagnostic flag (e.g., enablePlanDump) is true; ensure you check
logger.isDebugEnabled (or the flag) before calling treeString to prevent
unnecessary work.
| public static void applyGlutenConf(SparkConf sparkConf) { | ||
| String glutenBundleJar = System.getProperty("gluten.bundle.jar"); | ||
| if (glutenBundleJar == null || glutenBundleJar.isEmpty()) { | ||
| return; | ||
| } | ||
|
|
||
| String pluginClass; | ||
| try { | ||
| Class.forName("ai.onehouse.quanton.QuantonPlugin"); | ||
| pluginClass = "ai.onehouse.quanton.QuantonPlugin"; | ||
| } catch (ClassNotFoundException e) { | ||
| pluginClass = "org.apache.gluten.GlutenPlugin"; | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
fd -t f "GlutenTestUtils.java"Repository: yihua/hudi
Length of output: 144
🏁 Script executed:
cat -n hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/GlutenTestUtils.javaRepository: yihua/hudi
Length of output: 3358
🏁 Script executed:
# Search for similar validation patterns in the codebase for system properties
rg "System.getProperty" --type java -A 5 -B 2 | head -100Repository: yihua/hudi
Length of output: 13994
🏁 Script executed:
# Check if there are any tests or usages of applyGlutenConf
rg "applyGlutenConf" --type java -B 2 -A 5Repository: yihua/hudi
Length of output: 3052
Add fail-fast validation for gluten.bundle.jar property and fallback plugin class.
The method currently treats gluten.bundle.jar as a boolean flag only. A typo in the path or missing classpath wiring still configures SparkConf and defers the failure to Spark startup, making test diagnostics harder. Validate the file path and ensure the fallback GlutenPlugin class exists before enabling the plugin mode.
Suggested fail-fast check
+import java.io.File;
+
public static void applyGlutenConf(SparkConf sparkConf) {
String glutenBundleJar = System.getProperty("gluten.bundle.jar");
if (glutenBundleJar == null || glutenBundleJar.isEmpty()) {
return;
}
+ if (!new File(glutenBundleJar).isFile()) {
+ throw new IllegalArgumentException("gluten.bundle.jar does not point to an existing file: " + glutenBundleJar);
+ }
String pluginClass;
try {
Class.forName("ai.onehouse.quanton.QuantonPlugin");
pluginClass = "ai.onehouse.quanton.QuantonPlugin";
} catch (ClassNotFoundException e) {
- pluginClass = "org.apache.gluten.GlutenPlugin";
+ try {
+ Class.forName("org.apache.gluten.GlutenPlugin");
+ pluginClass = "org.apache.gluten.GlutenPlugin";
+ } catch (ClassNotFoundException e2) {
+ throw new IllegalStateException(
+ "gluten.bundle.jar is set but neither Quanton nor Gluten plugin is on the test classpath",
+ e2);
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/testutils/GlutenTestUtils.java`
around lines 42 - 54, The applyGlutenConf method currently treats the
gluten.bundle.jar string as a flag; update it to fail-fast by validating the
path and plugin class before mutating SparkConf: 1) check that the
glutenBundleJar string is non-empty and points to an existing readable file
(e.g., new File(glutenBundleJar).isFile()) and if not, throw an
IllegalArgumentException or return early so SparkConf is not modified; 2)
resolve the plugin class by first trying
Class.forName("ai.onehouse.quanton.QuantonPlugin") and if that fails set
pluginClass = "org.apache.gluten.GlutenPlugin", then also verify the fallback
exists via Class.forName(pluginClass) and return/throw if it does not; 3) only
after both the jar file and the chosen pluginClass are validated, set the
SparkConf entries (the code that writes to sparkConf) so misconfigured paths or
missing classes are detected immediately in applyGlutenConf.
| @Disabled("Custom delete payload not supported") | ||
| @Test | ||
| void readWithCommitTimeOrdering() throws IOException { |
There was a problem hiding this comment.
Disabled reason may be misleading for test name.
The test readWithCommitTimeOrdering is disabled with "Custom delete payload not supported", which is technically correct (the test uses custom delete markers), but the test name suggests it's primarily about commit time ordering behavior. Consider clarifying the disable reason or renaming the test if custom delete payload support is the actual blocker.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@hudi-common/src/test/java/org/apache/hudi/common/table/read/buffer/TestKeyBasedFileGroupRecordBuffer.java`
around lines 181 - 183, The `@Disabled` reason on the test method
readWithCommitTimeOrdering is misleading; update the disable annotation or the
test name so the intent is clear: either change the `@Disabled` message to
explicitly state "Uses custom delete payload (not supported)" or rename the
method (e.g., to readWithCommitTimeOrderingAndCustomDeletePayload) to reflect
that the failure is due to custom delete payloads, not commit-time ordering.
Locate the test method readWithCommitTimeOrdering and apply one of these changes
so the disable reason and test identifier align.
| arguments(RecordMergeMode.COMMIT_TIME_ORDERING, "avro"), | ||
| arguments(RecordMergeMode.EVENT_TIME_ORDERING, "parquet"), | ||
| arguments(RecordMergeMode.CUSTOM, "avro")); | ||
| arguments(RecordMergeMode.EVENT_TIME_ORDERING, "avro")); |
There was a problem hiding this comment.
Log file only test cases reduced to avro format only.
logFileOnlyCases() now only tests avro log data block format, removing parquet format coverage. If parquet log blocks are supported in production, this reduces test coverage.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieFileGroupReaderBase.java`
around lines 451 - 452, The test parameterization was narrowed to only the
"avro" log data block format, removing "parquet" coverage; update the parameter
list used by the test (in TestHoodieFileGroupReaderBase, the method supplying
arguments for logFileOnlyCases) to include both "avro" and "parquet" again
(e.g., add corresponding arguments(RecordMergeMode.COMMIT_TIME_ORDERING,
"parquet") and arguments(RecordMergeMode.EVENT_TIME_ORDERING, "parquet") so the
test runs for both formats).
| @ParameterizedTest | ||
| @ValueSource(strings = {"EVENT_TIME_ORDERING", "COMMIT_TIME_ORDERING", "CUSTOM", "CUSTOM_MERGER"}) | ||
| @ValueSource(strings = {"EVENT_TIME_ORDERING", "COMMIT_TIME_ORDERING"}) | ||
| public void testFGReaderBasedMergeHandleInsertUpsertDelete(String mergeMode) throws IOException { | ||
| testFGReaderBasedMergeHandleInsertUpsertDeleteInternal(mergeMode, new Properties(), false); |
There was a problem hiding this comment.
Keep explicit coverage for CUSTOM merge modes.
testFGReaderBasedMergeHandleInsertUpsertDeleteInternal(...) still has dedicated CUSTOM / CUSTOM_MERGER branches, so removing those values from @ValueSource just makes those paths untested. If they are unsupported only in the Gluten run, gate them there or replace them with an explicit unsupported assertion instead of dropping the coverage globally.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/io/TestMergeHandle.java`
around lines 280 - 283, The test removed CUSTOM/CUSTOM_MERGER branch coverage;
restore explicit coverage by including "CUSTOM" and "CUSTOM_MERGER" in the
`@ValueSource` for testFGReaderBasedMergeHandleInsertUpsertDelete or, if those
modes are genuinely unsupported under the Gluten environment, add a guard inside
testFGReaderBasedMergeHandleInsertUpsertDelete (calling
testFGReaderBasedMergeHandleInsertUpsertDeleteInternal) that detects the Gluten
run (e.g., an existing isGluten()/environment check) and asserts the unsupported
behavior for CUSTOM/CUSTOM_MERGER instead of silently skipping them; update
references to testFGReaderBasedMergeHandleInsertUpsertDeleteInternal, the
ValueSource annotation, and any Gluten-run detection logic accordingly.
| val currTime = System.currentTimeMillis() | ||
| val tablePath = s"/home/ubuntu/ws3/hudi-rs/tables/table${TestMergeModeCommitTimeOrdering.cnt}_$currTime" | ||
| println(s"fdss: ${tablePath}") |
There was a problem hiding this comment.
Critical: Same hardcoded path and debug statement in second test.
This is a duplicate of the issue in the first test - hardcoded absolute path and debug println must be removed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMergeModeCommitTimeOrdering.scala`
around lines 260 - 262, Remove the hardcoded absolute path and debug println: in
the TestMergeModeCommitTimeOrdering test where tablePath is constructed using
TestMergeModeCommitTimeOrdering.cnt and System.currentTimeMillis(), replace the
absolute "/home/ubuntu/..." path with a temp or relative test directory (or use
the test framework's temporary folder utility) and delete the println(s"fdss:
${tablePath}") call; ensure tablePath is built from a configurable base (or
tempDir) so the test is not environment-specific.
| val BASE_OUTPUT_PATH = "/home/ubuntu/ws3/hudi-rs/tables/fdss" | ||
|
|
||
| private def cleanPath(path: String): Unit = { | ||
| val hadoopPath = new Path(path) | ||
| val fs = hadoopPath.getFileSystem(spark.sessionState.newHadoopConf()) | ||
| if (fs.exists(hadoopPath)) { | ||
| fs.delete(hadoopPath, true) | ||
| } |
There was a problem hiding this comment.
Don't hard-code a workstation path into an automated test suite.
These tests assume /home/ubuntu/... exists and cleanPath recursively deletes under it. That will fail on CI/non-Ubuntu hosts and is unsafe on developer machines. Use a harness-provided temp directory instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMORFileSliceLayouts.scala`
around lines 48 - 55, Replace the hard-coded BASE_OUTPUT_PATH and the unsafe
cleanPath usage with a test-scoped temporary directory: change BASE_OUTPUT_PATH
to be created at runtime (e.g., use java.nio.file.Files.createTempDirectory or
the test harness/tempDir utility) and ensure cleanPath operates on that temp dir
only (or remove cleanPath and rely on the temp directory lifecycle). Update
references to BASE_OUTPUT_PATH and the cleanPath helper so all tests use this
generated path, and ensure the temp directory is deleted/cleaned in an
afterEach/afterAll teardown to avoid leaving artifacts.
| private def getQueryFileSlices(basePath: String): Seq[FileSlice] = { | ||
| val (metaClient, fsView) = getMetaClientAndFileSystemView(basePath) | ||
| val queryInstant = metaClient.getActiveTimeline | ||
| .filterCompletedInstants().lastInstant().get().requestedTime() | ||
| fsView.getLatestMergedFileSlicesBeforeOrOn("", queryInstant) | ||
| .iterator().asScala.toSeq | ||
| } |
There was a problem hiding this comment.
Close the file-system view and log readers on every exit path.
getMetaClientAndFileSystemView creates a HoodieTableFileSystemView that is never closed, and each HoodieLogFileReader only closes on the happy path. A failing read/assertion here will leak handles across this suite.
Also applies to: 111-131
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/others/TestMORFileSliceLayouts.scala`
around lines 62 - 68, getQueryFileSlices currently obtains resources from
getMetaClientAndFileSystemView (which returns a HoodieTableFileSystemView) and
iterates log readers but does not guarantee closure on exception paths; update
getQueryFileSlices to ensure the HoodieTableFileSystemView and any
HoodieLogFileReader instances are closed in all exit paths (use try/finally or a
loan/withResource pattern) so that fsView.close() is always called and each log
reader is closed even if an assertion or read fails; locate
getMetaClientAndFileSystemView and any code that constructs HoodieLogFileReader
(also in the 111-131 region) and wrap their usage in try/finally blocks or a
managed-resource helper to release resources reliably.
| <spark33.version>3.3.4</spark33.version> | ||
| <spark34.version>3.4.3</spark34.version> | ||
| <spark35.version>3.5.5</spark35.version> | ||
| <spark35.version>3.5.2-INTERNAL-0.16</spark35.version> |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let me examine the pom.xml file structure and the specific lines mentioned
wc -l pom.xmlRepository: yihua/hudi
Length of output: 66
🏁 Script executed:
# Check the area around line 173 for the spark35.version
sed -n '170,180p' pom.xmlRepository: yihua/hudi
Length of output: 631
🏁 Script executed:
# Check the area around lines 1966-1970 for the Onehouse CodeArtifact repository
sed -n '1960,1975p' pom.xmlRepository: yihua/hudi
Length of output: 479
🏁 Script executed:
# Search for profile definitions to verify which is the default profile
rg -n '<profile>' pom.xml | head -20Repository: yihua/hudi
Length of output: 433
🏁 Script executed:
# Look for activeByDefault or default profile settings
rg -n -A 3 'activeByDefault\|<id>spark3' pom.xmlRepository: yihua/hudi
Length of output: 36
🏁 Script executed:
# Search for spark3.5 profile definition and its activation settings
rg -n -A 15 '<id>spark3\.5' pom.xmlRepository: yihua/hudi
Length of output: 1058
🏁 Script executed:
# Look for all profiles with their IDs and activation settings
rg -n -B 2 -A 8 '<profile>' pom.xml | grep -E '(profile>|<id>|activeByDefault|activation)'Repository: yihua/hudi
Length of output: 2375
🏁 Script executed:
# Check line 1974 onwards to see the first profile definition after repositories
sed -n '1974,2050p' pom.xmlRepository: yihua/hudi
Length of output: 2346
🏁 Script executed:
# Get the complete spark3.5 profile definition to check for activeByDefault
sed -n '2656,2715p' pom.xmlRepository: yihua/hudi
Length of output: 3074
🏁 Script executed:
# Check if there are any activeByDefault declarations in profiles
rg -n 'activeByDefault' pom.xmlRepository: yihua/hudi
Length of output: 159
🏁 Script executed:
# Check properties section to understand default spark module
sed -n '140,200p' pom.xmlRepository: yihua/hudi
Length of output: 3710
This makes the default Spark 3.5 build depend on private infrastructure.
spark3.5 is the active-by-default profile, and the default properties use ${spark35.version} which now points to 3.5.2-INTERNAL-0.16. With the Onehouse CodeArtifact repository configured in the global repositories section, every default build will attempt to resolve this internal coordinate from private AWS infrastructure. Public CI and external contributors cannot access this repository without Onehouse credentials. Please keep the public Spark coordinate as default and move the internal version and repository behind an explicit opt-in profile.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pom.xml` at line 173, The pom currently sets the Maven property
spark35.version to the internal coordinate "3.5.2-INTERNAL-0.16" which makes the
active-by-default profile spark3.5 depend on private Onehouse CodeArtifact
repos; revert the default <spark35.version> to the public Spark artifact (e.g.,
a public 3.5.x coordinate) and move the internal "3.5.2-INTERNAL-0.16" value and
the Onehouse repository declaration into a new opt-in Maven profile (e.g.,
profile id "onehouse-internal" or similar); update references to spark35.version
to remain unchanged so external builds use the public coordinate unless the
opt-in profile is activated.
| ``` | ||
| java.lang.AssertionError: sizeInBytes (276) should be a multiple of 8 | ||
| at org.apache.spark.sql.catalyst.expressions.UnsafeRow.setTotalSize(UnsafeRow.java:172) | ||
| at org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter.getRow(UnsafeRowWriter.java:78) | ||
| at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) | ||
| at org.apache.spark.sql.catalyst.expressions.codegen.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) | ||
| at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$...$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43) | ||
| at HoodieSparkUtils$.$anonfun$createRdd$2(HoodieSparkUtils.scala:107) | ||
| ``` |
There was a problem hiding this comment.
Add languages to the fenced code blocks.
markdownlint is already flagging these fences, so this doc will keep warning until the blocks are tagged (text, scala, java, cpp, etc.).
Also applies to: 32-36, 82-85, 184-203, 206-210, 245-251, 254-264, 320-327, 347-351
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)
[warning] 14-14: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@RCA_Gluten_Velox_MOR_Failures.md` around lines 14 - 22, Several fenced code
blocks in the document are missing language specifiers; update each
triple-backtick fence to include an appropriate language token (e.g., ```java
for the Java stacktrace starting with java.lang.AssertionError and entries like
UnsafeRow.setTotalSize, ```scala for Scala snippets referencing
HoodieSparkUtils.createRdd, or ```text/```cpp as appropriate) so markdownlint
stops flagging them; apply this change to the reported ranges (including the
stacktrace block and the other blocks at 32-36, 82-85, 184-203, 206-210,
245-251, 254-264, 320-327, 347-351) ensuring each fence uses the correct
language tag that matches the contained code/sample.
Mirror of apache#18455 for automated bot review.
Original author: @Davis-Zhang-Onehouse
Base branch: master
Summary by CodeRabbit
Documentation
Tests
Chores