[OSS PR #18083] feat(flink): Implement continuous sorting feature for append write by yihua · Pull Request #46 · yihua/hudi

yihua · 2026-04-16T18:24:30Z

Mirror of apache#18083 for automated bot review.

Original author: @prashantwason
Base branch: master

Summary by CodeRabbit

Release Notes

New Features
- Added CONTINUOUS_SORT buffer type for write buffering using TreeMap-based continuous sorting with incremental draining.
- Added write.buffer.sort.continuous.drain.size configuration option (default: 1) to control record draining per step during continuous sorting.
Tests
- Added integration tests validating continuous sort buffering, including buffer flushing on memory and record limits, checkpoint handling, drain behavior, and sorting correctness verification.

… sorted order incrementally and avoid single-partition lag during ingestion by reducing large pause time from sort and backpressure

Summary:
- Added AppendWriteFunctionWithContinuousSort which keeps records in a TreeMap keyed by a code-generated normalized key and an insertion sequence, drains oldest entries when a configurable threshold is reached, and writes drained records immediately; snapshot/endInput drain remaining records.
- Updated AppendWriteFunctions.create to instantiate the continuous sorter when WRITE_BUFFER_SORT_CONTINUOUS_ENABLED is true.
- Introduced three new FlinkOptions: WRITE_BUFFER_SORT_CONTINUOUS_ENABLED, WRITE_BUFFER_SORT_CONTINUOUS_DRAIN_THRESHOLD_PERCENT, and WRITE_BUFFER_SORT_CONTINUOUS_DRAIN_SIZE, and added runtime validation (buffer > 0, 0 < threshold < 100, drainSize > 0, parsed non-empty sort keys).
- Added ITTestAppendWriteFunctionWithContinuousSort integration tests covering buffer flush triggers, sorted output correctness (with and without continuous drain), drain threshold/size behaviors, and invalid-parameter error cases.

Verified that we can push higher CPU utilization and input rate with continuous sorting enabled (See before and after 11:30AM)
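The buffer described in the summary above can be sketched in plain Java. This is a minimal illustration, not the PR's code: the class name, the String sort key (standing in for the code-generated normalized key), and the in-memory `written` list (standing in for the file writer) are all assumptions. It drains with `pollFirstEntry()`, which a later commit in this PR also adopts, so the entry with the smallest key leaves the buffer first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical, simplified stand-in for the continuous-sort buffer; not the PR's class. */
class ContinuousSortBufferSketch {

  /** Composite key: the sort key first, then an insertion sequence as tiebreaker. */
  static final class Key {
    final String sortKey;
    final long sequence;

    Key(String sortKey, long sequence) {
      this.sortKey = sortKey;
      this.sequence = sequence;
    }
  }

  private final TreeMap<Key, String> sorted = new TreeMap<>(
      Comparator.comparing((Key k) -> k.sortKey).thenComparingLong(k -> k.sequence));
  private final List<String> written = new ArrayList<>(); // stands in for the file writer
  private final int maxCapacity;
  private final int drainSize;
  private long sequence = 0L;

  ContinuousSortBufferSketch(int maxCapacity, int drainSize) {
    if (maxCapacity <= 0 || drainSize <= 0) {
      throw new IllegalArgumentException("maxCapacity and drainSize must be positive");
    }
    this.maxCapacity = maxCapacity;
    this.drainSize = drainSize;
  }

  /** Drains a small batch if the buffer is full, then inserts in O(log n). */
  void add(String sortKey, String record) {
    if (sorted.size() >= maxCapacity) {
      drain(drainSize);
    }
    sorted.put(new Key(sortKey, sequence++), record);
  }

  /** Checkpoint or end of input: write everything that is still buffered. */
  void drainAll() {
    drain(sorted.size());
  }

  private void drain(int count) {
    for (int i = 0; i < count; i++) {
      Map.Entry<Key, String> first = sorted.pollFirstEntry();
      if (first == null) {
        return; // buffer already empty
      }
      written.add(first.getValue());
    }
  }

  List<String> written() {
    return written;
  }

  public static void main(String[] args) {
    ContinuousSortBufferSketch buffer = new ContinuousSortBufferSketch(2, 1);
    buffer.add("b", "r1");
    buffer.add("a", "r2");
    buffer.add("c", "r3"); // buffer is full, so the smallest key ("a") is drained first
    buffer.drainAll();
    System.out.println(buffer.written()); // prints [r2, r1, r3]
  }
}
```

Because the insertion sequence is part of the key, two records with the same sort key occupy separate entries instead of overwriting each other.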

- Move config validation before super.open() to avoid IllegalStateException when Flink runtime context is not initialized (fixes 5 validation tests)
- Add RecordComparator for full comparison fallback when normalized keys are equal, ensuring correct multi-field sort order (fixes 2 sort tests)
- Store RowData reference in SortKey for RecordComparator access
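The fallback idea in the commit above can be illustrated with a small sketch. Everything here is hypothetical: a 4-byte zero-padded prefix of a String stands in for the normalized key, and `String.compareTo` stands in for the full RecordComparator. The point is only that a fixed-width prefix can tie for values that differ, so a full comparison is needed before falling back to insertion order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch: compare a fixed-width prefix first, the full value only on ties. */
class PrefixThenFullComparatorSketch {
  static final int PREFIX_BYTES = 4; // illustrative width, not a Hudi or Flink constant

  static final class SortKey {
    final byte[] prefix; // zero-padded, truncated "normalized key"
    final String full;   // reference to the full value, kept for the fallback
    final long sequence; // insertion order, the final tiebreaker

    SortKey(String value, long sequence) {
      this.prefix = Arrays.copyOf(value.getBytes(StandardCharsets.UTF_8), PREFIX_BYTES);
      this.full = value;
      this.sequence = sequence;
    }
  }

  static final Comparator<SortKey> COMPARATOR = (a, b) -> {
    for (int i = 0; i < PREFIX_BYTES; i++) {
      int byPrefix = Integer.compare(a.prefix[i] & 0xFF, b.prefix[i] & 0xFF);
      if (byPrefix != 0) {
        return byPrefix; // fast path: the prefixes already differ
      }
    }
    int byFullValue = a.full.compareTo(b.full); // fallback when prefixes tie
    return byFullValue != 0 ? byFullValue : Long.compare(a.sequence, b.sequence);
  };

  public static void main(String[] args) {
    SortKey first = new SortKey("apple1", 0L);
    SortKey second = new SortKey("apple2", 1L);
    // Both prefixes are "appl", so only the fallback can tell the two apart.
    System.out.println(COMPARATOR.compare(first, second) < 0); // prints true
  }
}
```

For ASCII values the byte-wise prefix order agrees with `String.compareTo`, which is the only case this sketch is meant to cover.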

…ject reuse, cleanup
- Replace WRITE_BUFFER_SORT_CONTINUOUS_ENABLED with WRITE_BUFFER_TYPE=CONTINUOUS_SORT
- Add CONTINUOUS_SORT to BufferType enum
- Use AppendWriteFunctions.resolveSortKeys() for sort key resolution with record key fallback
- Copy RowData when Flink object reuse is enabled to prevent mutation
- Move sortedRecords.clear()/insertionSequence reset inside if branches
- Fix stale class name reference in javadoc
- Update test to use new WRITE_BUFFER_TYPE config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
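Why the copy matters under object reuse can be shown without any Flink dependency. In the sketch below, `MutableRow` is a hypothetical stand-in for a RowData instance that the runtime overwrites for each incoming element; buffering the reference instead of a copy leaves every buffered entry pointing at the last value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the object-reuse hazard; no Flink classes involved. */
class ObjectReuseSketch {

  /** Mutable record, standing in for a reused RowData instance. */
  static final class MutableRow {
    String value;

    MutableRow(String value) {
      this.value = value;
    }

    MutableRow copy() {
      return new MutableRow(value);
    }
  }

  /** Buffers each input, optionally copying first, and returns what the buffer holds afterwards. */
  static List<String> bufferAll(String[] inputs, boolean copyBeforeBuffering) {
    MutableRow reused = new MutableRow(null); // the single instance that keeps being overwritten
    List<MutableRow> buffer = new ArrayList<>();
    for (String input : inputs) {
      reused.value = input;
      buffer.add(copyBeforeBuffering ? reused.copy() : reused);
    }
    List<String> seen = new ArrayList<>();
    for (MutableRow row : buffer) {
      seen.add(row.value);
    }
    return seen;
  }

  public static void main(String[] args) {
    String[] inputs = {"a", "b", "c"};
    System.out.println(bufferAll(inputs, false)); // prints [c, c, c]
    System.out.println(bufferAll(inputs, true));  // prints [a, b, c]
  }
}
```

In a sorted map the damage is worse than stale values, since a key that changes after insertion no longer sits where the tree expects it.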

…tions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…it operations" This reverts commit 847773a.

…e test
- Fix isObjectReuseEnabled() compilation on Flink 1.17/1.18 by using getExecutionConfig().isObjectReuseEnabled() instead of RuntimeContext method
- Remove unused imports (java.util.Arrays, java.util.stream.Collectors) to fix checkstyle violations
- Move super.snapshotState() and super.endInput() out of finally clauses to match other append write functions (avoid sending write meta event on error)
- Optimize TreeMap comparator: store pre-wrapped MemorySegment in SortKey to avoid per-comparison allocation and reduce GC pressure
- Filter empty strings after trim in resolveSortKeys() to properly validate whitespace-only and comma-only sort key configs
- Add testObjectReuseEnabled test that verifies records are correctly copied when Flink object reuse is enabled (prevents TreeMap corruption)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
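The sort-key hardening mentioned in the commit above (trim, drop empty tokens, reject an empty result) might look roughly like the following. The method name `parseSortKeys` and the exception type are assumptions for illustration; the actual `resolveSortKeys()` also has a record-key fallback that is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of comma-separated sort key parsing with empty-token filtering. */
class SortKeyParsingSketch {

  /** Splits on commas, trims each token, drops empty ones, and rejects an empty result. */
  static List<String> parseSortKeys(String raw) {
    List<String> keys = new ArrayList<>();
    if (raw != null) {
      for (String token : raw.split(",")) {
        String trimmed = token.trim();
        if (!trimmed.isEmpty()) {
          keys.add(trimmed);
        }
      }
    }
    if (keys.isEmpty()) {
      throw new IllegalArgumentException("Sort keys must not be empty: '" + raw + "'");
    }
    return keys;
  }

  public static void main(String[] args) {
    System.out.println(parseSortKeys(" uuid , ,ts,")); // prints [uuid, ts]
  }
}
```

Inputs such as `" , ,"` or `","` produce no usable tokens and are rejected, which is the whitespace-only and comma-only case the commit calls out.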

…useEnabled

getExecutionConfig() was removed from RuntimeContext in Flink 2.0+. Add isObjectReuseEnabled() to version-specific RuntimeContextUtils adapters:
- Flink 1.17-1.20: uses getExecutionConfig().isObjectReuseEnabled()
- Flink 2.0-2.1: uses getGlobalJobParameters() with PipelineOptions.OBJECT_REUSE

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…rectly for Flink 2.0+/2.1+ Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Add TotalSizeTracer to bound buffer memory footprint using write.task.max.size, triggering drain when memory limit is exceeded in addition to record count limit
- Replace Iterator-based drain loop with TreeMap.pollFirstEntry() to avoid iterator allocation per drain call (matters when drainSize=1)
- Estimate record size once via ObjectSizeCalculator and track cumulative memory via TotalSizeTracer.trace()/countDown()/reset()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
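The trace/countDown/reset bookkeeping named in the commit above can be sketched as a small counter. `SizeTracerSketch` is a hypothetical class written for this illustration and is not Hudi's TotalSizeTracer; in particular, returning a boolean from `trace` and clamping at zero in `countDown` are assumptions.

```java
/** Hypothetical byte counter illustrating trace/countDown/reset; not Hudi's TotalSizeTracer. */
class SizeTracerSketch {
  private final long maxBufferSize;
  private long bufferSize = 0L;

  SizeTracerSketch(long maxBufferSizeBytes) {
    if (maxBufferSizeBytes <= 0) {
      throw new IllegalArgumentException("maxBufferSizeBytes must be positive");
    }
    this.maxBufferSize = maxBufferSizeBytes;
  }

  /** Adds the estimated bytes of one buffered record; true means the limit is now exceeded. */
  boolean trace(long bytes) {
    bufferSize += bytes;
    return bufferSize > maxBufferSize;
  }

  /** Subtracts the estimated bytes of drained records, never going below zero. */
  void countDown(long bytes) {
    bufferSize = Math.max(0L, bufferSize - bytes);
  }

  /** Called after a full drain, when nothing is buffered any more. */
  void reset() {
    bufferSize = 0L;
  }

  long bufferSize() {
    return bufferSize;
  }

  public static void main(String[] args) {
    SizeTracerSketch tracer = new SizeTracerSketch(250L);
    tracer.trace(100L);
    tracer.trace(100L);
    System.out.println(tracer.trace(100L)); // prints true (300 > 250)
    tracer.countDown(100L);
    System.out.println(tracer.bufferSize()); // prints 200
  }
}
```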

coderabbitai · 2026-04-16T18:26:13Z

📝 Walkthrough

This PR introduces continuous sorting functionality for Hudi Flink datasource. A new TreeMap-based buffer type (CONTINUOUS_SORT) enables in-memory sorted write operations with configurable incremental draining. The implementation includes new configuration options, a complete write function with lifecycle management, comprehensive integration tests, and utility enhancements for object reuse support across Flink versions.

Changes

Cohort / File(s)	Summary
Configuration & Buffer Type `hudi-flink/.../FlinkOptions.java`, `hudi-flink/.../BufferType.java`	Added `WRITE_BUFFER_SORT_CONTINUOUS_DRAIN_SIZE` config option (default 1) and introduced `CONTINUOUS_SORT` enum constant with TreeMap-based sorting description.
Core Implementation `hudi-flink/.../AppendWriteFunctionWithContinuousSort.java`	New write function extending `AppendWriteFunction` with TreeMap-based continuous sorting, normalized key computation, configurable drain batching, and Flink lifecycle integration (open, processElement, snapshotState, endInput, close).
Write Function Factory `hudi-flink/.../AppendWriteFunctions.java`	Updated `create()` to recognize `CONTINUOUS_SORT` buffer type and instantiate `AppendWriteFunctionWithContinuousSort`; enhanced `resolveSortKeys()` to trim segments and validate non-empty results.
Test Infrastructure & Mocking `hudi-flink/.../InsertFunctionWrapper.java`, `hudi-flink/.../MockStreamingRuntimeContext.java`, `hudi-flink/.../TestWriteBase.java`, `hudi-flink/.../TestData.java`	Extended constructors and methods to support `ExecutionConfig` propagation for object reuse testing; added `preparePipelineWithObjectReuse()` and `getWritePipelineWithObjectReuse()` helper methods.
Integration Test Suite `hudi-flink/.../ITTestAppendWriteFunctionWithContinuousSort.java`	Comprehensive test class with 15+ methods validating buffer flushing on record/size/memory limits, checkpoint behavior, drain configuration, sorting correctness, error handling, and object reuse scenarios.
Flink Version Utilities `hudi-flink*/.../RuntimeContextUtils.java` (1.17.x, 1.18.x, 1.19.x, 1.20.x, 2.0.x, 2.1.x)	Added `isObjectReuseEnabled(RuntimeContext)` utility method across all Flink version modules with version-appropriate delegation patterns.

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant Processor as AppendWriteFunctionWithContinuousSort
    participant TreeMap as TreeMap Buffer
    participant Writer as Underlying Writer
    participant State as State Backend

    Client->>Processor: processElement(RowData)
    Note over Processor: Compute normalized key
    Processor->>TreeMap: insert(SortKey, record)
    Note over Processor: Check buffer capacity/memory
    alt Buffer exceeds limit
        Processor->>TreeMap: pollFirstEntry() x drain_size
        TreeMap-->>Processor: oldest records
        Processor->>Writer: flush records
    end
    
    Client->>Processor: snapshotState() / endInput()
    Processor->>TreeMap: pollFirstEntry() until empty
    TreeMap-->>Processor: all buffered records
    Processor->>Writer: flush remaining
    Processor->>State: save checkpoint state
    
    Client->>Processor: close()
    Processor->>Writer: delegate to super.close()

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 TreeMaps in streams, sorting as they flow,
Continuous drains keep the buffer low,
In Flink we hop, with normalized keys so neat,
Memory tracked, sorted records complete!
Object reuse tested—a rabbit's feat! 🌟

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.41% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and concisely describes the main change: implementing continuous sorting for append write operations in Flink, which directly aligns with the substantive changes across all modified files.


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/append/ITTestAppendWriteFunctionWithContinuousSort.java (1)
64-81: Note: before method receives unused @TempDir parameter.

The @BeforeEach method signature includes @TempDir File tempDir but uses tempFile instead. This is likely intentional (using the class field), but the parameter is unused.
♻️ Remove unused parameter
   `@BeforeEach`
-  public void before(`@TempDir` File tempDir) throws Exception {
+  public void before() throws Exception {
     this.conf = TestConfigurations.getDefaultConf(tempFile.getAbsolutePath());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/append/ITTestAppendWriteFunctionWithContinuousSort.java`
around lines 64 - 81, The before(`@TempDir` File tempDir) method declares a
tempDir parameter but never uses it (the method uses the class field tempFile
instead); either remove the unused `@TempDir` parameter from the before method
signature or replace usages of tempFile with the provided tempDir so the
temporary directory is actually used; update the method signature and any
references in the before method (method name: before, symbols: `@TempDir`,
tempDir, tempFile) accordingly to eliminate the unused-parameter warning and
ensure the test uses the intended temp directory.
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunctionWithContinuousSort.java (3)
246-264: Minor: Redundant sortedRecords.clear() after draining all records.

Line 251 drains all records using drainRecords(sortedRecords.size()), and since drainRecords uses pollFirstEntry(), the map will already be empty. The subsequent sortedRecords.clear() on line 252 is redundant.

This is not harmful, just unnecessary.
♻️ Optional cleanup
       if (!sortedRecords.isEmpty()) {
         LOG.info("Snapshot: draining {} remaining records", sortedRecords.size());
         drainRecords(sortedRecords.size());
-        sortedRecords.clear();
         insertionSequence = 0L;
         sizeTracer.reset();
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunctionWithContinuousSort.java`
around lines 246 - 264, In snapshotState(), remove the redundant call to
sortedRecords.clear() after drainRecords(sortedRecords.size()) since
drainRecords uses pollFirstEntry() and already empties sortedRecords; keep the
rest of the logic (logging, resetting insertionSequence and sizeTracer, and the
IOException handling) intact and ensure snapshotState() still calls
super.snapshotState() at the end.
266-282: Same redundancy in endInput() as noted for snapshotState().

Line 273 sortedRecords.clear() is redundant after drainRecords(sortedRecords.size()).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunctionWithContinuousSort.java`
around lines 266 - 282, In endInput(), remove the redundant
sortedRecords.clear() call after drainRecords(sortedRecords.size()) since
drainRecords already empties the buffer; keep the subsequent state resets
(insertionSequence = 0L and sizeTracer.reset()) and preserve the IOException
handling that wraps drainRecords into a HoodieIOException; update the method to
rely on drainRecords to clear sortedRecords and eliminate the unnecessary clear
invocation.
209-212: Consider variable record sizes for memory tracking.

estimatedRecordSize is captured from the first record only. For schemas with variable-length fields (e.g., strings), this estimate may become increasingly inaccurate over time, potentially leading to premature or delayed draining.

Consider periodically re-sampling or using a rolling average for more accurate memory tracking. This is a minor concern if record sizes are relatively uniform.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunctionWithContinuousSort.java`
around lines 209 - 212, The current memory tracking in
AppendWriteFunctionWithContinuousSort uses a single fixed estimatedRecordSize
(set via ObjectSizeCalculator.getObjectSize(data) on the first record) which can
be inaccurate for variable-length records; modify the logic that sets and uses
estimatedRecordSize (and the sizeTracer.trace call) to periodically re-sample or
compute a rolling/EMA average of record sizes (e.g., every N records or using an
exponential moving average that incorporates
ObjectSizeCalculator.getObjectSize(data) for sampled records) and use that
updated estimate in sizeTracer.trace so memory tracking adapts to changing
record sizes while keeping updates lightweight and thread-safe.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/append/ITTestAppendWriteFunctionWithContinuousSort.java`:
- Around line 61-62: The class ITTestAppendWriteFunctionWithContinuousSort
currently redeclares the JUnit TempDir field as "protected File tempFile",
shadowing the same "@TempDir protected File tempFile" declared in the superclass
TestWriteBase; remove the local declaration of tempFile from
ITTestAppendWriteFunctionWithContinuousSort so the test uses the inherited
TestWriteBase.tempFile, and update any local references in that class to rely on
the inherited field name (no other changes required).



ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9528ddcc-febe-47a1-91cf-8782b99a385e

📥 Commits

Reviewing files that changed from the base of the PR and between 35e2bbf and 1caf9fa.

📒 Files selected for processing (15)

hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunctionWithContinuousSort.java
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunctions.java
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/buffer/BufferType.java
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/append/ITTestAppendWriteFunctionWithContinuousSort.java
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/InsertFunctionWrapper.java
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/MockStreamingRuntimeContext.java
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/utils/TestData.java
hudi-flink-datasource/hudi-flink1.17.x/src/main/java/org/apache/hudi/utils/RuntimeContextUtils.java
hudi-flink-datasource/hudi-flink1.18.x/src/main/java/org/apache/hudi/utils/RuntimeContextUtils.java
hudi-flink-datasource/hudi-flink1.19.x/src/main/java/org/apache/hudi/utils/RuntimeContextUtils.java
hudi-flink-datasource/hudi-flink1.20.x/src/main/java/org/apache/hudi/utils/RuntimeContextUtils.java
hudi-flink-datasource/hudi-flink2.0.x/src/main/java/org/apache/hudi/utils/RuntimeContextUtils.java
hudi-flink-datasource/hudi-flink2.1.x/src/main/java/org/apache/hudi/utils/RuntimeContextUtils.java

coderabbitai · 2026-04-16T18:32:18Z

+  @TempDir
+  protected File tempFile;


⚠️ Potential issue | 🟡 Minor

Shadowed @TempDir tempFile field.

This class declares @TempDir protected File tempFile (line 61-62), but TestWriteBase already declares @TempDir protected File tempFile (inherited). This shadowing may cause confusion and could lead to unexpected behavior if both are initialized differently by JUnit.

Consider removing the local declaration since the inherited field should suffice.
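The language-level effect behind this finding can be shown in isolation. The sketch below uses hypothetical `Base` and `Derived` classes with String fields; it demonstrates only Java field hiding, and makes no claim about how JUnit populates the two `@TempDir` fields.

```java
/** Hypothetical illustration of field hiding: a subclass field with the same name as a superclass field. */
class FieldHidingSketch {

  static class Base {
    protected String tempFile = "base";

    String seenByBase() {
      return tempFile; // always resolves to Base.tempFile
    }
  }

  static class Derived extends Base {
    protected String tempFile = "derived"; // hides Base.tempFile, does not replace it

    String seenByDerived() {
      return tempFile; // resolves to Derived.tempFile
    }
  }

  public static void main(String[] args) {
    Derived derived = new Derived();
    System.out.println(derived.seenByBase() + " " + derived.seenByDerived()); // prints base derived
  }
}
```

Fields are resolved by the static type of the code that reads them, so helper methods in the superclass and test methods in the subclass can end up looking at two different values.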

🔧 Proposed fix

 public class ITTestAppendWriteFunctionWithContinuousSort extends TestWriteBase {
   private Configuration conf;
   private RowType rowType;
-
-  `@TempDir`
-  protected File tempFile;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/append/ITTestAppendWriteFunctionWithContinuousSort.java` around lines 61 - 62, The class ITTestAppendWriteFunctionWithContinuousSort currently redeclares the JUnit TempDir field as "protected File tempFile", shadowing the same "@TempDir protected File tempFile" declared in the superclass TestWriteBase; remove the local declaration of tempFile from ITTestAppendWriteFunctionWithContinuousSort so the test uses the inherited TestWriteBase.tempFile, and update any local references in that class to rely on the inherited field name (no other changes required).

greptile-apps · 2026-04-16T18:33:26Z

Greptile Summary

This PR introduces a new CONTINUOUS_SORT buffer mode (AppendWriteFunctionWithContinuousSort) for the Flink append write pipeline. Instead of the batch-sort approaches used by DISRUPTOR/BIM, records are maintained in a TreeMap keyed by a normalized sort key, which provides O(log n) inserts and incremental synchronous draining when the buffer reaches capacity. The goal is to eliminate sort-spike latency at the cost of not guaranteeing globally sorted output across drain boundaries.

Key changes:

New AppendWriteFunctionWithContinuousSort class with TreeMap-based sorting, object-reuse safety, and memory-bounded draining via TotalSizeTracer.
New WRITE_BUFFER_SORT_CONTINUOUS_DRAIN_SIZE config option to control how many records are drained per trigger.
CONTINUOUS_SORT added to the BufferType enum and wired into AppendWriteFunctions.create().
isObjectReuseEnabled() helper added to all six RuntimeContextUtils implementations (Flink 1.17–2.1).
Test infrastructure extended with object-reuse simulation support.
resolveSortKeys() hardened to filter empty tokens after splitting on commas.

Confidence Score: 3/5

Functionally correct for happy-path usage, but the fixed estimatedRecordSize from the first record is a production reliability risk for variable-length workloads.

The feature logic is sound — object reuse is handled, validation is thorough, and the cross-version RuntimeContextUtils additions are correct. However, computing estimatedRecordSize once from the first record and applying it uniformly to all records means the memory-based flush trigger can significantly underestimate actual buffer size for variable-length fields, risking OOM. The post-drain exception guard is also unreachable dead code. These issues warrant targeted fixes before merge.

AppendWriteFunctionWithContinuousSort.java — specifically the estimatedRecordSize tracking logic (lines 209-212) and the unreachable guard (lines 183-193).

Important Files Changed

Filename	Overview
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunctionWithContinuousSort.java	New core class: TreeMap-based continuous-sort sink. Key issues: fixed `estimatedRecordSize` computed from first record only (P1), unreachable post-drain exception guard (P2), redundant `sortedRecords.clear()` (P2), and misleading javadoc.
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java	Adds `WRITE_BUFFER_SORT_CONTINUOUS_DRAIN_SIZE` option and `CONTINUOUS_SORT` description to `WRITE_BUFFER_TYPE`. Also removes several non-upstream options and changes `INDEX_RLI_CACHE_SIZE` default from 256 to 100 MB.
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunctions.java	Wires `CONTINUOUS_SORT` into the factory; hardens `resolveSortKeys` to filter empty tokens after comma-split.
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/buffer/BufferType.java	Adds `CONTINUOUS_SORT` enum value with appropriate description.
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/append/ITTestAppendWriteFunctionWithContinuousSort.java	Comprehensive integration tests covering flush triggers, sort correctness, drain configuration, object reuse, and invalid configuration validation.
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java	Adds `preparePipelineWithObjectReuse` harness method; changes `checkLastPendingInstantCompleted` from `checkpointComplete(4)` to `checkpointComplete(3)`.
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/utils/TestData.java	Removes the `numEagerFlush` overload of `writeData`, simplifying to a single `handleEventFromOperator` call; adds `getWritePipelineWithObjectReuse` factory.
hudi-flink-datasource/hudi-flink2.0.x/src/main/java/org/apache/hudi/utils/RuntimeContextUtils.java	Adds `isObjectReuseEnabled` using `runtimeContext.isObjectReuseEnabled()` — correctly uses the Flink 2.0+ direct `RuntimeContext` API instead of going through `ExecutionConfig`.

Sequence Diagram

sequenceDiagram
    participant Source
    participant AWFCS as AppendWriteFunctionWithContinuousSort
    participant TM as TreeMap (SortKey → RowData)
    participant ST as TotalSizeTracer
    participant WH as BulkInsertWriterHelper
    participant Coord as StreamWriteOperatorCoordinator

    Source->>AWFCS: processElement(record)
    AWFCS->>TM: size() >= maxCapacity OR bufferSize > maxBufferSize?
    alt Buffer full
        AWFCS->>WH: initWriterHelper() if null
        loop drainSize times
            AWFCS->>TM: pollFirstEntry() [smallest sort key]
            AWFCS->>WH: write(rowData)
        end
        AWFCS->>ST: countDown(drained * estimatedRecordSize)
    end
    AWFCS->>TM: put(SortKey, rowData) [O(log n)]
    AWFCS->>ST: trace(estimatedRecordSize)

    Note over Source,Coord: On Checkpoint

    Source->>AWFCS: snapshotState()
    loop all remaining records
        AWFCS->>TM: pollFirstEntry()
        AWFCS->>WH: write(rowData)
    end
    AWFCS->>ST: reset()
    AWFCS->>AWFCS: super.snapshotState() → flushData()
    AWFCS->>Coord: WriteMetadataEvent (write statuses)
    Coord->>Coord: commitInstant()

_{Reviews (1): Last reviewed commit: "Address review comments: add memory boun..." | Re-trigger Greptile}

greptile-apps · 2026-04-16T18:33:30Z

+    if (estimatedRecordSize == 0) {
+      estimatedRecordSize = ObjectSizeCalculator.getObjectSize(data);
+    }
+    sizeTracer.trace(estimatedRecordSize);


Memory estimate fixed at first-record size

estimatedRecordSize is computed once from the very first record and never updated. For workloads with variable-length fields (e.g., long VarChar strings or nested structures), subsequent records can be orders of magnitude larger than the first. Because sizeTracer.trace(estimatedRecordSize) and sizeTracer.countDown(drained * estimatedRecordSize) both rely on this single constant, the in-memory buffer can grow far beyond write.task.max.size before the memory-limit drain fires, potentially causing task-manager OOM in production.

Consider computing the size per record (or updating a running weighted average) to keep the memory estimate accurate:

// Per-record size tracking instead of fixed estimate
long recordSize = ObjectSizeCalculator.getObjectSize(data);
sortedRecords.put(key, data);
sizeTracer.trace(recordSize);
totalInserted++;

If per-record calculation is too expensive, at minimum update the estimate periodically (e.g., every 1000 records) so drift is bounded.
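The periodic-update option could be sketched as a sampled exponential moving average. This is an illustration under stated assumptions, not a proposed patch: the class name, the `sampleEvery` and `alpha` parameters, and the pluggable `sizer` function (standing in for an expensive call such as ObjectSizeCalculator.getObjectSize) are all hypothetical.

```java
import java.util.function.ToLongFunction;

/** Hypothetical sampled EMA of record sizes; measures only every sampleEvery-th record. */
class SampledSizeEstimatorSketch<T> {
  private final ToLongFunction<T> sizer; // the expensive exact measurement
  private final int sampleEvery;
  private final double alpha;            // weight given to each new sample
  private long seen = 0L;
  private double estimate = 0.0;

  SampledSizeEstimatorSketch(ToLongFunction<T> sizer, int sampleEvery, double alpha) {
    if (sampleEvery <= 0 || alpha <= 0.0 || alpha > 1.0) {
      throw new IllegalArgumentException("sampleEvery must be > 0 and alpha in (0, 1]");
    }
    this.sizer = sizer;
    this.sampleEvery = sampleEvery;
    this.alpha = alpha;
  }

  /** Returns the current size estimate, re-measuring on sampled records only. */
  long estimateFor(T record) {
    if (seen % sampleEvery == 0) {
      long measured = sizer.applyAsLong(record);
      estimate = (seen == 0) ? measured : alpha * measured + (1.0 - alpha) * estimate;
    }
    seen++;
    return Math.round(estimate);
  }

  public static void main(String[] args) {
    SampledSizeEstimatorSketch<String> estimator =
        new SampledSizeEstimatorSketch<String>(String::length, 2, 0.5);
    System.out.println(estimator.estimateFor("aaaa"));       // prints 4 (first record is measured)
    System.out.println(estimator.estimateFor("bbbbbbbbbb")); // prints 4 (not sampled)
    System.out.println(estimator.estimateFor("cccccccc"));   // prints 6 (0.5 * 8 + 0.5 * 4)
  }
}
```

One design consequence: once the estimate changes over time, the amount subtracted on drain has to match what was added for those same records (for example by storing the charged size alongside each entry), otherwise the tracked total drifts until the next full reset.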

greptile-apps · 2026-04-16T18:33:31Z

+    if (sortedRecords.size() >= maxCapacity || sizeTracer.bufferSize > sizeTracer.maxBufferSize) {
+      drainRecords(drainSize);
+
+      // Verify there's space after draining
+      if (sortedRecords.size() >= maxCapacity) {
+        throw new HoodieException(
+            String.format("Buffer cannot accept record after draining. "
+                + "Buffer size: %d, maxCapacity: %d, drainSize: %d",
+                sortedRecords.size(), maxCapacity, drainSize));
+      }
+    }


Post-drain exception guard is unreachable

The HoodieException thrown when sortedRecords.size() >= maxCapacity after draining is logically dead code.

If the drain was triggered by sortedRecords.size() >= maxCapacity, then after draining drainSize ≥ 1 records the size becomes maxCapacity - Math.min(drainSize, maxCapacity) < maxCapacity, so the guard can never fire.

If the drain was triggered solely by the memory limit (sizeTracer.bufferSize > sizeTracer.maxBufferSize) with a count below maxCapacity, draining reduces the count further — it cannot reach maxCapacity.

This guard should either be removed or replaced with a memory-limit check if the intent is to detect the case where a single drain cycle is still insufficient to bring the buffer within bounds.
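The reachability argument can be checked mechanically for small values. The sketch below is a brute-force illustration, assuming that a drain removes min(drainSize, size) entries and that the buffer never holds more than maxCapacity entries when the check runs; both method names are hypothetical.

```java
/** Hypothetical brute-force check of the post-drain guard's reachability. */
class PostDrainGuardSketch {

  /** Entries left after draining up to drainSize entries from a buffer holding size entries. */
  static int sizeAfterDrain(int size, int drainSize) {
    return size - Math.min(drainSize, size);
  }

  /** True if any combination within the given limits would trip "size >= maxCapacity" after a drain. */
  static boolean guardCanFire(int maxCapacityLimit, int drainSizeLimit) {
    for (int maxCapacity = 1; maxCapacity <= maxCapacityLimit; maxCapacity++) {
      for (int drainSize = 1; drainSize <= drainSizeLimit; drainSize++) {
        // size never exceeds maxCapacity because inserts happen only after the check
        for (int size = 0; size <= maxCapacity; size++) {
          if (sizeAfterDrain(size, drainSize) >= maxCapacity) {
            return true;
          }
        }
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(guardCanFire(50, 50)); // prints false
  }
}
```

The search includes size = 0, a state in which no drain would actually be triggered, so it covers a superset of the reachable cases.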

greptile-apps · 2026-04-16T18:33:32Z

+      if (!sortedRecords.isEmpty()) {
+        LOG.info("Snapshot: draining {} remaining records", sortedRecords.size());
+        drainRecords(sortedRecords.size());
+        sortedRecords.clear();
+        insertionSequence = 0L;
+        sizeTracer.reset();
+      }


sortedRecords.clear() is redundant after drainRecords(sortedRecords.size())

drainRecords uses pollFirstEntry() in a loop, which removes each entry from the map as it goes. After draining the full sortedRecords.size() count, the map is already empty. The explicit sortedRecords.clear() call immediately after is a no-op. The same redundancy exists in endInput() (lines 270-276).

Suggested change

       if (!sortedRecords.isEmpty()) {
         LOG.info("Snapshot: draining {} remaining records", sortedRecords.size());
         drainRecords(sortedRecords.size());
-        sortedRecords.clear();
         insertionSequence = 0L;
         sizeTracer.reset();
       }

greptile-apps · 2026-04-16T18:33:34Z

+ * <p>Strategy:
+ * <ol>
+ *   <li>Records are inserted in sorted order (TreeMap)</li>
+ *   <li>When buffer reaches max capacity, oldest record(s) are drained synchronously</li>
+ *   <li>Drain size is configurable to balance latency vs. throughput vs compression ratio</li>


Javadoc says "oldest" but implementation drains "smallest sort key"

The class-level javadoc states:

When buffer reaches max capacity, oldest record(s) are drained synchronously

However, the implementation calls sortedRecords.pollFirstEntry(), which removes the record with the smallest sort key — not the oldest by insertion time. Insertion order is only the final tiebreaker for key-equal records. Draining by smallest key is the correct behavior for incremental sorted output, but the documentation should say "smallest by sort key" to avoid confusion.
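The distinction is easy to see with a plain `java.util.TreeMap`. In this minimal sketch the helper name `firstDrained` is hypothetical, and keys are bare Strings without the insertion-sequence tiebreaker, so duplicate keys would overwrite each other here.

```java
import java.util.TreeMap;

/** Hypothetical illustration: pollFirstEntry() returns the smallest key, not the oldest insert. */
class DrainOrderSketch {

  /** Inserts keys in arrival order and returns the key that pollFirstEntry() removes first. */
  static String firstDrained(String... arrivalOrder) {
    if (arrivalOrder.length == 0) {
      throw new IllegalArgumentException("at least one key is required");
    }
    TreeMap<String, Integer> buffer = new TreeMap<>();
    int arrival = 0;
    for (String key : arrivalOrder) {
      buffer.put(key, arrival++);
    }
    return buffer.pollFirstEntry().getKey();
  }

  public static void main(String[] args) {
    // "m" arrived first, but "a" has the smallest key and is drained first.
    System.out.println(firstDrained("m", "z", "a")); // prints a
  }
}
```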

greptile-apps · 2026-04-16T18:33:36Z

+  @Override
+  public void close() throws Exception {
+    try {
+      LOG.info("AppendWriteFunctionWithContinuousSort closed: totalInserted={}, totalDrained={}, operations={}",
+          totalInserted, totalDrainedRecords, totalDrainOperations);
+
+    } finally {
+      super.close();
+    }
+  }


Silent data loss if close() is called without a prior flush

If the operator is closed unexpectedly (e.g., due to task cancellation before snapshotState() or endInput() is called), any records still in sortedRecords are silently dropped. While in streaming mode this is mitigated by checkpoint replay, a warning log when sortedRecords is non-empty at close would make such situations visible during debugging:

@Override
public void close() throws Exception {
  try {
    if (sortedRecords != null && !sortedRecords.isEmpty()) {
      LOG.warn("AppendWriteFunctionWithContinuousSort closed with {} unflushed records - "
          + "these records will be lost. totalInserted={}, totalDrained={}",
          sortedRecords.size(), totalInserted, totalDrainedRecords);
    } else {
      LOG.info("AppendWriteFunctionWithContinuousSort closed: totalInserted={}, totalDrained={}, operations={}",
          totalInserted, totalDrainedRecords, totalDrainOperations);
    }
  } finally {
    super.close();
  }
}

sharath1709 and others added 9 commits February 4, 2026 13:42

fix(concurrency): detect rollback conflicts with ongoing commit opera…

847773a

…tions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Revert "fix(concurrency): detect rollback conflicts with ongoing comm…

d1d2d4f

…it operations" This reverts commit 847773a.

Address review comments: use RuntimeContext.isObjectReuseEnabled() di…

1010cab

…rectly for Flink 2.0+/2.1+ Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed Apr 16, 2026

yihua mentioned this pull request Apr 16, 2026

feat(flink): Implement continuous sorting feature for append write apache/hudi#18083

Merged

3 tasks

greptile-apps Bot reviewed Apr 16, 2026

yihua wants to merge 9 commits into master from oss-18083
