
[shell-operator] feat/compaction and queue refactor #789

Merged
ldmonster merged 2 commits into main from feat/compaction-and-queue-refactor on Aug 8, 2025
Conversation

Contributor

@timmilesdw timmilesdw commented Aug 5, 2025

Overview

This PR introduces a new linked list-based task queue implementation (TaskQueue) alongside the existing slice-based implementation (TaskQueueSlice). The new implementation provides consistent O(1) performance for all queue operations and includes a new compaction function that intelligently merges HookRun tasks for the same hook, significantly reducing queue size and improving processing efficiency.

What this PR does / why we need it

Problem Statement:
The current slice-based queue implementation (TaskQueueSlice) suffers from catastrophic performance degradation for certain operations:

  • AddFirst: O(n) complexity causing 160-200x slowdown compared to linked list
  • GetByID: O(n) complexity with 6-53x performance degradation as queue size grows
  • Missing compaction functionality: No mechanism to merge duplicate or related tasks, leading to queue bloat and inefficient processing

Solution:
This PR introduces a new TaskQueue implementation based on Go's container/list with the following improvements:

Key Features:

  • Consistent O(1) performance for all operations (AddFirst, AddLast, RemoveFirst, GetFirst, GetByID)
  • NEW: compaction method for intelligent task merging
  • Optimized compaction algorithm with object pooling to reduce GC pressure
  • Memory-efficient design using pre-allocated buffers and sync.Pool
  • ID-based indexing for O(1) task lookup
  • Stable performance scaling regardless of queue size

Compaction Method:

This PR introduces a brand new compaction method that addresses a critical gap in the existing queue implementation. This method intelligently merges HookRun tasks for the same hook, significantly reducing queue size and improving processing efficiency.

Why Compaction is Essential:

  • Queue bloat prevention: Merges multiple HookRun tasks for the same hook into single entries
  • Processing efficiency: Fewer tasks to process means faster queue iteration
  • Memory optimization: Reduces memory footprint by eliminating redundant tasks
  • Performance improvement: Smaller queues lead to faster AddFirst/RemoveFirst operations

Compaction Algorithm Features:

  • O(n) complexity: Single-pass algorithm through the entire queue
  • Object pooling: Uses sync.Pool for compaction groups, context slices, and monitor ID slices
  • Memory efficiency: 31x less memory consumption compared to the slice implementation
  • Intelligent merging: Combines multiple HookRun tasks for the same hook into a single task
  • Processing state awareness: Skips tasks that are currently being processed

Compaction Process:

  1. Group identification: Groups HookRun tasks by hook name
  2. Context aggregation: Combines binding contexts from all tasks in a group
  3. Monitor ID merging: Consolidates monitor IDs from merged tasks
  4. Task consolidation: Creates a single representative task with combined metadata
  5. Queue reconstruction: Rebuilds the queue with merged tasks while preserving order

Performance Improvements (benchmark results on Apple M3):

| Operation    | Queue Size | Slice      | Linked List | Improvement         |
|--------------|------------|------------|-------------|---------------------|
| AddFirst     | 100        | 87,233 ns  | 540 ns      | 160x faster         |
| AddFirst     | 1000       | 99,898 ns  | 501 ns      | 200x faster         |
| GetByID      | 100        | 123 ns     | 18.5 ns     | 6.7x faster         |
| GetByID      | 1000       | 1,149 ns   | 21.4 ns     | 53.6x faster        |
| Memory Usage | 1000       | 331,368 B  | 10,629 B    | 31x more efficient  |

Compaction-Specific Benefits:

  • Queue size reduction: Merges multiple tasks for the same hook into single entries
  • Improved processing efficiency: Fewer tasks to process means faster queue iteration
  • Memory optimization: Object pooling reduces allocation overhead by 2-3x
  • GC pressure reduction: Reuse of compaction objects minimizes garbage collection
  • Predictable performance: O(n) complexity regardless of queue size

Trade-offs:

  • AddLast is 1.6x slower (513 ns vs 313 ns) but remains in the sub-microsecond range
  • Slightly higher memory overhead due to pointer structures
  • More complex implementation but provides architectural consistency

Why this matters:

  1. Production stability: Eliminates risk of performance spikes during high-priority task insertion
  2. Scalability: Consistent performance regardless of queue size
  3. Debugging efficiency: Fast task lookup by ID for monitoring and troubleshooting
  4. Memory efficiency: Reduced GC pressure through object pooling
  5. NEW: Queue optimization: Intelligent compaction prevents queue bloat
  6. Future-proofing: Extensible architecture for additional queue operations

Implementation Details:

  • Uses container/list for O(1) list operations
  • Implements idIndex map for O(1) task lookup
  • NEW: Implements compaction with sophisticated merging logic
  • Employs sync.Pool for compaction groups, context slices, and monitor ID slices
  • Pre-allocates buffers to minimize runtime allocations
  • Maintains backward compatibility with existing API

Migration Path:
The existing TaskQueueSlice implementation is preserved mainly for testing and benchmarking.


If we add compaction to the slice-based implementation

For completeness, I have also implemented the same compaction logic for the slice-based queue. This allows for a direct, apples-to-apples comparison of compaction performance and memory usage between the two data structures.

Benchmark results for compaction (Apple M3, Go 1.24.0):

| Implementation          | Queue Size | Compaction Time (ns/op) | Memory Used (B/op) | Allocations/op |
|-------------------------|------------|-------------------------|--------------------|----------------|
| Slice (with compaction) | 10         | 3,110                   | 4,616              | 35             |
| Slice (with compaction) | 100        | 15,791                  | 32,825             | 58             |
| Slice (with compaction) | 500        | 73,043                  | 164,796            | 69             |
| Slice (with compaction) | 1000       | 142,857                 | 331,368            | 76             |
| Linked List             | 10         | 2,400                   | 1,462              | 16             |
| Linked List             | 100        | 9,438                   | 2,145              | 20             |
| Linked List             | 500        | 43,164                  | 5,219              | 21             |
| Linked List             | 1000       | 78,860                  | 10,629             | 23             |

Key takeaways:

  • Linked list-based queue is consistently faster for compaction, especially as queue size grows.
  • Memory usage and allocations are dramatically lower for the linked list implementation (up to 31x less memory at 1000 elements).
  • While compaction in the slice-based queue is functional, it is much less efficient for large queues due to the need to copy and reallocate slices.

Why this matters:
If you expect your queue to grow large or require frequent compaction (e.g., in high-throughput or long-running systems), the linked list implementation will provide much more predictable and efficient performance, both in terms of speed and memory usage.


Special notes for your reviewer

Performance Considerations:

  • The 1.6x slowdown in AddLast is acceptable given the microsecond-scale absolute values
  • Object pooling significantly reduces GC pressure in high-throughput scenarios
  • Memory efficiency improvements offset the pointer overhead
  • NEW: Compaction performance scales linearly with queue size

Backward Compatibility:

  • Existing TaskQueueSlice remains functional
  • API compatibility maintained across both implementations

Benchmark Environment:

  • Platform: Apple M3 (ARM64)
  • Go version: 1.24.0
  • Test scenarios: 100 and 1000 element queues
  • Realistic task patterns including HookRun tasks with metadata
  • NEW: Compaction tests with mixed hook configurations

@timmilesdw timmilesdw force-pushed the feat/compaction-and-queue-refactor branch 6 times, most recently from c078c70 to b5981c4 on August 5, 2025 09:30
@timmilesdw timmilesdw added the enhancement New feature or request label Aug 5, 2025
@timmilesdw timmilesdw self-assigned this Aug 5, 2025
@timmilesdw timmilesdw force-pushed the feat/compaction-and-queue-refactor branch 9 times, most recently from d0c2e10 to 8954380 on August 5, 2025 10:47
@timmilesdw timmilesdw marked this pull request as ready for review August 5, 2025 10:51
@timmilesdw timmilesdw force-pushed the feat/compaction-and-queue-refactor branch from 8954380 to 9f1f634 on August 5, 2025 11:03
@ldmonster ldmonster self-requested a review August 5, 2025 12:23
@timmilesdw timmilesdw force-pushed the feat/compaction-and-queue-refactor branch 6 times, most recently from f050f1e to 621d369 on August 5, 2025 18:32
@timmilesdw timmilesdw force-pushed the feat/compaction-and-queue-refactor branch 14 times, most recently from ed45f2d to 5ae5623 on August 6, 2025 17:56
@flant flant deleted a comment from kvm999 Aug 8, 2025
@timmilesdw timmilesdw force-pushed the feat/compaction-and-queue-refactor branch from 5ae5623 to 8df720f on August 8, 2025 08:47
Signed-off-by: Timur Tuktamyshev <timur.tuktamyshev@flant.com>
@timmilesdw timmilesdw force-pushed the feat/compaction-and-queue-refactor branch from 8df720f to f2d370c on August 8, 2025 08:54
Signed-off-by: Timur Tuktamyshev <timur.tuktamyshev@flant.com>
@ldmonster ldmonster changed the title Feat/compaction and queue refactor [shell-operator] feat/compaction and queue refactor Aug 8, 2025
@ldmonster ldmonster merged commit b518c7d into main Aug 8, 2025
9 checks passed
@ldmonster ldmonster deleted the feat/compaction-and-queue-refactor branch August 8, 2025 21:19
@ldmonster ldmonster restored the feat/compaction-and-queue-refactor branch August 15, 2025 01:31