
hot reload feature proposal #1

Open

Uzziee wants to merge 5 commits into main from
hot-reload-proposal

Conversation

Owner

@Uzziee Uzziee commented Jan 28, 2026

No description provided.

Signed-off-by: Urjit Patel <105218041+Uzziee@users.noreply.github.com>

@gunnarmorling gunnarmorling left a comment


Hey @Uzziee, a few comments/suggestions/questions inline.

@@ -0,0 +1,905 @@
# Hot Reload Feature

As of today, any changes to virtual cluster configs (addition/removal/modification) require a full restart of kroxylicious app. This proposal is to add a dynamic reload feature, which will enable operators to modify virtual cluster configurations (add/remove/modify clusters) while **maintaining service availability for unaffected clusters** without the need for full application restarts. This feature will transform Kroxylicious from a **"restart-to-configure"** system to a **"live-reconfiguration"** system


Suggested change
As of today, any changes to virtual cluster configs (addition/removal/modification) require a full restart of kroxylicious app. This proposal is to add a dynamic reload feature, which will enable operators to modify virtual cluster configurations (add/remove/modify clusters) while **maintaining service availability for unaffected clusters** without the need for full application restarts. This feature will transform Kroxylicious from a **"restart-to-configure"** system to a **"live-reconfiguration"** system
As of today, any changes to virtual cluster configs (addition/removal/modification) require a full restart of kroxylicious app. This proposal is to add a dynamic reload feature, which will enable operators to modify virtual cluster configurations (add/remove/modify clusters) while **maintaining service availability for unaffected clusters** without the need for full application restarts. This feature will transform Kroxylicious from a **"restart-to-configure"** system to a **"live-reconfiguration"** system.


Minor: I recommend to start each sentence on a new line (or even wrap after a line of thought in a longer sentence). That way, comments can be made more targeted rather than on a larger chunk of text at once.


# Part 1: Configuration Change Detection Framework

With this framework, kroxylicious will be able to detect config file changes (using standard fileWatcher service) and using various detector interfaces, it will figure out which virtual clusters are added/removed or modified. The list of affected clusters will be then passed on to the Part 2 of this feature, where the clusters would be gracefully restarted (or rollbacked to previous stable state in case of any failures )


Suggested change
With this framework, kroxylicious will be able to detect config file changes (using standard fileWatcher service) and using various detector interfaces, it will figure out which virtual clusters are added/removed or modified. The list of affected clusters will be then passed on to the Part 2 of this feature, where the clusters would be gracefully restarted (or rollbacked to previous stable state in case of any failures )
With this framework, Kroxylicious will be able to detect config file changes (using standard fileWatcher service) and using various detector interfaces, it will figure out which virtual clusters are added/removed or modified. The list of affected clusters will be then passed on to the Part 2 of this feature, where the clusters would be gracefully restarted (or rollbacked to previous stable state in case of any failures )



So this still mentions using the file watcher for detecting changes, whereas I think the discussion is converging around using an HTTP endpoint to do so? Are you planning to update the proposal in that regard, @Uzziee?

In general, I agree that watching the file system seems not the best approach, in particular for containerized environments, where I'd expect the config to be passed in via a config map. If a config map changes, I'm not sure whether the file system watcher would pick this up (instantly)? Having an HTTP-based trigger to enforce re-reading the config seems like the right thing to do.



POC PR - https://github.com/kroxylicious/kroxylicious/pull/2901


Closed by now, wanna link to the new one?

Comment on lines +284 to +289
public record ConfigurationChangeContext(
Configuration oldConfig,
Configuration newConfig,
List<VirtualClusterModel> oldModels,
List<VirtualClusterModel> newModels
) {}


How will that accommodate changes to config parts other than clusters?

Owner Author


This will also include filter changes; I forgot to include that in this proposal, as it was added later.

Comment on lines +294 to +298
public record ChangeResult(
List<String> clustersToRemove,
List<String> clustersToAdd,
List<String> clustersToModify
) {}


How will that accommodate changes to config parts other than clusters?

Owner Author


This will also include filter changes; I forgot to include that in this proposal, as it was added later.
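For illustration, one way the `ChangeResult` record quoted above could grow to carry filter changes alongside cluster changes. This is a sketch only; the filter fields and the helper methods are hypothetical and not part of the proposal:

```java
import java.util.List;

// Hypothetical extension of the proposal's ChangeResult record; the filter
// fields and helper methods below are illustrative, not in the codebase.
record ChangeResult(
        List<String> clustersToRemove,
        List<String> clustersToAdd,
        List<String> clustersToModify,
        List<String> filtersToRemove,
        List<String> filtersToAdd,
        List<String> filtersToModify
) {
    static ChangeResult empty() {
        return new ChangeResult(List.of(), List.of(), List.of(), List.of(), List.of(), List.of());
    }

    // True if the reload needs to touch any cluster or filter at all.
    boolean hasChanges() {
        return !(clustersToRemove.isEmpty() && clustersToAdd.isEmpty() && clustersToModify.isEmpty()
                && filtersToRemove.isEmpty() && filtersToAdd.isEmpty() && filtersToModify.isEmpty());
    }
}
```

A drawback of this shape is that each new config node type adds three more fields, which is part of why the discussion below converges on generic change descriptors instead.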


## Flow diagram

<img width="2354" height="1394" alt="Image" src="https://github.com/user-attachments/assets/19a7edba-8881-4fa7-98c8-defcdbb132da" />


This doesn't render.

Suggested change
<img width="2354" height="1394" alt="Image" src="https://github.com/user-attachments/assets/19a7edba-8881-4fa7-98c8-defcdbb132da" />
[Flow Diagram](https://github.com/user-attachments/assets/19a7edba-8881-4fa7-98c8-defcdbb132da)




# Part 2: Graceful Virtual Cluster Restart


On a high level, I think the flow should be this:

  • Make changes to the config (e.g. remounting a K8s config map with the file)
  • Trigger a reload via an HTTP endpoint, which causes the file to be re-read
  • Identify changes to all parts of the config, exposing them in a generic way (e.g. as a list of change descriptors, each comprising the id of the affected config node (cluster id, filter id, etc.) and the old and new config of that node; either one of old/new or both may be present, based on the type of change, i.e. create/update/removal)
  • Pass these change descriptors to a pluggable set of change handlers; initially, we might only have a handler for cluster changes, while all other changes would be ignored or even rejected
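The change-descriptor idea in the steps above could be sketched roughly like this. All names here are illustrative, not part of the existing codebase; `C` stands in for whatever config node type is affected (virtual cluster, filter, ...):

```java
import java.util.Optional;

// Hypothetical sketch of a generic change descriptor; none of these names
// exist in the Kroxylicious codebase.
record ChangeDescriptor<C>(
        String nodeId,          // e.g. a cluster id or a filter id
        Optional<C> oldConfig,  // empty for a create
        Optional<C> newConfig   // empty for a removal
) {
    enum Kind { CREATE, UPDATE, REMOVE }

    // The kind of change is implied by which sides are present.
    Kind kind() {
        if (oldConfig().isEmpty()) {
            return Kind.CREATE;
        }
        if (newConfig().isEmpty()) {
            return Kind.REMOVE;
        }
        return Kind.UPDATE;
    }
}
```

A pluggable handler would then only need to accept a `List<ChangeDescriptor<?>>` and skip (or reject) node types it does not understand, which keeps the detection framework decoupled from cluster-specific restart logic.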

public class ConnectionTracker {

// Downstream connections (client → proxy)
private final Map<String, AtomicInteger> downstreamConnections = new ConcurrentHashMap<>();


Suggested change
private final Map<String, AtomicInteger> downstreamConnections = new ConcurrentHashMap<>();
private final ConcurrentMap<String, AtomicInteger> downstreamConnections = new ConcurrentHashMap<>();

(Here and elsewhere)
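To illustrate the suggestion: declaring the field as `ConcurrentMap` rather than `Map` documents the thread-safety contract at the type level and guarantees atomicity for methods like `computeIfAbsent`, which is only best-effort on the plain `Map` interface. A minimal sketch (class and method names hypothetical, not the proposal's actual `ConnectionTracker`):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only, not the proposal's ConnectionTracker.
class ConnectionTrackerSketch {
    // Declared as ConcurrentMap so callers can rely on atomic operations.
    private final ConcurrentMap<String, AtomicInteger> downstreamConnections = new ConcurrentHashMap<>();

    // computeIfAbsent is atomic on a ConcurrentMap, so two threads racing on a
    // new cluster id still end up sharing a single counter.
    int connectionOpened(String clusterId) {
        return downstreamConnections
                .computeIfAbsent(clusterId, id -> new AtomicInteger())
                .incrementAndGet();
    }

    int connectionClosed(String clusterId) {
        AtomicInteger counter = downstreamConnections.get(clusterId);
        return counter == null ? 0 : counter.decrementAndGet();
    }
}
```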

Uzziee and others added 4 commits January 29, 2026 09:45
Signed-off-by: Urjit Patel <105218041+Uzziee@users.noreply.github.com>
Rewrite the hot reload proposal to focus on architectural decisions
rather than implementation detail. The PR discussion has established
consensus on several key points that the document didn't reflect:

- Reframe around applyConfiguration(Configuration) as the core API,
  decoupled from trigger mechanisms (HTTP, file watcher, operator)
- Remove all Java class implementations and handler chains — these
  belong in the code PR where they're reviewable in context
- Call out remove+add with brief per-cluster downtime as deliberate
- Call out all-or-nothing rollback as the initial default, consistent
  with startup behaviour
- Move ReloadOptions to deployment-level static configuration rather
  than per-call parameters
- Identify plugin resource tracking as a known gap with pointer to
  separate proposal
- Flag open questions (config granularity, failure behaviour options,
  drain timeout configurability)
- Defer trigger mechanism design as explicit future work

Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com>
Signed-off-by: Sam Barker <sam@quadrocket.co.uk>
- Fix summary to read as proposed behaviour, not existing
- Use "administrators" instead of "operators" for humans to avoid
  confusion with the Kubernetes operator process
- Fix filter config examples (KMS endpoint, key selection pattern)
- Clarify failure behaviour is consistent across trigger mechanisms
- Note thundering herd as a known trade-off of remove+add
- Fix "original proposal" to "earlier iterations"

Assisted-by: Claude claude-opus-4-6 <noreply@anthropic.com>
Signed-off-by: Sam Barker <sam@quadrocket.co.uk>