Open
Labels
feature
Description
When a node becomes unresponsive (e.g. process crash, OOM) but the underlying EC2 instance is still running, health check pings still succeed, so the node is not flagged as unreachable. This leaves the node in a degraded state: alive but not participating in the cluster.
Proposed Solution
Claudie should detect nodes that are reachable (ping succeeds) but not actively participating in the cluster, and automatically rejoin them to the cluster.
Expected Behavior
- Claudie identifies nodes that are reachable but not part of the cluster.
- Claudie automatically triggers a rejoin operation for those nodes.
- The node resumes normal operation within the cluster.
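The detection rule described above could be sketched roughly as follows. This is only an illustration of the decision logic, not Claudie's implementation: `NodeStatus`, `Action`, `classify`, and `nodes_to_rejoin` are hypothetical names, and how the `in_cluster` signal is obtained (for example from the cluster's own view of its members) is an assumption left open here.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"                # healthy cluster member, nothing to do
    REJOIN = "rejoin"            # reachable but not participating in the cluster
    UNREACHABLE = "unreachable"  # left to the existing unreachable-node handling


@dataclass(frozen=True)
class NodeStatus:
    name: str
    reachable: bool   # assumption: result of the existing ping health check
    in_cluster: bool  # assumption: the cluster reports this node as an active member


def classify(node: NodeStatus) -> Action:
    """Decide what to do with a single node based on the two signals."""
    if not node.reachable:
        # Ping fails: this case is already covered by unreachable-node detection.
        return Action.UNREACHABLE
    if not node.in_cluster:
        # Ping succeeds but the node is not a member: the case this issue targets.
        return Action.REJOIN
    return Action.NONE


def nodes_to_rejoin(nodes):
    """Return the names of nodes that should get a rejoin operation."""
    return [n.name for n in nodes if classify(n) is Action.REJOIN]


if __name__ == "__main__":
    statuses = [
        NodeStatus("node-a", reachable=True, in_cluster=True),
        NodeStatus("node-b", reachable=True, in_cluster=False),
        NodeStatus("node-c", reachable=False, in_cluster=False),
    ]
    print(nodes_to_rejoin(statuses))
```

Keeping the ping result and the membership check as separate inputs means the existing unreachable-node path stays untouched, and only the "reachable but not a member" combination triggers a rejoin.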
Exit criteria
- Implement rejoining of nodes.