fix: handle OpAMP AgentDisconnect message #6792

Open
michel-laterman wants to merge 2 commits into elastic:main from michel-laterman:fix/opamp-agent-disconnect

@michel-laterman michel-laterman commented Apr 8, 2026

What is the problem this PR solves?

Fleet-server ignores the AgentDisconnect field in OpAMP AgentToServer messages, so agents that explicitly disconnect are not marked offline until Kibana's 5-minute last_checkin timeout expires.

How does this PR solve the problem?

When an enrolled agent sends an AgentDisconnect message, fleet-server sets its last_checkin_status to disconnected via the bulk checkin system. If an unenrolled agent sends a disconnect message, a BadRequest error is returned.

Design Checklist

  • I have ensured my design is stateless and will work when multiple fleet-server instances are behind a load balancer.
  • I have or intend to scale test my changes, ensuring it will work reliably with 100K+ agents connected.
  • I have included fail safe mechanisms to limit the load on fleet-server: rate limiting, circuit breakers, caching, load shedding, etc.

Checklist

  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool

Related issues

@michel-laterman michel-laterman added the bug label Apr 8, 2026

mergify bot commented Apr 8, 2026

This pull request does not have a backport label. Could you fix it @michel-laterman? 🙏
To fix up this pull request, add the backport labels for the needed branches, such as:

  • backport-./d./d is the label that automatically backports to the 8./d branch, where /d stands for a digit.
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

When an enrolled agent sends an AgentDisconnect message, set its
last_checkin_status to "disconnected" via the bulk checkin system.
Return a BadRequest error if an unenrolled agent sends a disconnect
message.

Closes elastic#6784

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@michel-laterman michel-laterman force-pushed the fix/opamp-agent-disconnect branch from 5829cf2 to 6b172e7 Compare April 8, 2026 19:18
@michel-laterman michel-laterman marked this pull request as ready for review April 8, 2026 19:24
@michel-laterman michel-laterman requested a review from a team as a code owner April 8, 2026 19:24
@michel-laterman michel-laterman added the Team:Elastic-Agent-Control-Plane and backport-9.4 labels Apr 8, 2026
michel-laterman added a commit to michel-laterman/kibana that referenced this pull request Apr 9, 2026
When fleet-server reports a disconnected agent via the last_checkin_status
field, the agent status runtime field now emits 'offline' immediately
rather than waiting for the time-based offline threshold.

Related fleet-server change: elastic/fleet-server#6792
Related issue: elastic/fleet-server#6784

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ycombinator previously approved these changes Apr 9, 2026

@ycombinator ycombinator left a comment

LGTM. Thanks for filling this gap!

Labels

backport-9.4, bug, Team:Elastic-Agent-Control-Plane

Development

Successfully merging this pull request may close these issues.

[OpAMP] server should detect client disconnects
