OTEL context token detached across asyncio/thread boundary in memory client + Strands session_manager #456

@jeffg-onramp

Summary

When using bedrock-agentcore (1.6.2) with the Strands integration inside an AgentCore runtime, the OpenTelemetry runtime logs a steady stream of Failed to detach context errors:

ERROR [opentelemetry.context] [__init__.py:157] - Failed to detach context
Traceback (most recent call last):
  File "/var/task/opentelemetry/context/__init__.py", line 155, in detach
    _RUNTIME_CONTEXT.detach(token)
  File "/var/task/opentelemetry/context/contextvars_context.py", line 53, in detach
    self._current_context.reset(token)
ValueError: <Token var=<ContextVar name='current_context' default={} at 0x...> at 0x...> was created in a different Context

The errors are logged from spans owned by:

  • bedrock_agentcore.memory.client (around create_event / client.py:484)
  • bedrock_agentcore.memory.integrations.strands.session_manager (session_manager.py:406, around Created agent: default in session: …)

Tracing still works functionally (events are created, agents start), but the ERROR-level log spam pollutes CloudWatch and obscures real failures.

Root cause (best read)

The SDK calls opentelemetry.context.attach(ctx) to set the span context, then opentelemetry.context.detach(token) to restore the previous one. Per PEP 567, a token can only be reset() in the Context where it was created. Every asyncio.Task runs in its own copy of the contextvars state, and every thread (including threadpool workers) has its own separate context, so a token created in one execution context is not valid to reset() in another.
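The mismatch can be reproduced with the standard library alone. The sketch below uses a plain ContextVar as a stand-in for the one OpenTelemetry keeps internally (the variable name is copied from the traceback; the function name and stored value are made up for illustration). It is not the SDK's code, only a minimal illustration of the same failure mode.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Stand-in for OpenTelemetry's internal ContextVar (name taken from the traceback).
current_context = contextvars.ContextVar("current_context", default={})


def attach_in_worker():
    # Runs on a pool thread, so the returned token is bound to that thread's Context.
    return current_context.set({"span": "create_event"})


with ThreadPoolExecutor(max_workers=1) as pool:
    token = pool.submit(attach_in_worker).result()

try:
    # Reset from the main thread: a different Context than the one that made the token.
    current_context.reset(token)
    outcome = "reset ok"
except ValueError as exc:
    outcome = f"ValueError: {exc}"

print(outcome)
```

The printed message ends with "was created in a different Context", matching the error in the logs above.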

This shows up specifically when:

  1. A boto3 sync call (e.g., CreateEvent) is invoked from inside an async method, and the detach ends up running on a different thread / task than the one that called attach.
  2. A Strands session manager hook (e.g. the Created agent event) attaches in one task and detaches after an await, on another task.
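To narrow down which of these paths is actually at fault, it may help to separate two cases: an await inside a single task keeps the same Context, while handing the token to another task does not. The sketch below shows both using only asyncio and contextvars. All function names are hypothetical, and the ContextVar again stands in for OpenTelemetry's.

```python
import asyncio
import contextvars

current_context = contextvars.ContextVar("current_context", default={})


async def attach():
    return current_context.set({"span": "created-agent"})


async def detach(token):
    try:
        current_context.reset(token)
        return "detached"
    except ValueError:
        return "stale token"


async def same_task():
    token = current_context.set({"span": "created-agent"})
    await asyncio.sleep(0)  # suspension point; the Task and its Context do not change
    return await detach(token)  # awaited inline, so still the same Task


async def across_tasks():
    # Each create_task() call runs its coroutine in a fresh copy of the Context.
    token = await asyncio.create_task(attach())
    return await asyncio.create_task(detach(token))


async def main():
    return await asyncio.create_task(same_task()), await across_tasks()


results = asyncio.run(main())
print(results)
```

If the SDK attaches and detaches within one coroutine, the plain await alone should not trigger the error. A token that crosses a create_task, run_in_executor, or thread-pool boundary would.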

Repro environment

  • bedrock-agentcore==1.6.2
  • bedrock-agentcore-starter-toolkit==0.3.5
  • AgentCore-managed runtime (Lambda-style), aws-opentelemetry-distro enabled
  • Strands Agent constructed with AgentCoreMemorySessionManager
  • Concurrent module-builder agents launched from a ThreadPoolExecutor

Expected

detach() should not raise (or the SDK should not call it across context boundaries). Either:

  • Use with use_span(...) / set_value patterns that don't require explicit detach, or
  • Capture the originating context and only detach if the current context still owns the token, or
  • Wrap the detach in a try/except ValueError since a stale token can be safely ignored.

Workaround consumers can apply

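from opentelemetry import context as _ctx

_orig_detach = _ctx.detach

def _safe_detach(token):
    # Swallow the cross-Context reset error raised for a stale token.
    try:
        _orig_detach(token)
    except ValueError:
        pass

# Replace the module attribute so later context.detach(...) lookups hit the wrapper.
_ctx.detach = _safe_detach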

…installed before importing bedrock_agentcore. This is purely cosmetic and should live in the SDK instead.
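One caveat: the traceback shows the error being logged from inside detach itself (__init__.py:157), which suggests the function catches the exception and logs it rather than re-raising. If so, a wrapper around detach may never see the ValueError. An alternative that does not depend on that behavior is a logging filter on the opentelemetry.context logger (the logger name shown in the log line). The sketch below assumes the message text stays exactly "Failed to detach context". The class name is made up.

```python
import logging


class DropStaleDetachErrors(logging.Filter):
    """Drop 'Failed to detach context' records caused by a ValueError; keep the rest."""

    def filter(self, record: logging.LogRecord) -> bool:
        exc = record.exc_info[1] if record.exc_info else None
        is_stale_detach = (
            record.getMessage() == "Failed to detach context"
            and isinstance(exc, ValueError)
        )
        return not is_stale_detach


# Logger name taken from the log line in the report.
logging.getLogger("opentelemetry.context").addFilter(DropStaleDetachErrors())
```

Matching on both the message and the exception type keeps other detach failures visible. Like the wrapper, this only hides the symptom.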

Suggested fix

In bedrock_agentcore.memory.client and bedrock_agentcore.memory.integrations.strands.session_manager, wrap each context.detach(token) call with a try/except ValueError (or refactor to use opentelemetry.trace.use_span so no manual token management is needed across await boundaries).
