Conversation
Introduce adapter that wraps a no-arg agent factory into a task callable, automatically handling telemetry setup, span collection, and session mapping. Add as an alternative to in and , with mutual exclusivity validation, so users no longer need to manually wire up in-memory exporters and mappers.
…k classes Replace the `create_agent_task` factory function with two class-based task adapters: `AgentTask` for simple agent invocations and `TracedAgentTask` for invocations that also collect OpenTelemetry spans for trajectory evaluation. Additionally, simplify `Experiment.run_evaluations` by removing the `agent_factory` parameter and making `task` required, eliminating ambiguous dual-path configuration. Users now pass an `AgentTask` or `TracedAgentTask` instance directly as the task argument.
…AgentTask Simplify AgentTask and TracedAgentTask API by accepting **agent_kwargs instead of requiring a lambda/callable factory. The classes now directly construct strands.Agent instances internally via a _create_agent method, reducing boilerplate for users (e.g., `AgentTask(model="...")` instead of `AgentTask(lambda: Agent(model="..."))`). Updated tests to verify kwargs forwarding and fresh agent creation per call.
Introduce `EvalTaskHandler` and `TracedHandler` classes along with an `@eval_task` decorator to standardize how task functions are wrapped with evaluation behavior. `EvalTaskHandler` normalizes return values to dicts, while `TracedHandler` adds OpenTelemetry span collection and session mapping for trajectory-based evaluators. Includes public exports and comprehensive tests.
Allow eval_task-decorated functions to either accept a Case argument or take no arguments. When the function returns an Agent instance, the decorator auto-invokes it with case.input and stringifies the result. This simplifies the common pattern where tasks just need to construct and run an agent.
- Add `eval_task` to `__init__.py` imports and `__all__` for direct access - Document TracedHandler concurrency limitations (sequential only or max_workers=1 due to shared span exporter) - Minor test formatting cleanup
Rename eval_task.py to eval_task_handler.py so the module name no longer collides with the eval_task function re-exported in __init__.py. This was causing @patch targets in TestTracedHandler to resolve to the function instead of the module, breaking all 5 TracedHandler tests with AttributeError.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Related Issues
Documentation PR
Type of Change
Bug fix
New feature
Breaking change
Documentation update
Other (please describe):
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
hatch run prepareChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.