feat(simulator): structured_output for ActorSimulator by poshinchen · Pull Request #207 · strands-agents/evals

poshinchen · 2026-04-30T19:01:26Z

Summary

Adds structured stop-signalling and custom output schema support to ActorSimulator, replacing the <stop/> text sentinel with a proper stop: bool field on the structured response.

Breaking Changes

ActorResponse.message changed from str (required) to str | None (optional, default None)
Stop detection uses ActorResponse.stop == True instead of parsing <stop/> from message text
ActorResponse fields added: stop: bool and stop_reason: str | None

Key Changes

__init__ accepts keyword-only structured_output_model: type[BaseModel] | None, validated at construction and used as the default for all act() calls
act() accepts a per-call structured_output_model override, which can be any Pydantic BaseModel with message and stop fields.
Simulator populates stop_reason: "goal_completed" or "max_turns"
system_prompt_template defaults to DEFAULT_USER_SIMULATOR_PROMPT_TEMPLATE and supports pre-rendered strings (no {actor_profile} placeholder required)
Prompt template instructs "set stop=true" instead of "generate <stop/>"
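
One way to read "supports pre-rendered strings" is that the template is only filled in when it still contains the placeholder. The sketch below illustrates that reading under stated assumptions: EXAMPLE_TEMPLATE and render_system_prompt are hypothetical, and the PR's actual rendering code may work differently (for example via str.format).

```python
# Hypothetical template text; the real DEFAULT_USER_SIMULATOR_PROMPT_TEMPLATE
# lives in the library and is not reproduced here.
EXAMPLE_TEMPLATE = (
    "You are simulating a user.\n"
    "Profile: {actor_profile}\n"
    "When your goal is met, set stop=true in your structured response."
)

PLACEHOLDER = "{actor_profile}"


def render_system_prompt(template: str, actor_profile: str) -> str:
    """Fill in the placeholder if present; otherwise treat the text as pre-rendered."""
    if PLACEHOLDER in template:
        # str.replace leaves any other braces in the template untouched.
        return template.replace(PLACEHOLDER, actor_profile)
    return template


rendered = render_system_prompt(EXAMPLE_TEMPLATE, "beginner, casual")
pre_rendered = render_system_prompt("You are a terse on-call engineer.", "ignored")
```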

What Changed

`types/simulation/actor.py`

ActorResponse.message: str → str | None = None
ActorResponse gains stop: bool = False and stop_reason: str | None = None

`actor_simulator.py`

structured_output_model kwarg on __init__, validated via _validate_output_model() (checks message and stop fields exist)
act() resolution order: per-call → init-level → ActorResponse
Stop logic reads stop from structured output, sets stop_reason, enforces max_turns
has_next() simplified to return not self.stop
Removed _last_message tracking and <stop/> sentinel parsing

`prompt_templates/actor_system_prompt.py`

Exit conditions: "set stop=true in your structured response" replaces "generate <stop/>"
Fixed typo: "Do no deviate" → "Do not deviate"

Usage

from pydantic import BaseModel
from strands import Agent
from strands_evals.simulation import ActorSimulator
from strands_evals.types.simulation import ActorProfile

# Any BaseModel with `message` and `stop` fields works
class AgentInput(BaseModel):
    reasoning: str = ""
    stop: bool = False
    message: str | None = None
    urgency: str = "normal"

profile = ActorProfile(
    traits={"expertise_level": "beginner", "communication_style": "casual"},
    context="A user trying to debug a production outage.",
    actor_goal="Get help identifying the root cause of a memory leak.",
)

simulator = ActorSimulator(
    actor_profile=profile,
    initial_query="Our service is running out of memory in prod",
    structured_output_model=AgentInput,
    max_turns=10,
)

agent = Agent(system_prompt="You are an SRE assistant.")

user_message = simulator.initial_query
while simulator.has_next():
    agent_response = agent(user_message)
    result = simulator.act(str(agent_response))
    if result.structured_output.stop:
        break
    user_message = result.structured_output.message

Default usage (no custom model):

simulator = ActorSimulator(
    actor_profile=profile,
    initial_query="Help me debug this",
    max_turns=10,
)

# Seed the loop with the simulator's opening query
user_message = simulator.initial_query
while simulator.has_next():
    agent_response = agent(user_message)
    result = simulator.act(str(agent_response))
    user_message = str(result.structured_output.message)

Testing

25 unit tests covering init-level and per-call structured_output_model, field validation (message and stop required), stop_reason values, custom model stop management, template selection, and default prompt stop=true instruction.

Type of Change

Breaking change + new feature

Checklist

I have read the CONTRIBUTING document
I have added tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
My changes generate no new warnings
I ran hatch run prepare

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

jjbuck · 2026-05-07T04:34:57Z

I want to pitch a slightly different shape for this feature. The issue is that input_type as currently designed conflates three decisions onto one kwarg: prompt template selection, response schema construction, and which method (act vs act_structured) the caller is supposed to use. None of the coordination is enforced, so several combinations can run but produce broken conversations. For example, if input_type=AgentInput is set and the caller uses act(), the structured prompt is installed (which tells the LLM to set stop=true), but the ActorResponse schema has no stop field, so the actor can't terminate.

Instead, if we wrote...

def __init__(
    self,
    ...,
    *,
    output_type: type[BaseModel] | None = None,
):

...then this would have the following behavior.

When output_type is None, the simulator behaves exactly as it does today. ActorResponse is the schema, the sentinel template is the prompt, <stop/> terminates the conversation, and act() returns the same AgentResult shape that existing callers already consume.
When output_type is set, act() internally uses ActorStructuredResponse[output_type] as the schema and the structured template as the prompt. The actor terminates by setting stop=true on the response. The return type is still AgentResult; what differs is the class the caller finds on structured_output.

This collapses the two methods into one because passing a Pydantic class is already an unambiguous request for structured output. There's nothing additional for a flag to disambiguate. A caller who doesn't want the new behavior doesn't pass the kwarg, and everything works as it does today. A caller who does want it passes the kwarg once at construction and uses act() normally from then on.

So concretely, this would mean

__init__ gains output_type and drops input_type.
act() branches internally on self._output_type to pick the schema.
ActorStructuredResponse becomes a Pydantic generic parameterized on the message type (separate comment on _structured_model goes deeper).
act_structured() and _build_structured_model() are deleted.
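
The proposed shape, a Pydantic generic parameterized on the message type with act() branching on output_type, might look roughly like this. It is only a sketch of the reviewer's suggestion: StructuredResponseSketch, LegacyResponseSketch, TicketMessage, and pick_schema are hypothetical names, and the eventual implementation may differ.

```python
from typing import Generic, Optional, TypeVar

from pydantic import BaseModel

MessageT = TypeVar("MessageT")


class LegacyResponseSketch(BaseModel):
    """Stand-in for the existing schema used when output_type is None."""

    message: str


class StructuredResponseSketch(BaseModel, Generic[MessageT]):
    """Stand-in for a generic response whose message type is caller-supplied."""

    message: Optional[MessageT] = None
    stop: bool = False


class TicketMessage(BaseModel):
    """Hypothetical caller-defined message type."""

    text: str
    urgency: str = "normal"


def pick_schema(output_type: Optional[type[BaseModel]]) -> type[BaseModel]:
    # No output_type: keep today's schema. Otherwise parameterize the generic.
    if output_type is None:
        return LegacyResponseSketch
    return StructuredResponseSketch[output_type]


schema = pick_schema(TicketMessage)
parsed = schema.model_validate(
    {"message": {"text": "Prod is out of memory", "urgency": "high"}, "stop": False}
)
```

Because the schema is chosen from a single kwarg, there is no separate method or flag that can fall out of sync with the prompt template.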

jjbuck · 2026-05-06T23:12:03Z

+
+
+def test_init_with_input_type_narrows_message_schema(sample_actor_profile):
+    """With input_type, act_structured() hands a ActorStructuredResponse subclass whose message is typed."""


Nit: "an ActorStructuredResponse"

Allow users to set the structured output model once at construction time instead of passing it on every act() call. The init-level model is used as the default for act() and can still be overridden per-call. Validates at init time: must subclass ActorOutputBase and have a 'message' field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


feat(simulator): structured_output of simulator

8946944

poshinchen temporarily deployed to auto-approve April 30, 2026 19:01 — with GitHub Actions Inactive

poshinchen changed the title ~~feat(simulator) structured_output for actorSimulator~~ feat(simulator): structured_output for actorSimulator May 1, 2026

poshinchen changed the title ~~feat(simulator): structured_output for actorSimulator~~ feat(simulator): structured_output for ActorSimulator May 1, 2026

padmak30 reviewed May 5, 2026


Comment thread src/strands_evals/types/simulation/actor.py Outdated

padmak30 previously approved these changes May 5, 2026


Refactor the prompt usage for structured_output

2967b7e

poshinchen dismissed padmak30’s stale review via 2967b7e May 5, 2026 16:42

poshinchen force-pushed the feat/simulator-structured branch from 4eff519 to 2967b7e Compare May 5, 2026 16:42

poshinchen temporarily deployed to auto-approve May 5, 2026 16:43 — with GitHub Actions Inactive

jjbuck requested changes May 7, 2026


breaking change to merge the function

fd9e30e

poshinchen temporarily deployed to auto-approve May 7, 2026 18:55 — with GitHub Actions Inactive

poshinchen temporarily deployed to auto-approve May 7, 2026 19:13 — with GitHub Actions Inactive

refactor(simulator): extract _validate_output_model as static method

078dc38

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

poshinchen temporarily deployed to auto-approve May 7, 2026 19:20 — with GitHub Actions Inactive

poshinchen temporarily deployed to auto-approve May 7, 2026 19:22 — with GitHub Actions Inactive

poshinchen force-pushed the feat/simulator-structured branch from ff3461a to 4cdd3bd Compare May 7, 2026 19:48

poshinchen temporarily deployed to auto-approve May 7, 2026 19:49 — with GitHub Actions Inactive

poshinchen force-pushed the feat/simulator-structured branch from 4cdd3bd to da32699 Compare May 7, 2026 20:01

poshinchen temporarily deployed to auto-approve May 7, 2026 20:01 — with GitHub Actions Inactive

poshinchen force-pushed the feat/simulator-structured branch from da32699 to 8ab3f20 Compare May 7, 2026 20:04

poshinchen temporarily deployed to auto-approve May 7, 2026 20:05 — with GitHub Actions Inactive

refactor(simulator): default _structured_output_model to ActorResponse

dce76a4

Simplifies the fallback chain in act(): no need for a triple-or since the instance attribute always holds a valid model.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

poshinchen force-pushed the feat/simulator-structured branch from 8ab3f20 to dce76a4 Compare May 7, 2026 20:26

poshinchen temporarily deployed to auto-approve May 7, 2026 20:26 — with GitHub Actions Inactive

poshinchen wants to merge 6 commits into strands-agents:main from poshinchen:feat/simulator-structured
