Skip to content

refactor(evaluatorq-py): consolidate TokenUsage / Message types across simulation and redteam (RES-596)#127

Open
currentlycodinng wants to merge 55 commits into
mainfrom
aminaakhmedova/res-596-consolidate-shared-types-tokenusage-message-across
Open

refactor(evaluatorq-py): consolidate TokenUsage / Message types across simulation and redteam (RES-596)#127
currentlycodinng wants to merge 55 commits into
mainfrom
aminaakhmedova/res-596-consolidate-shared-types-tokenusage-message-across

Conversation

@currentlycodinng
Copy link
Copy Markdown
Contributor

Summary

Consolidates duplicate TokenUsage and Message definitions that previously lived in both simulation/types.py and redteam/contracts.py. Shared types now live in evaluatorq/contracts.py (the home introduced by PR #117 for cross-subpackage shared types).

  • TokenUsage — uses redteam's superset (adds calls field + from_completion()).
  • Message — uses redteam's superset (supports tool role, tool_calls, tool_call_id, name).
  • FunctionCall and StrategyToolCall — moved alongside Message.
  • ChatMessage is preserved as a deprecated alias (ChatMessage = Message) to avoid breaking external API consumers.
  • runner/simulation.py no longer does the redundant per-turn ChatMessage(role=..., content=...) conversion.
  • Both legacy import paths still work via re-exports.

Linear: https://linear.app/orqai/issue/RES-596

Test plan

  • bunx nx typecheck evaluatorq-py
  • bunx nx test evaluatorq-py (simulation + redteam suites)
  • Verify legacy imports still resolve (from evaluatorq.simulation.types import TokenUsage, from evaluatorq.redteam.contracts import Message)

KarinaKKarinaK and others added 30 commits April 23, 2026 16:15
…rn AgentResponse, add scorer tests (RES-651)
Add tool_calls_per_turn to JobOutputPayload and RedTeamResult so intercepted
tool calls from OrchestratorResult are accessible in final reports.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…terception (RES-651)

  - Add default=str to json.dumps in evaluator to prevent silent TypeError on non-serializable tool arguments
  - Add AIMessage type guard in LangGraphTarget to prevent phantom tool calls from HumanMessage/ToolMessage
  - Track _prev_msg_count in LangGraphTarget to extract only current-turn tool calls in multi-turn flows
  - Emit logger.warning in reset_conversation() when _prev_msg_count > 0 to surface stale-data risk
  - Narrow except clause and add logger.warning in orq.py JSON arg parse for observability
  - Update tests to reflect AgentResponse return type and new reset_conversation semantics
…nse}} substitution (RES-651)

Response is the terminal substitution — no further placeholder expansion follows it.
Sanitizing here corrupts legitimate agent output containing {{ (e.g. template syntax explanations).
Cross-expansion is already prevented by sanitizing {{input.all_messages}} and {{output.tool_calls}}.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…essage list (RES-651)

Replace separate text/tool_calls/intermediate_steps fields on AgentResponse
with a unified output: list[OutputMessage], preserving the order in which
items were produced. This allows evaluators to reason about tool call order
and interleaving with text responses.

OutputMessage is a union of:
- TextOutputItem (type: "output_text") — assistant text response
- ToolCallOutputItem (type: "function_call") — tool call with optional result
- ReasoningOutputItem (type: "reasoning") — chain-of-thought / thinking steps

Type discriminators align with the OpenResponses intermediate data format.
Backward-compatible .text and .tool_calls properties are preserved.
All backends and integration targets updated to build the output list.
…eption (RES-651)

- Fix ORQ test mocks: use attribute assignment instead of MagicMock constructor
  kwargs so _extract_text correctly resolves part.kind and part.text; add
  response.text assertions to cover text extraction path
- Remove dead/orphaned code from test_extracts_tool_call_with_dict_arguments
  (no-op with patch block and unused call_results list)
- Add test_reset_uses_different_thread_id to verify thread_id changes after
  reset_conversation() in LangGraphTarget
- Fix StrategyToolCall rename (ToolCall → StrategyToolCall) in contracts.py
  and __init__.py exports
- Fix orchestrator tool_calls_per_turn tracking: replace boolean flag with
  _turn_calls local variable per turn to eliminate double-append race
- Fix CallableTarget to re-raise TypeError and AttributeError directly instead
  of wrapping in RuntimeError
- Merge TokenUsageCollector into LangGraphTarget.send_prompt (kept AgentResponse return type)
- Add reset_conversation() and consume_last_token_usage() to LangGraphTarget
- Add clone()/reset_conversation() to OpenAIAgentTarget for new tests from main
- Keep _coerce_to_agent_response for all static pipeline send_prompt call sites
- Use getattr for msg.tool_calls in OpenAIModelTarget to support mock clients
- Integrate LLMCallConfig/LLMConfig/xml_escape imports from main
- Remove duplicate except asyncio.TimeoutError block (merge artifact in evaluator.py)
- Remove test_timeout_error_propagates (superseded by main's inconclusive-on-timeout behavior)
- Fix tests: add evaluator._cfg=LLMCallConfig() for __new__-constructed OWASPEvaluator, use new() not clone() for LangGraphTarget
…erge

- Add ExecutedToolCall to evaluator.py imports
- Pass annotations=[] to TextOutputItem constructors (Annotated Field default not recognized by basedpyright)
- Serialize dict arguments to JSON strings for ToolCallOutputItem constructors
- Use append pattern instead of list comprehension to avoid list invariance errors
- Use cast(Any, ...) in tests for intentionally-wrong-type assertions
Move LLMCallConfig, AgentResponse, ExecutedToolCall and their supporting
types (ReasoningOutputItem, OutputMessage) to the new shared
evaluatorq/contracts.py. Add api field (Literal["chat_completions",
"responses"], default "chat_completions") to LLMCallConfig. Re-export
all moved names from evaluatorq.redteam.contracts so existing import
paths remain unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Baukebrenninkmeijer and others added 9 commits May 5, 2026 17:10
…t, and AgentTarget conformance

Covers BaseAgent._call_llm API dispatch, OrqResponsesTarget call/send_prompt/new/get_usage/timeout
behaviour, and OrqResponsesTarget conformance with the redteam AgentTarget protocol.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…put extractor

Create `simulation/_client.py` with two shared helpers:
- `build_simulation_client()`: replaces near-identical `_build_client` bodies in
  `BaseAgent` and `OrqResponsesTarget`; both now delegate to it.
- `extract_responses_output()`: replaces fragile `hasattr(item, 'name')` discriminator
  in `base.py._call_responses` and the bespoke `_extract_output` method in `target.py`
  with a single type-correct implementation using `item.type == 'function_call'`.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tion failures

Wrap the `user_simulator.update_context()` call in a try/except so that
errors from a badly-implemented injected user simulator are reported with a
clear `RuntimeError` message rather than being swallowed into a generic
simulation error, making debugging much easier for callers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… and update_context propagation

- test_simulate_with_injected_target: verifies target= callback is invoked per turn
- test_simulate_target_takes_precedence_over_target_callback: confirms target= wins over target_callback=
- test_simulate_with_injected_user_simulator: verifies injected simulator's generate_first_message is called
- test_update_context_called_per_simulation: asserts update_context receives persona/scenario kwargs
- test_invalid_user_simulator_raises_type_error: confirms bad user_simulator surfaces as error result
- Update test_timeout_exceeded_raises to expect RuntimeError (wrapping TimeoutError) from the new
  _invoke implementation that converts asyncio.TimeoutError to a descriptive RuntimeError

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…, validation timing, and silent failures (RES-595)

Comprehensive fixes to address multi-agent review findings:

**OrqResponsesTarget (target.py):**
- Fix context duplication: reset _previous_response_id=None in __call__ to avoid threading with full message history
- Fix memory entity scoping: generate fresh UUID in new() for independent AgentTarget instances
- Fix silent failures: raise RuntimeError for empty output items instead of logging warning

**SimulationRunner (simulation.py):**
- Fix late validation: move agent validation from run() to __init__ for fail-fast behavior
- Fix duplicated API logic: use shared build_simulation_client() helper with proper ownership tracking
- Update close() to only close owned clients

**Tests:**
- Update test_call_raises_error_on_no_output_items to expect RuntimeError instead of empty string

All critical issues from Claude review comments now addressed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…SON serialization (RES-651)

- LangGraph & OpenAI Agents targets: emit AgentResponse.output items in
  message order so ReAct-style text -> tool_call -> text sequences are
  preserved instead of collapsing to [tool_calls..., final_text].
- LangGraph target: pass default=str to json.dumps for tool call arguments
  to avoid TypeError on non-serializable values (datetime, bytes, custom
  SDK objects). pipeline.py re-raises TypeError unconditionally, which
  would otherwise crash the whole attack job.
- openresponses.FunctionCall._serialize_arguments: retry with default=str
  before falling back to '{}' and log a warning so silent data loss is
  observable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…S-782) (#118)

* feat(evaluatorq-py): port OTel tracing to agent simulation module (RES-782)

Mirrors the TypeScript simulation tracing landed in RES-595 to the Python
package, so simulations show as a single unified trace in the Orq dashboard
with per-turn breakdown.

- Add simulation/tracing.py: with_simulation_span, with_llm_span,
  record_llm_input/output/response, record_token_usage, set_span_attrs,
  get_trace_context_headers (W3C traceparent injection)
- Wrap simulate()/generate_and_simulate() pipelines, run, turn,
  target_call, judge_evaluation, user_simulator_call, and the persona/
  scenario/first_message generators in nested orq.simulation.* spans
- Wrap _call_chat_completions and _call_responses in BaseAgent with
  with_llm_span (operation = "chat" / "responses"), record input/response
  via GenAI semconv, propagate trace context to the router via
  extra_headers so router-side spans link as children
- Aggregate token usage on the run + pipeline spans; set termination /
  goal_achieved attrs

Tested live against the Orq router: 11-span trace landed with full input/
output capture, parent links, token rollup, and billing all correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(evaluatorq-py): address review findings on simulation tracing (RES-782)

Fixes from internal review of 981af84:

Bugs
- record_llm_response: zero token counts (e.g. fully-cached request) no
  longer fall back to alternate attribute names; explicit None checks
- record_llm_response: Responses API output extraction now descends into
  item.content[*].text (and prefers response.output_text when available)
  instead of looking for a flat .text on top-level items, which the SDK
  shape doesn't have
- runner.run() error path: dropped dead inner try/except and restored
  partial token-usage reporting via a usage_holder dict so the outer
  except can call _get_total_usage from agents created in _run_inner
  (mirrors TS getTotalUsage closure)

Parity
- generate_structured: wrap with with_llm_span so persona/scenario
  generation produces GenAI client child spans (structured + json_object
  fallback paths both covered); inject W3C traceparent
- ScenarioGenerator alt methods (with_coverage, edge_cases, boundary,
  security) and PersonaGenerator.generate_with_coverage now wrap their
  generate_structured call in orq.simulation.{persona,scenario}_generation
  spans with the matching mode attribute, matching TS

Tests
- record_token_usage_preserves_zero_prompt_tokens (regression for the
  falsy-or bug)
- record_llm_response_responses_api_shape (item.content[].text path)
- llm_span_records_exception (mirrors with_simulation_span coverage)
- traceparent_injected_into_chat_completions_call (verifies extra_headers
  carries traceparent into the OpenAI client request)
- concurrent_runs_share_pipeline_parent (asyncio.gather context isolation)

Verified live against the Orq router post-fix; trace
bd15b852556fa1283f2c6bd69982d8b9 has the expected 11-span tree, full
input/output capture, and correct token rollup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(evaluatorq-py): address Claude review on simulation tracing PR (RES-782)

Findings from the PR review on #118:

- _serialize_messages: switch to m.get() so messages without a content
  key (OpenAI tool-call-only messages) don't raise KeyError; also avoids
  stringifying None into the literal "None"
- record_llm_response: rename second `choices` to `finish_choices` so
  reading the function doesn't suggest the assignments are linked
- run_span / pipeline_span: type as Span | None under TYPE_CHECKING
  instead of Any so attribute access stays type-checked downstream
- Generators (persona generate_with_coverage, scenario {with_coverage,
  edge_cases, boundary, security}): move parsing and post-processing
  inside the simulation span so timing and exception attribution are
  accurate (matches the basic generate() shape)
- _capture_message_content: tighten docstring to make the deliberate
  spec deviation explicit (default True for TS parity vs spec opt-in)

Skipped: the structured_output fallback-path billing attribution note
— the failed parse() call's APIStatusError doesn't carry a usable
response object to record on the span.

Verified live: trace 8cd51c85d26bf1a8960932b0ac0c66ac landed with the
expected 11-span tree, full input/output capture, and correct token
rollup (3259 = 1234 + 529 + 1496).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(evaluatorq-py): address second Claude review pass on PR #118 (RES-782)

- with_simulation_span / with_llm_span: catch BaseException instead of
  Exception so asyncio.CancelledError (used by asyncio.wait_for timeouts)
  is recorded on the span as ERROR; previously timed-out spans ended
  with UNSET status, indistinguishable from normal completion in the
  Orq dashboard. KeyboardInterrupt / SystemExit also now appear in
  traces, which is desirable for visibility.
- ScenarioGenerator.generate(): move the json.JSONDecodeError handler
  outside the with_simulation_span block so JSON parse failures mark
  the span ERROR (matches the pattern already used by the alt
  generators: with_coverage, edge_cases, boundary, security).

Test added: test_simulation_span_records_asyncio_cancellation.
18/18 tracing tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(evaluatorq-py): third Claude review pass — Responses API finish reason + LLM span cancellation test (RES-782)

- record_llm_response: fall back to response.status when there are no
  .choices, so gen_ai.response.finish_reasons is populated for Responses
  API spans (the API uses .status — e.g. "completed", "failed",
  "incomplete" — instead of per-choice finish_reason)
- Tests: add test_llm_span_records_asyncio_cancellation as a mirror of
  the simulation-span cancellation test, plus
  test_record_llm_response_responses_api_finish_reason for the
  finish_reasons fallback. 20/20 tracing tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(evaluatorq-py): close tracing gaps in simulation spans

* fix(evaluatorq-py): record run span metadata on cancellation

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Bauke Brenninkmeijer <bauke.brenninkmeijer@gmail.com>
Unifies the AgentTarget protocol around `send_prompt() -> AgentResponse`.
`AgentResponse` now carries both the ordered output items (text + tool
calls) and the per-call LLM metadata that previously lived on the now-
removed `SendResult` (usage, model, response_id, finish_reason). A
`SendResult = AgentResponse` alias is kept so existing tests/callers
that import `SendResult` continue to work; new code should use
`AgentResponse` directly.

Updates:
- All integrations (callable, LangGraph, OpenAI Agents, Vercel AI SDK)
  populate `AgentResponse.usage` from their respective usage surfaces.
- OpenAI and ORQ backends preserve tool call extraction alongside
  usage/model/response_id propagation.
- Orchestrator, pipeline, and runner read `usage` directly off the
  returned `AgentResponse`; the `consume_last_token_usage` shim is
  retained on the optional `SupportsTokenUsage` protocol but no longer
  wired into hot paths.
- Tests migrated from `send_prompt_with_usage` to `send_prompt`.
- `test_legacy_target_adapter.py` removed — its premise (distinct
  legacy `send_prompt` vs. new `send_prompt_with_usage`) no longer
  applies.

Pre-existing CLI help tests (`TestVulnerabilityHelpText`) remain
deselected; failures are unrelated to this merge (empty typer help
output in the test environment).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s simulation and redteam (RES-596)

Eliminates duplicate type definitions across simulation and redteam:

Shared types now live in evaluatorq.contracts (the home introduced by
PR #117 for cross-subpackage shared types):
- ``TokenUsage`` — uses redteam's superset (adds ``calls`` field +
  ``from_completion()`` classmethod). Replaces simulation's lighter
  variant.
- ``Message`` — uses redteam's superset (supports ``"tool"`` role,
  ``tool_calls``, ``tool_call_id``, ``name``). Replaces simulation's
  simpler ``Message`` and the duplicated ``ChatMessage``.
- ``FunctionCall`` and ``StrategyToolCall`` — moved alongside Message
  since Message references them.

Both ``simulation/types.py`` and ``redteam/contracts.py`` now re-export
these from ``evaluatorq.contracts`` so legacy import paths
(``from evaluatorq.simulation.types import TokenUsage`` /
``from evaluatorq.redteam.contracts import Message``) keep working.

``ChatMessage`` is preserved as a deprecated alias (``ChatMessage =
Message``) so external API consumers don't break. Internally,
``runner/simulation.py`` now passes ``list[Message]`` directly without
the redundant per-turn ``ChatMessage(role=m.role, content=m.content)``
conversion (the two classes had identical schemas).

Other cleanups:
- Simulation enums (CommunicationStyle, StartingEmotion, EmotionalArc,
  CulturalContext, ConversationStrategy, InputFormat, TerminatedBy)
  now use the same StrEnum polyfill pattern as redteam/contracts.py
  instead of plain ``(str, Enum)``.
- Adapter callbacks coerce ``m.content or ""`` since shared Message
  has optional content (for tool-call assistant messages).

Verified: 832/833 sim+redteam tests pass (1 pre-existing failure
unrelated). 0 typecheck errors. Live smoke produces full simulation
trace, indexed in Orq dashboard.

Stacked on PR #118 (RES-782 OTel tracing); peer of RES-598 branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@claude

This comment was marked as resolved.

Comment thread packages/evaluatorq-py/src/evaluatorq/simulation/runner/simulation.py Outdated
Baukebrenninkmeijer and others added 5 commits May 13, 2026 10:04
Keeps HEAD re-exports from evaluatorq.contracts over inline defs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…r (RES-596)

Address Claude review on PR #127. _invert_roles_for_simulator was
reconstructing Message(role=..., content=...) directly, which silently
dropped tool_calls, tool_call_id, and name now that the shared Message
type carries those fields. Switch to model_copy(update={"role": ...})
so all other fields are preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three follow-ups from the review pass on PR #127:

1. Increment ``calls`` field on token aggregation. Adopting redteam's
   superset ``TokenUsage`` added a ``calls`` counter, but simulation
   token accumulators in ``agents/base.py`` (chat completions and
   Responses API paths) and ``target.py`` (OrqResponsesTarget) were
   only summing the three token counts and never bumping ``calls``.
   ``runner/simulation._get_total_usage`` and ``_build_turn_metrics``
   now include ``calls`` in their sum/diff so per-turn and total usage
   report it accurately.

2. Consolidate StrEnum polyfill. Both ``simulation/types.py`` and
   ``redteam/contracts.py`` carried their own Python 3.10 polyfill;
   moved to ``evaluatorq.contracts`` and import from there.

3. ``OrqResponsesTarget._messages_to_input`` now passes ``m.content
   or ""`` so a future tool-role message with ``content=None`` can't
   crash the Responses API call.

Plus two regression tests in ``test_runner.py``:
- ``_invert_roles_for_simulator`` preserves ``tool_calls``,
  ``tool_call_id``, ``name`` (covers the bug Claude's first review
  caught and the prior commit fixed).
- ``Message`` with ``role="tool"`` round-trips through model_dump /
  model_validate without losing the new optional fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves conflicts in redteam/contracts.py where the base branch
re-introduced inline ``Message`` and ``TokenUsage`` definitions while
this branch had moved both to ``evaluatorq.contracts``. Kept the
re-export approach (single source of truth in ``evaluatorq.contracts``).

Picked up base's hardened TokenUsage and applied to the shared
location:
- ``model_config = ConfigDict(frozen=True)`` (immutable for
  predictable arithmetic).
- ``ge=0`` validators on all token fields.
- Improved ``from_completion`` total fallback (uses prompt+completion
  when provider reports total=0/None).
- ``__add__`` / ``__radd__`` for ergonomic accumulation and ``sum()``.

Updated simulation token accumulators in ``agents/base.py`` and
``target.py`` to use ``__add__`` reassignment instead of in-place
``+=`` mutation, since frozen models reject ``setattr``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread packages/evaluatorq-py/src/evaluatorq/simulation/target.py Outdated
Comment thread packages/evaluatorq-py/src/evaluatorq/contracts.py Outdated
Base automatically changed from bauke/sim-orq-responses-support to main May 14, 2026 15:07
PR #117 (Orq Responses v3) and PR #100 (tool call interception) landed
on main, both touching the same simulation/redteam surface RES-596 was
refactoring.

Resolution:
- Took main's version for files where 596's only divergence was the
  shared bauke/karinakalicka branch ancestry (_client, api, generators,
  tracing, structured_output, all test_* files).
- Reconciled the 5 substantive conflicts (evaluatorq/contracts,
  redteam/contracts, simulation/agents/base, simulation/runner/simulation,
  simulation/target) by starting from main's feature-complete versions
  and applying 596's consolidation deltas on top:
  * Moved TokenUsage / Message / FunctionCall / StrategyToolCall into
    evaluatorq/contracts.py (joining LLMCallConfig + AgentResponse that
    #117 already moved there).
  * redteam/contracts.py and simulation/types.py now re-export from
    evaluatorq.contracts.
  * Dropped ExecutedToolCall — #100 removed it; AgentResponse.tool_calls
    returns list[ToolCallOutputItem] directly.
  * Runner passes Message all the way through (no ChatMessage rebuild),
    preserving tool_calls on role inversion (commit e083bfd).
  * TokenUsage.from_completion() + frozen-model accumulation pattern
    threaded through agents/base.py and target.py.

Verified: 1243 unit tests pass, basedpyright clean.
Comment thread packages/evaluatorq-py/src/evaluatorq/simulation/types.py Outdated
- target.py: _messages_to_input now preserves tool_calls / tool_call_id /
  name when serializing to the Responses API, so callers who build a
  history with tool messages don't hit silent data loss.
- contracts.py: drop StrEnum from __all__ — it's a Python-version
  compatibility shim, not a public domain type.
- simulation/types.py: move the Message/TokenUsage import to the top of
  the module with the other imports; removes the # noqa: E402 suppressor.
Copy link
Copy Markdown
Contributor

@KarinaKKarinaK KarinaKKarinaK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simulation/target.py:247 — m.content or "" turns None into "" for assistant messages, but assistant messages with tool_calls need content: null
so I think a fix could be: skip the fallback when role == "assistant"?

more concretely:
simulation/target.py:247 (_messages_to_input) — when an assistant message has tool_calls, OpenAI expects content: null, not "". The new m.content or "" sends content="" alongside tool_calls=[...], which is the wrong shape. Don't apply the or "" fallback to assistant messages.

…sages

In _messages_to_input, the 'or ""' fallback turned None into "" for
assistant messages with tool_calls. OpenAI's spec requires content: null
in that case, so the previous coercion produced an invalid payload.
Keep the fallback for non-assistant roles where the API expects a string.
Comment thread packages/evaluatorq-py/src/evaluatorq/simulation/types.py Outdated
…ion/types.py

The Message section header was left behind when the type itself moved
into evaluatorq.contracts. Pure cleanup, no functional change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants