## Summary
When the framework switches models mid-conversation (via `InferenceModelOverride` or `fallback_models`), thinking blocks from a thinking-capable model must be stripped before sending history to a model that doesn't support extended thinking. Currently no such stripping logic exists.
## Problem
If conversation history contains `ReasoningDelta` / `ReasoningEncryptedValue` events from a thinking-capable model (e.g., Claude Opus), and inference falls back to a non-thinking model, the thinking content in the message history may cause API errors or undefined behavior.
## Proposed solution
Add a pre-inference transform (or logic in `build_messages()` / `to_chat_message()`) that:
- Detects whether the target model supports extended thinking
- If not, strips all thinking/reasoning content from the message history before sending
- Preserves the text and `tool_use` content blocks
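The stripping step could look roughly like the sketch below. `ContentBlock` and `strip_thinking` are illustrative stand-ins for this issue, not the crate's actual types:

```rust
/// Illustrative stand-in for the crate's message content blocks.
#[derive(Debug, Clone, PartialEq)]
enum ContentBlock {
    Text(String),
    ToolUse { name: String, input: String },
    Thinking(String),          // visible reasoning text
    RedactedThinking(Vec<u8>), // encrypted reasoning payload
}

/// Remove thinking/reasoning blocks from every message when the target
/// model does not support extended thinking; text and tool_use survive.
fn strip_thinking(
    history: Vec<Vec<ContentBlock>>,
    target_supports_thinking: bool,
) -> Vec<Vec<ContentBlock>> {
    if target_supports_thinking {
        return history;
    }
    history
        .into_iter()
        .map(|msg| {
            msg.into_iter()
                .filter(|block| {
                    !matches!(
                        block,
                        ContentBlock::Thinking(_) | ContentBlock::RedactedThinking(_)
                    )
                })
                .collect()
        })
        .collect()
}
```

Running this unconditionally (with the capability flag deciding whether it is a no-op) keeps the hot path identical for both model classes.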
This could be implemented as:
- A method on `ModelDefinition` indicating thinking support
- An `InferenceRequestTransform` that conditionally strips thinking blocks
- Or logic in the genai conversion layer (`engine/convert.rs`)
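A rough shape for the first two options is sketched below; every name here is an assumption about the eventual API, not existing code, and the real request would carry structured content blocks rather than a boolean-tagged string:

```rust
/// Assumed capability flag on the model definition.
struct ModelDefinition {
    id: String,
    supports_extended_thinking: bool,
}

/// Simplified message type for this sketch only.
#[derive(Debug, PartialEq)]
struct Message {
    is_thinking: bool,
    content: String,
}

/// Assumed transform hook that runs just before inference.
trait InferenceRequestTransform {
    fn transform(&self, target: &ModelDefinition, messages: Vec<Message>) -> Vec<Message>;
}

/// Drops thinking messages when the target model lacks the capability.
struct StripThinkingTransform;

impl InferenceRequestTransform for StripThinkingTransform {
    fn transform(&self, target: &ModelDefinition, messages: Vec<Message>) -> Vec<Message> {
        if target.supports_extended_thinking {
            messages
        } else {
            messages.into_iter().filter(|m| !m.is_thinking).collect()
        }
    }
}
```

Keeping the decision on `ModelDefinition` and the mechanics in a transform would let the conversion layer stay model-agnostic.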
## Key files
- `crates/tirea-agentos/src/engine/convert.rs` — message conversion to genai format
- `crates/tirea-agentos/src/runtime/streaming.rs` — `ReasoningDelta`, `ReasoningEncryptedValue` events
- `crates/tirea-agentos/src/runtime/loop_runner/core.rs` — `build_messages()`
- `crates/tirea-agentos/src/composition/registry/model.rs` — `ModelDefinition`
## References
- Book Chapter 4 §4.1.3: thinking block processing rules for model switching