1.2.0-rc.2#71
Conversation
- 在 `StreamEvent` 中新增 `reasoning_redacted_data`、`finish_reason` 和 `stop_reason` 字段,以优化事件处理逻辑。 - 更新 `_thinking_enabled` 函数,使其能够识别包括 `reasoning_effort` 在内的新推理参数。 - 引入 `_coerce_usage_number` 和 `_looks_like_usage_obj` 辅助函数,以改进用量数据的处理机制。 - 增强 `_extract_usage_from_obj` 功能,标准化用量字段并支持推理 Token 的处理。 - 修改 `OpenAIChatClient`,以管理推理历史模式并确保推理内容得到正确处理。 - 新增测试用例,覆盖仅推理状态、推理历史模式及流式行为。 - 优化流式场景下工具调用 ID 的处理逻辑,确保在不同服务提供商间行为的一致性。
…bSocket-auth-token-check feat: 增强安全验证模块,支持 WebSocket 连接的鉴权令牌校验
Reviewer's GuideIntroduce robust streaming retry support, richer reasoning/usage propagation, and safer shutdown/auth handling across the LLM stack, while tightening OpenAI/Anthropic client behavior and expanding test coverage for streaming, reasoning, and security paths. File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
OpenAIChatClient.create, the newreasoning_history_modehandling appears inconsistent with the tests and docstrings: you default it toTrueand only enable reasoning history when== True, but the new tests expect the default mode to stripreasoning_contentand string modes like"deepseek"/"kimi"to enable it; consider changing the default and checking for explicit modes/truthiness instead of== Trueso the behavior matches the intended modes. - In
request_execution.execute_request,resp._original_payloads = list(trimmed_payloads)is assigned twice in a row with the same value; this looks like an accidental duplication and can be safely reduced to a single assignment.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `OpenAIChatClient.create`, the new `reasoning_history_mode` handling appears inconsistent with the tests and docstrings: you default it to `True` and only enable reasoning history when `== True`, but the new tests expect the default mode to strip `reasoning_content` and string modes like `"deepseek"`/`"kimi"` to enable it; consider changing the default and checking for explicit modes/truthiness instead of `== True` so the behavior matches the intended modes.
- In `request_execution.execute_request`, `resp._original_payloads = list(trimmed_payloads)` is assigned twice in a row with the same value; this looks like an accidental duplication and can be safely reduced to a single assignment.
## Individual Comments
### Comment 1
<location path="src/kernel/llm/response.py" line_range="556-562" />
<code_context>
- except Exception as exc:
- stream_error = exc
-
- if buffer:
- yield "".join(buffer)
-
- self._apply_stream_result(reducer.finalize(stream_error))
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Flushing the buffer before deciding to retry can cause partial, failed-attempt chunks to be emitted.
In `stream_with_buffer`, the `buffer` is flushed even when an exception occurs, so consumers can see a final partial chunk from a failed attempt followed by overlapping content after a retry. To keep retries transparent, only flush the buffer when `stream_error is None`; on error, discard the partial buffer and let the retry re-emit content. At least, separate the success path from the error+retry path so failed attempts never emit a trailing chunk.
```suggestion
except Exception as exc:
stream_error = exc
# Only flush the buffer when the stream completed successfully.
# On error, discard any partial buffered content so retries remain transparent.
if stream_error is None:
if buffer:
yield "".join(buffer)
buffer.clear()
else:
buffer.clear()
self._apply_stream_result(reducer.finalize(stream_error))
```
</issue_to_address>
### Comment 2
<location path="src/kernel/llm/model_client/openai_client.py" line_range="837-840" />
<code_context>
# force_sync_http 已废弃,移除后不传给 API
extra_params.pop("force_sync_http", None)
+ # 控制是否向兼容供应商发送 reasoning_content 历史字段
+ reasoning_history_mode = extra_params.pop("reasoning_history_mode", True)
client = self._get_client(
</code_context>
<issue_to_address>
**suggestion:** The equality check for `reasoning_history_mode` is very strict and may ignore truthy values like 'true' or 1.
`allow_reasoning_history` will only be enabled when `reasoning_history_mode` is exactly `True`. Common truthy values like `'true'`, `'on'`, or `1` will be treated as `False`. If this is a config/CLI/env-style flag, consider a more permissive check (e.g. casting to `bool` or normalizing strings). If you really require a strict boolean, it would be better to validate the type and fail fast on non-bool values instead of silently treating them as `False`.
Suggested implementation:
```python
# 控制是否向兼容供应商发送 reasoning_content 历史字段
# 接受常见 truthy 值(True, 1, "true", "on", "yes" 等),避免因为严格等于 True 而忽略配置/CLI/env 传入的值
raw_reasoning_history_mode = extra_params.pop("reasoning_history_mode", True)
if isinstance(raw_reasoning_history_mode, str):
reasoning_history_mode = raw_reasoning_history_mode.strip().lower() in {
"1",
"true",
"yes",
"y",
"on",
"t",
}
else:
reasoning_history_mode = bool(raw_reasoning_history_mode)
```
```python
# 默认不向 OpenAI-compatible provider 发送非标准 reasoning_content 历史字段,
# 避免污染请求结构;需要时通过 reasoning_history_mode 显式开启。
allow_reasoning_history = reasoning_history_mode
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| except Exception as exc: | ||
| stream_error = exc | ||
|
|
||
| if buffer: | ||
| yield "".join(buffer) | ||
|
|
||
| self._apply_stream_result(reducer.finalize(stream_error)) |
There was a problem hiding this comment.
suggestion (bug_risk): Flushing the buffer before deciding to retry can cause partial, failed-attempt chunks to be emitted.
In stream_with_buffer, the buffer is flushed even when an exception occurs, so consumers can see a final partial chunk from a failed attempt followed by overlapping content after a retry. To keep retries transparent, only flush the buffer when stream_error is None; on error, discard the partial buffer and let the retry re-emit content. At least, separate the success path from the error+retry path so failed attempts never emit a trailing chunk.
| except Exception as exc: | |
| stream_error = exc | |
| if buffer: | |
| yield "".join(buffer) | |
| self._apply_stream_result(reducer.finalize(stream_error)) | |
| except Exception as exc: | |
| stream_error = exc | |
| # Only flush the buffer when the stream completed successfully. | |
| # On error, discard any partial buffered content so retries remain transparent. | |
| if stream_error is None: | |
| if buffer: | |
| yield "".join(buffer) | |
| buffer.clear() | |
| else: | |
| buffer.clear() | |
| self._apply_stream_result(reducer.finalize(stream_error)) |
| reasoning_history_mode = extra_params.pop("reasoning_history_mode", True) | ||
|
|
||
| client = self._get_client( | ||
| api_key=api_key, |
There was a problem hiding this comment.
suggestion: The equality check for reasoning_history_mode is very strict and may ignore truthy values like 'true' or 1.
allow_reasoning_history will only be enabled when reasoning_history_mode is exactly True. Common truthy values like 'true', 'on', or 1 will be treated as False. If this is a config/CLI/env-style flag, consider a more permissive check (e.g. casting to bool or normalizing strings). If you really require a strict boolean, it would be better to validate the type and fail fast on non-bool values instead of silently treating them as False.
Suggested implementation:
# 控制是否向兼容供应商发送 reasoning_content 历史字段
# 接受常见 truthy 值(True, 1, "true", "on", "yes" 等),避免因为严格等于 True 而忽略配置/CLI/env 传入的值
raw_reasoning_history_mode = extra_params.pop("reasoning_history_mode", True)
if isinstance(raw_reasoning_history_mode, str):
reasoning_history_mode = raw_reasoning_history_mode.strip().lower() in {
"1",
"true",
"yes",
"y",
"on",
"t",
}
else:
reasoning_history_mode = bool(raw_reasoning_history_mode) # 默认不向 OpenAI-compatible provider 发送非标准 reasoning_content 历史字段,
# 避免污染请求结构;需要时通过 reasoning_history_mode 显式开启。
allow_reasoning_history = reasoning_history_mode
Summary by Sourcery
Add robust streaming resilience, richer reasoning/usage handling, and improved shutdown/authentication behavior for LLM and runtime components.
New Features:
Bug Fixes:
Enhancements:
Build:
Tests: