feat: images for proactive messages (auto-detect image-generation tools / image before text / plugin-side prompt generation) #73 #79
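Implements issue DBJD-CR#73: proactive messages can generate an accompanying image based on the text content. Three modes are available ("no image / let the bot decide / always generate"), configured independently for group chats and private chats.

Key implementation points:
- Provider-agnostic: uses the tool-loop agent together with all registered LLM tools exposed by get_full_tool_set, so the main model decides for itself and calls whichever image-generation plugin is installed. It is not tied to aiimg or any other specific plugin, and it automatically degrades to text only when no image-generation plugin is installed.
- Self-managed image return path: adds a "capturing" synthetic event (inherits AstrMessageEvent and overrides send to intercept images into a buffer), so images produced by the image-generation plugin are not sent straight to the platform but handed back to this plugin, which sends them through the existing flow (segmentation / decoration hooks / platform history persistence).
- No changes to AstrBot source: only inherits public base classes and calls public context APIs.
- Silent degradation on failure: any error during image generation is only logged and falls back to plain text; error details are never placed in the message sent to the user.

Adds core/image_generator.py; _conf_schema.json gains image_settings for both friend and group; message_sender hooks in image sending after the text is sent; main.py registers ImageMixin.

Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>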
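Building on the issue DBJD-CR#73 image feature:
1. Image before text; fixes the text not waiting for the image.
2. Two-line keyword matching to auto-detect image-generation tools (extra adds tools, exclude removes them) + pre-selection cache + back-off after 3 attempts.
3. The image prompt is generated plugin-side first as a scene description, which is then handed to the Agent to call the image-generation tool.

Generated with Claude Code via Happy
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>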
Reviewer's Guide

Adds proactive message image generation support via a tool-loop agent, including an image capture event, configurable auto/always modes, tool prewarm & caching, and integration into the proactive sender pipeline.

Sequence diagram for proactive message image generation and capture

sequenceDiagram
participant ProactiveChatPlugin
participant ImageMixin
participant Context
participant ToolManager
participant ToolLoopAgent as tool_loop_agent
participant CaptureEvent as _ImageCaptureEvent
participant ImageTool as ImagePlugin
ProactiveChatPlugin->>ImageMixin: _maybe_generate_proactive_images(session_id, text, session_config)
alt image_settings.mode is off
ImageMixin-->>ProactiveChatPlugin: []
else auto or always
ImageMixin->>ImageMixin: _run_image_agent(session_id, text, image_conf, mode)
ImageMixin->>Context: get_current_chat_provider_id(session_id)
Context-->>ImageMixin: provider_id
ImageMixin->>Context: get_llm_tool_manager()
Context-->>ImageMixin: ToolManager
ImageMixin->>ImageMixin: _ensure_image_tool_names(ToolManager, image_conf)
ImageMixin-->>ImageMixin: tool_names
ImageMixin->>Context: llm_generate(provider_id, prompt, system_prompt)
Context-->>ImageMixin: completion_text
ImageMixin->>ImageMixin: _generate_image_prompt(provider_id, text, image_conf, mode)
ImageMixin-->>ImageMixin: image_prompt
ImageMixin->>CaptureEvent: _ImageCaptureEvent(context, session, text, message_type)
ImageMixin->>Context: tool_loop_agent(CaptureEvent, provider_id, user_prompt, tools, system_prompt, max_steps)
Context->>ToolLoopAgent: tool_loop_agent(...)
ToolLoopAgent->>ImageTool: call_local_llm_tool(...)
ImageTool->>CaptureEvent: send(MessageChain[Image, Plain])
CaptureEvent->>CaptureEvent: captured_images.append(Image)
Context-->>ImageMixin: tool_loop_agent finished
ImageMixin-->>ProactiveChatPlugin: images
end
loop for each image
ProactiveChatPlugin->>ProactiveChatPlugin: _send_chain_with_hooks(session_id, [image])
end
ProactiveChatPlugin->>ProactiveChatPlugin: send text after images
Code Review
This pull request introduces proactive message image generation capabilities to AstrBot, allowing the bot to automatically generate and attach images to proactive messages using available image generation tools. The changes include configuration schema updates, a new ImageMixin class to manage tool selection and agent execution, and integration into the message sending and plugin lifecycle. The code review feedback is highly constructive and identifies several robustness issues, such as handling different message types in the capture event, adding defensive checks for null tool managers and raw string LLM responses, removing redundant code, and preventing premature garbage collection of the background prewarm task.
    async def send(self, message: "MessageChain") -> None:
        """Intercept send: collect image components only; nothing is actually sent to the platform."""
        try:
            if message and getattr(message, "chain", None):
                for comp in message.chain:
                    if isinstance(comp, Image):
                        self.captured_images.append(comp)
        except Exception as e:  # noqa: BLE001 - the interception stage must not affect the main flow
            logger.debug(f"[主动消息] 捕获配图组件时出现异常喵: {e!r}")
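In the send method of _ImageCaptureEvent, the current code obtains the message components only through getattr(message, "chain", None). However, in AstrBot an image-generation plugin or another plugin may pass a list of components or a single Image component directly when calling event.send. If a list is passed, the method captures no images because a Python list has no chain attribute, and the image feature fails.
Make the type check on message more defensive so that it handles MessageChain, list/tuple, and a single component.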
Suggested change:

    # Requires `from typing import Any` at module level.
    async def send(self, message: Any) -> None:
        """Intercept send: collect image components only; nothing is actually sent to the platform."""
        try:
            if message:
                if hasattr(message, "chain") and message.chain is not None:
                    comps = message.chain
                elif isinstance(message, (list, tuple)):
                    comps = message
                else:
                    comps = [message]
                for comp in comps:
                    if isinstance(comp, Image):
                        self.captured_images.append(comp)
        except Exception as e:  # noqa: BLE001 - the interception stage must not affect the main flow
            logger.debug(f"[主动消息] 捕获配图组件时出现异常喵: {e!r}")
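The three payload shapes discussed in this comment can be exercised in isolation. The following is a minimal sketch, not the plugin's code: `Image`, `Plain`, `MessageChain`, `CaptureEvent`, and `extract_images` are hypothetical stand-ins that only mimic the shapes mentioned above, and the real AstrBot classes may differ.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Image:
    """Hypothetical stand-in for the framework's image component."""
    url: str


@dataclass
class Plain:
    """Hypothetical stand-in for the framework's plain-text component."""
    text: str


@dataclass
class MessageChain:
    """Hypothetical stand-in: a wrapper that holds a list of components."""
    chain: list = field(default_factory=list)


def extract_images(message: Any) -> list:
    """Return the Image components from a chain, a list/tuple, or a single component."""
    if not message:
        return []
    chain = getattr(message, "chain", None)
    if chain is not None:
        comps = chain
    elif isinstance(message, (list, tuple)):
        comps = message
    else:
        comps = [message]
    return [c for c in comps if isinstance(c, Image)]


class CaptureEvent:
    """Hypothetical capture event: buffers images instead of sending them."""

    def __init__(self) -> None:
        self.captured_images: list = []

    async def send(self, message: Any) -> None:
        self.captured_images.extend(extract_images(message))


async def demo() -> list:
    event = CaptureEvent()
    await event.send(MessageChain([Image("a.png"), Plain("caption")]))
    await event.send([Plain("only text"), Image("b.png")])
    await event.send(Image("c.png"))
    await event.send(None)  # empty payloads are ignored
    return [img.url for img in event.captured_images]


captured_urls = asyncio.run(demo())
```

Keeping the normalization in a plain function makes it possible to test the shape handling without constructing a real event.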
    tool_manager = context.get_llm_tool_manager()
    tool_names = self._ensure_image_tool_names(tool_manager, image_conf)
    if not tool_names:
        # Nothing could be selected (or the permanent fallback is active); _ensure_image_tool_names has already logged it.
        return []
In _run_image_agent, if context.get_llm_tool_manager() returns None (for example when the LLM tool manager has not finished initializing or failed to load) while self._image_tools_selected is already True, _ensure_image_tool_names still returns the cached list of tool names. When the ToolSet is then rebuilt, calling tool_manager.get_func(name) raises AttributeError: 'NoneType' object has no attribute 'get_func' and the program crashes.
Check tool_manager for None immediately after obtaining it and exit safely if it is empty.
Suggested change:

    tool_manager = context.get_llm_tool_manager()
    if not tool_manager:
        logger.info("[主动消息] 未找到 LLM 工具管理器,跳过配图喵。")
        return []
    tool_names = self._ensure_image_tool_names(tool_manager, image_conf)
    if not tool_names:
        # Nothing could be selected (or the permanent fallback is active); _ensure_image_tool_names has already logged it.
        return []
    capture_event = _ImageCaptureEvent(
        context=context,
        session=session,
        message=text,
        message_type=session.message_type,
    )

    # Step 1: the plugin first generates a "scene description" (the image prompt).
    # In auto mode the model returns an empty result if it judges the message unsuitable for an image; skip in that case.
    image_prompt = await self._generate_image_prompt(
        provider_id, text, image_conf, mode
    )
    if not image_prompt:
        logger.info("[主动消息] 未生成配图提示词(判断无需配图),跳过配图喵。")
        return []
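In _run_image_agent, capture_event is first instantiated on line 363, but before it is used, line 372 calls _generate_image_prompt and returns directly if the result is empty. If the result is not empty, capture_event is instantiated again on line 379, overwriting the first instance. The first instantiation on line 363 is therefore entirely redundant dead code.
Remove the redundant instantiation to improve readability and efficiency.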
Suggested change (the first capture_event construction is dropped):

    # Step 1: the plugin first generates a "scene description" (the image prompt).
    # In auto mode the model returns an empty result if it judges the message unsuitable for an image; skip in that case.
    image_prompt = await self._generate_image_prompt(
        provider_id, text, image_conf, mode
    )
    if not image_prompt:
        logger.info("[主动消息] 未生成配图提示词(判断无需配图),跳过配图喵。")
        return []
    prompt = (getattr(resp, "completion_text", "") or "").strip()
    if not prompt:
        return ""
In _generate_image_prompt, the return value resp of self.context.llm_generate may, depending on the AstrBot version or the Provider adapter, be a raw str rather than a response object with a completion_text attribute. If a str is returned, getattr(resp, "completion_text", "") yields an empty string and prompt generation fails.
Add a defensive check for whether resp is a str.
Suggested change:

    if isinstance(resp, str):
        prompt = resp.strip()
    else:
        prompt = (getattr(resp, "completion_text", "") or "").strip()
    if not prompt:
        return ""
    try:
        asyncio.create_task(_deferred_prewarm_image_tools())
    except Exception as e:  # noqa: BLE001
        logger.debug(f"[主动消息] 启动配图工具预热任务失败喵: {e!r}")
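In initialize, asyncio.create_task(_deferred_prewarm_image_tools()) starts a background task that prewarms the image tools after a delay. The task suspends for 5 seconds on await asyncio.sleep(5) after starting, and the current code keeps no strong reference to the Task object. Under Python's garbage collection (GC), the unreferenced Task may well be collected before it finishes, so the prewarm fails silently.
Register the task with the plugin's existing self._track_task method so that it is not released prematurely.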
Suggested change:

    try:
        task = asyncio.create_task(_deferred_prewarm_image_tools())
        if hasattr(self, "_track_task"):
            self._track_task(task)
    except Exception as e:  # noqa: BLE001
        logger.debug(f"[主动消息] 启动配图工具预热任务失败喵: {e!r}")
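The reference-holding idea behind this suggestion can be shown on its own. The sketch below is an assumption about what a helper like `_track_task` might do, since the plugin's actual implementation is not shown in this thread; `TaskTracker` and `prewarm` are hypothetical names. It follows the pattern described in the asyncio documentation: keep the task in a set and drop it in a done callback.

```python
import asyncio


class TaskTracker:
    """Hypothetical helper: holds strong references to background tasks."""

    def __init__(self) -> None:
        self._tasks: set = set()

    def track(self, coro) -> asyncio.Task:
        task = asyncio.create_task(coro)
        # The set keeps the task alive while it is suspended.
        self._tasks.add(task)
        # Drop the reference once the task finishes so the set does not grow.
        task.add_done_callback(self._tasks.discard)
        return task

    @property
    def pending(self) -> int:
        return len(self._tasks)


async def prewarm(delay: float, log: list) -> None:
    """Stand-in for a deferred prewarm: sleep, then do the work."""
    await asyncio.sleep(delay)
    log.append("prewarmed")


async def demo() -> tuple:
    tracker = TaskTracker()
    log: list = []
    task = tracker.track(prewarm(0.01, log))
    pending_while_running = tracker.pending
    await task
    await asyncio.sleep(0)  # give any remaining callbacks a chance to run
    return pending_while_running, tracker.pending, log


result = asyncio.run(demo())
```

The done callback matters as much as the set: without it, finished tasks would stay referenced for the lifetime of the plugin.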
Hey - I've found 1 issue, and left some high level feedback:
- In `_run_image_agent`, `capture_event` is constructed twice (before and after `_generate_image_prompt`), but only the second instance is actually used; consider removing the first construction to avoid confusion and unnecessary work.
- The image-tool selection cache flags (`_image_tools_selected`, `_image_tools_disabled`, `_image_tools_attempts`) are manipulated in several methods; if these can be called concurrently per process, consider adding a small lock or ensuring they're only accessed from a single task to avoid subtle race conditions.

Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_run_image_agent`, `capture_event` is constructed twice (before and after `_generate_image_prompt`), but only the second instance is actually used; consider removing the first construction to avoid confusion and unnecessary work.
- The image-tool selection cache flags (`_image_tools_selected`, `_image_tools_disabled`, `_image_tools_attempts`) are manipulated in several methods; if these can be called concurrently per process, consider adding a small lock or ensuring they’re only accessed from a single task to avoid subtle race conditions.
## Individual Comments
### Comment 1
<location path="core/image_generator.py" line_range="363" />
<code_context>
+ logger.info("[主动消息] 已选定的生图工具当前不可用,跳过配图喵。")
+ return []
+
+ capture_event = _ImageCaptureEvent(
+ context=context,
+ session=session,
</code_context>
<issue_to_address>
**issue:** The `capture_event` instance is constructed twice, but the first one is never used.
In `_run_image_agent`, `capture_event` is created once before `_generate_image_prompt` and again immediately after. Since the first instance is never used, please instantiate `capture_event` only once after confirming `image_prompt` is non-empty.
</issue_to_address>
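@gemini-code-assist @sourcery-ai Thanks, I went through all of them.

The send part is changed: it now accepts a MessageChain, a list, or a single component, so images are not missed when different image-generation plugins call it differently. The duplicated capture_event construction is removed; that was a slip. The None check for tool_manager is added. The prewarm background task now has its reference held via _track_task; otherwise it could be collected by the GC during the five-second sleep and the run would be wasted.

Two items are left unchanged. The host signature of llm_generate returns LLMResponse and never a bare string, so the str fallback is unnecessary. The cache flags do not need a lock either: proactive messages are triggered serially by APScheduler, so there is no concurrency. At worst the prewarm and the first send each scan the tools once, which is harmless.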
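Thanks for the detailed explanation; the handling is very clear. Regarding the two suggestions you did not adopt, I fully understand and support your judgment:
Your handling is robust, especially through …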
This has a lot in common with another tool-calling feature, so I plan to implement the two together. Closing this PR.
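Implements #73: proactive messages can carry images. Private chats and group chats each get a three-state switch, image_settings.mode:
- off (default): send text only.
- auto: let the model judge whether the message suits an image, and generate one only if it does.
- always: attach an image to every message where possible.

Not tied to a specific image-generation plugin: any installed image-generation plugin (one that registers a tool with the main LLM) works, and without one only text is sent.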
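📝 Description
Adds image support to the proactive message sending flow and solves several practical problems: images are sent by this plugin itself (so the image-generation plugin cannot bypass segmentation and other processing), the image goes before the text, image-generation tools are detected automatically, and the prompt is generated plugin-side.
🛠️ Modifications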
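Adds core/image_generator.py, and also touches _conf_schema.json, message_sender.py, plugin_lifecycle.py, and main.py. The core points:
1. Images come back to this plugin for sending: a "capturing" synthetic event (inherits AstrMessageEvent, modeled on the official CronMessageEvent) overrides send() to intercept the images emitted by the image-generation plugin into a buffer instead of letting them go straight to the platform. The images then go through this plugin's own sending flow (segmentation, decoration hooks, and platform history all work normally).
2. Image before text: image generation and image sending happen before the text, which fixes the problem of the text arriving first and the image lagging ten-odd seconds behind.
3. Auto-detection of image-generation tools: tools are picked automatically by two-line keyword matching on tool name + description (draw / image / 生图 / 绘图, etc.). When a negative word such as "识别" (recognition) or "视频" (video) matches, the tool is kept only if it also carries a strong image-generation signal, which avoids wrongly picking up read_image_file or video-generation tools. Tools that are not detected can be added manually via extra_tools, and false positives can be removed via exclude_tools.
4. Plugin-side prompt generation: the LLM first turns the message text into a scene description (in auto mode it also decides whether an image is appropriate), and that description is then handed to the Agent to call the image-generation tool, rather than letting the model improvise inside the tool loop.
5. Pre-selection + caching: after the plugin loads, it probes once to prewarm and caches the result; sends reuse the cache directly instead of scanning every time. After 3 consecutive attempts that find no image-generation tool, the plugin falls back to no images for the rest of the current run.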
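Points 3 and 5 can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the code in core/image_generator.py: the keyword tuples, the `Tool` dataclass, `looks_like_image_tool`, `select_image_tools`, and `ImageToolCache` are all hypothetical, and only the general behaviour (positive keywords, negative words that need a strong signal, extra/exclude lists, caching, giving up after 3 empty attempts) comes from the description above.

```python
from dataclasses import dataclass

# Illustrative keyword lists; the plugin's real lists are not shown in this PR.
POSITIVE = ("draw", "image", "生图", "绘图")
NEGATIVE = ("识别", "视频", "read_", "video")
STRONG = ("draw", "生图", "绘图")


@dataclass(frozen=True)
class Tool:
    """Hypothetical view of a registered LLM tool."""
    name: str
    description: str = ""


def looks_like_image_tool(tool: Tool) -> bool:
    """Match on name + description; negative words require a strong signal."""
    text = f"{tool.name} {tool.description}".lower()
    if not any(k in text for k in POSITIVE):
        return False
    if any(k in text for k in NEGATIVE):
        return any(k in text for k in STRONG)
    return True


def select_image_tools(tools, extra_tools=(), exclude_tools=()) -> list:
    """Auto-detect, then add manual extras and drop manual exclusions."""
    known = {t.name for t in tools}
    picked = [t.name for t in tools if looks_like_image_tool(t)]
    picked += [n for n in extra_tools if n in known and n not in picked]
    excluded = set(exclude_tools)
    return [n for n in picked if n not in excluded]


class ImageToolCache:
    """Caches the selection; stops probing after MAX_ATTEMPTS empty results."""

    MAX_ATTEMPTS = 3

    def __init__(self) -> None:
        self.names: list = []
        self.selected = False
        self.disabled = False
        self.attempts = 0

    def ensure(self, tools, extra_tools=(), exclude_tools=()) -> list:
        if self.selected:
            return self.names  # reuse the cached selection, no rescan
        if self.disabled:
            return []  # gave up for this run
        names = select_image_tools(tools, extra_tools, exclude_tools)
        if names:
            self.names, self.selected = names, True
            return names
        self.attempts += 1
        if self.attempts >= self.MAX_ATTEMPTS:
            self.disabled = True
        return []


tools = [
    Tool("aiimg_draw", "Draw a picture from a text prompt"),
    Tool("read_image_file", "Read an image file from disk"),
    Tool("generate_video", "Generate a video from an image"),
    Tool("web_search", "Search the web"),
    Tool("paint", "Make art"),
]
selected = select_image_tools(tools, extra_tools=["paint"])
```

In this sketch `paint` is missed by the keyword heuristic and only enters through `extra_tools`, which mirrors the manual override described in point 3.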
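📸 Screenshots or Test Results
The core flow has been tested live and produces images correctly: the three-state switch, images routed back through this plugin, image before text, auto-detection of image-generation tools, and caching with back-off have all been verified.
Point 4, "plugin-side prompt generation", was added last and has not yet been verified live; its logic was only checked locally with real AstrBot components (always yields a description, auto returns empty when the message is unsuitable, fallback on LLM failure).
Points verified locally with real AstrBot components:
- With event.send([Image, Plain]), only the image is captured and not the text; 0 leaks to the platform.
- call_local_llm_tool feeds the capture event to a mock image-generation handler, confirming the image is captured back into the buffer.
- read_image_file, video tools, and unrelated tools are correctly excluded.
- compileall passes; ruff format / check pass.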