Commit 246ab60

RoyLin authored and committed
fix(docs): correct AgenticParse/AgenticSearch plugin API and format support
- Remove broken documentParserRegistry option from Node.js plugin examples (class instances couldn't be passed as plugins due to napi-rs type mismatch)
- Replace PDF/Word/Excel claims with accurate plain-text format list
- Document that binary format support requires Rust DocumentParser impl
- Update agentic-search example 6 to use correct new AgenticSearch() syntax
- Update code submodule to 5ccbff5 (plugin fix)
1 parent 36305b5 commit 246ab60

6 files changed

Lines changed: 43 additions & 116 deletions

apps/docs/content/docs/cn/code/plugins.mdx

Lines changed: 9 additions & 23 deletions
@@ -89,34 +89,20 @@ build_session(opts)
8989

9090
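If a plugin's `load()` call fails (e.g. `AgenticParse` without an LLM client), only that plugin is skipped. Other plugins continue loading normally. Failed plugins do **not** register their companion skills.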
9191

92-
## Document Parser Support
92+
## Supported File Formats
9393

94-
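For binary file formats (PDF, Excel, Word), pass a `DocumentParserRegistry` as a plugin option. This signals to the plugin that document parsing is enabled for that session.
94+
`AgenticSearch` and `AgenticParse` support plain-text formats out of the box, with no extra configuration needed: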
9595

96-
<Tabs groupId="lang" items={['TypeScript', 'Python']}>
97-
<Tab value="TypeScript">
98-
```typescript
99-
import { AgenticSearch, AgenticParse, DocumentParserRegistry } from '@a3s-lab/code';
96+
| Category | Formats |
97+
|----------|---------|
98+
| Source code | Rust, Python, TypeScript, JavaScript, Go, Java, C/C++, and more |
99+
| Config / data | JSON, TOML, YAML, HCL, CSV, TSV |
100+
| Documents | Markdown, plain text |
100101

101-
const session = agent.session('.', {
102-
plugins: [
103-
new AgenticSearch({ documentParserRegistry: new DocumentParserRegistry() }),
104-
new AgenticParse({ documentParserRegistry: new DocumentParserRegistry() }),
105-
],
106-
});
107-
```
108-
</Tab>
109-
<Tab value="Python">
110-
```python
111-
# In Python, plugins automatically use any registered document parsers
112-
# (custom parsers are registered in Rust via the DocumentParser trait)
113-
opts.plugins = [AgenticSearch(), AgenticParse()]
114-
```
115-
</Tab>
116-
</Tabs>
102+
Binary formats (PDF, Word, Excel) require a custom implementation in Rust via the `DocumentParser` trait. This interface is not currently exposed in the TypeScript or Python SDKs.
117103

118104
<Callout type="info">
119-
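`DocumentParserRegistry` in the SDK is a signal: it enables the binary-format parsing path. Custom parsers (PDF, Excel, DOCX) are registered in Rust by implementing the `DocumentParser` trait and calling `DocumentParserRegistry::register()`.
105+
The `DocumentParserRegistry` class in the Node.js SDK is reserved for future use and currently has no effect at runtime. Pass `new AgenticSearch()` or `new AgenticParse()` directly; no registry is needed.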
120106
</Callout>
121107

122108
## Building a Custom Plugin

apps/docs/content/docs/cn/code/tools.mdx

Lines changed: 3 additions & 24 deletions
@@ -254,7 +254,7 @@ session = agent.session('.', opts)
254254

255255
### `agentic_parse`
256256

257-
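LLM-enhanced document parsing. Decodes binary formats via `DocumentParserRegistry`, then applies structural extraction and optionally an LLM pass for semantic QA.
257+
LLM-enhanced document parsing. Applies structural extraction and optionally an LLM pass for semantic QA.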
258258

259259
```json
260260
{
@@ -269,30 +269,9 @@ LLM 增强的文档解析。通过 `DocumentParserRegistry` 解码二进制格
269269

270270
Omit `query` for a fast structural overview without LLM cost.
271271

272-
### Document Parser Support
272+
### Supported Formats
273273

274-
For binary formats (PDF, Excel, Word), pass a `DocumentParserRegistry` as a plugin option:
275-
276-
<Tabs groupId="lang" items={['TypeScript', 'Python']}>
277-
<Tab value="TypeScript">
278-
```typescript
279-
import { AgenticParse, DocumentParserRegistry } from '@a3s-lab/code';
280-
281-
const session = agent.session('.', {
282-
plugins: [
283-
new AgenticParse({ documentParserRegistry: new DocumentParserRegistry() }),
284-
],
285-
});
286-
```
287-
</Tab>
288-
<Tab value="Python">
289-
```python
290-
# DocumentParserRegistry is a signal enabling binary format support.
291-
# Custom parsers (PDF, Excel) are registered in Rust via the DocumentParser trait.
292-
opts.plugins = [AgenticParse()]
293-
```
294-
</Tab>
295-
</Tabs>
274+
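`agentic_parse` supports plain-text formats out of the box: source code, Markdown, JSON, TOML, YAML, HCL, CSV, and more. Binary formats (PDF, Word, Excel) require a custom implementation in Rust via the `DocumentParser` trait, which is not currently exposed in the TypeScript or Python SDKs.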
296275

297276
## Direct Tool Execution
298277

apps/docs/content/docs/en/code/examples/agentic-search.mdx

Lines changed: 9 additions & 22 deletions
@@ -456,26 +456,23 @@ result = session.send("""
456456

457457
---
458458

459-
### Example 6: Document Parsing Configuration
459+
### Example 6: Supported File Formats
460460

461-
Enable parsing of binary documents (PDF, DOCX, etc.) by passing `DocumentParserRegistry` as a **plugin option**:
461+
`agentic_search` handles source code, Markdown, JSON, TOML, YAML, HCL, CSV, and other plain-text formats out of the box — no extra configuration needed. Just enable the plugin:
462462

463463
<Tabs groupId="lang" items={['TypeScript', 'Python', 'Rust']}>
464464
<Tab value="TypeScript">
465465
```typescript
466-
import { Agent, AgenticSearch, DocumentParserRegistry } from '@a3s-lab/code';
466+
import { Agent, AgenticSearch } from '@a3s-lab/code';
467467

468468
const agent = await Agent.create('agent.hcl');
469469
const session = agent.session('.', {
470-
plugins: [
471-
new AgenticSearch({ documentParserRegistry: new DocumentParserRegistry() }),
472-
],
470+
plugins: [new AgenticSearch()],
473471
builtinSkills: true,
474472
});
475473

476-
// Now agentic_search can parse PDF, DOCX, and other binary documents
477474
const result = await session.send(
478-
'Search for "privacy policy" in all documents including PDFs'
475+
'Search for "privacy policy" across all source files'
479476
);
480477
```
481478
</Tab>
@@ -485,41 +482,31 @@ from a3s_code import Agent, SessionOptions, AgenticSearch
485482

486483
agent = Agent.create("agent.hcl")
487484
opts = SessionOptions()
488-
# Python plugins automatically use registered document parsers
489485
opts.plugins = [AgenticSearch()]
490486
opts.builtin_skills = True
491487

492488
session = agent.session(".", opts)
493-
494-
# Now agentic_search can parse PDF, DOCX, and other binary documents
495-
result = session.send(
496-
"Search for 'privacy policy' in all documents including PDFs"
497-
)
489+
result = session.send("Search for 'privacy policy' across all source files")
498490
```
499491
</Tab>
500492
<Tab value="Rust">
501493
```rust
502494
use a3s_code_core::{Agent, AgenticSearchPlugin, SessionOptions};
503-
use a3s_code_core::DocumentParserRegistry;
504-
use std::sync::Arc;
505495

506496
let agent = Agent::from_file("agent.hcl").await?;
507497

508498
let opts = SessionOptions::new()
509499
.with_plugin(AgenticSearchPlugin::new())
510-
.with_document_parser_registry(Arc::new(DocumentParserRegistry::new()))
511500
.with_builtin_skills(true);
512501

513502
let session = agent.session(".", opts).await?;
514-
515-
// Now agentic_search can parse PDF, DOCX, and other binary documents
516-
let result = session.send(
517-
"Search for 'privacy policy' in all documents including PDFs"
518-
).await?;
503+
let result = session.send("Search for 'privacy policy' across all source files").await?;
519504
```
520505
</Tab>
521506
</Tabs>
522507

508+
Binary formats (PDF, Word, Excel) require a custom implementation of the `DocumentParser` trait in Rust. This is not currently exposed in the TypeScript or Python SDKs.
509+
523510
<Callout type="info">
524511
Without `DocumentParserRegistry`, agentic_search only reads plain text files. Document parsing is optional and configured per-plugin.
525512
</Callout>
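
As a rough illustration of the format split this hunk documents, the sketch below sorts file paths into plain-text formats that work out of the box and binary formats that would need a Rust `DocumentParser`. The extension sets and the `classify` helper are assumptions made up for this example; the plugin's actual detection logic is not shown in this commit and may differ.

```python
from pathlib import PurePath

# Illustrative extension sets (assumed, not taken from the plugin source).
PLAIN_TEXT_EXTENSIONS = {
    ".rs", ".py", ".ts", ".js", ".go", ".java", ".c", ".h", ".cpp",
    ".json", ".toml", ".yaml", ".yml", ".hcl", ".csv", ".tsv",
    ".md", ".txt",
}
BINARY_EXTENSIONS = {".pdf", ".docx", ".xlsx"}


def classify(path: str) -> str:
    """Hypothetical helper: bucket a path by its file extension."""
    suffix = PurePath(path).suffix.lower()
    if suffix in PLAIN_TEXT_EXTENSIONS:
        return "plain-text"          # handled without extra configuration
    if suffix in BINARY_EXTENSIONS:
        return "needs-rust-parser"   # requires a custom DocumentParser in Rust
    return "unknown"


for name in ["src/main.rs", "docs/README.md", "reports/q3.PDF", "logo.png"]:
    print(f"{name}: {classify(name)}")
```

A pre-check like this could be used to warn early when a prompt targets files the TypeScript or Python SDKs cannot parse today.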

apps/docs/content/docs/en/code/plugins.mdx

Lines changed: 18 additions & 22 deletions
@@ -89,34 +89,30 @@ This means plugin companion skills appear in the **first turn's system prompt**
8989

9090
If a plugin's `load()` call fails (e.g. `AgenticParse` without an LLM client), only that plugin is skipped. Other plugins continue loading normally. Failed plugins do **not** register their companion skills.
9191

92-
## Document Parser Support
92+
## Supported File Formats
9393

94-
For binary file formats (PDF, Excel, Word), pass a `DocumentParserRegistry` as a plugin option. This signals to the plugin that document parsing is enabled for that session.
94+
Both `AgenticSearch` and `AgenticParse` work with plain-text formats out of the box — no extra configuration needed:
9595

96-
<Tabs groupId="lang" items={['TypeScript', 'Python']}>
97-
<Tab value="TypeScript">
98-
```typescript
99-
import { AgenticSearch, AgenticParse, DocumentParserRegistry } from '@a3s-lab/code';
96+
| Category | Formats |
97+
|----------|---------|
98+
| Source code | Rust, Python, TypeScript, JavaScript, Go, Java, C/C++, and more |
99+
| Config / data | JSON, TOML, YAML, HCL, CSV, TSV |
100+
| Documents | Markdown, plain text |
100101

101-
const session = agent.session('.', {
102-
plugins: [
103-
new AgenticSearch({ documentParserRegistry: new DocumentParserRegistry() }),
104-
new AgenticParse({ documentParserRegistry: new DocumentParserRegistry() }),
105-
],
106-
});
107-
```
108-
</Tab>
109-
<Tab value="Python">
110-
```python
111-
# In Python, plugins automatically use any registered document parsers
112-
# (custom parsers are registered in Rust via the DocumentParser trait)
113-
opts.plugins = [AgenticSearch(), AgenticParse()]
102+
Binary formats (PDF, Word, Excel) require a custom `DocumentParser` implementation in Rust. The `DocumentParser` trait is:
103+
104+
```rust
105+
pub trait DocumentParser: Send + Sync {
106+
fn name(&self) -> &str;
107+
fn supported_extensions(&self) -> &[&str];
108+
fn parse(&self, path: &std::path::Path) -> Result<String>;
109+
}
114110
```
115-
</Tab>
116-
</Tabs>
111+
112+
Register a custom parser by calling `DocumentParserRegistry::register()` in Rust before building the session options. This is not currently exposed in the TypeScript or Python SDKs.
117113

118114
<Callout type="info">
119-
`DocumentParserRegistry` in the SDK is a signal — it enables the binary-format parsing path. Custom parsers (PDF, Excel, DOCX) are implemented in Rust by implementing the `DocumentParser` trait and registering with `DocumentParserRegistry::register()`.
115+
The `DocumentParserRegistry` class in the Node.js SDK is reserved for future use and has no effect at runtime. Pass `new AgenticSearch()` or `new AgenticParse()` directly — no registry needed.
120116
</Callout>
121117

122118
## Building a Custom Plugin
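
The load-failure behavior described in the context line of this hunk (a failing plugin is skipped, the others keep loading, and the failed plugin's companion skill is not registered) can be modeled with a small sketch. Everything here is hypothetical: `SketchPlugin`, `PluginLoadError`, `load_plugins`, and the skill names are invented for illustration and are not part of the a3s_code SDK.

```python
from dataclasses import dataclass


class PluginLoadError(Exception):
    """Hypothetical error raised when a plugin cannot initialise."""


@dataclass
class SketchPlugin:
    # Illustrative stand-in for a plugin; not an SDK type.
    name: str
    companion_skill: str
    needs_llm: bool = False

    def load(self, has_llm_client: bool) -> None:
        if self.needs_llm and not has_llm_client:
            raise PluginLoadError(f"{self.name} requires an LLM client")


def load_plugins(plugins, has_llm_client):
    """Load each plugin independently; one failure never blocks the rest."""
    loaded, skills = [], []
    for plugin in plugins:
        try:
            plugin.load(has_llm_client)
        except PluginLoadError:
            continue  # skip only this plugin, and do not register its skill
        loaded.append(plugin.name)
        skills.append(plugin.companion_skill)
    return loaded, skills


plugins = [
    SketchPlugin("OtherPlugin", "other-skill"),
    SketchPlugin("AgenticParse", "parse-skill", needs_llm=True),
]
loaded, skills = load_plugins(plugins, has_llm_client=False)
print(loaded)
print(skills)
```

The point of the design is that skill registration happens only after a successful `load()`, so a session never advertises a skill whose backing tool is missing.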

apps/docs/content/docs/en/code/tools.mdx

Lines changed: 3 additions & 24 deletions
@@ -254,7 +254,7 @@ Multi-phase semantic code search. Extracts keywords from natural language, searc
254254

255255
### `agentic_parse`
256256

257-
LLM-enhanced document parsing. Decodes binary formats via `DocumentParserRegistry`, then applies structural extraction and optionally an LLM pass for semantic QA.
257+
LLM-enhanced document parsing. Applies structural extraction and optionally an LLM pass for semantic QA.
258258

259259
```json
260260
{
@@ -269,30 +269,9 @@ LLM-enhanced document parsing. Decodes binary formats via `DocumentParserRegistr
269269

270270
Omit `query` for a fast structural overview without LLM cost.
271271

272-
### Document Parser Support
272+
### Supported Formats
273273

274-
For binary formats (PDF, Excel, Word), pass a `DocumentParserRegistry` as a plugin option:
275-
276-
<Tabs groupId="lang" items={['TypeScript', 'Python']}>
277-
<Tab value="TypeScript">
278-
```typescript
279-
import { AgenticParse, DocumentParserRegistry } from '@a3s-lab/code';
280-
281-
const session = agent.session('.', {
282-
plugins: [
283-
new AgenticParse({ documentParserRegistry: new DocumentParserRegistry() }),
284-
],
285-
});
286-
```
287-
</Tab>
288-
<Tab value="Python">
289-
```python
290-
# DocumentParserRegistry is a signal enabling binary format support.
291-
# Custom parsers (PDF, Excel) are registered in Rust via the DocumentParser trait.
292-
opts.plugins = [AgenticParse()]
293-
```
294-
</Tab>
295-
</Tabs>
274+
`agentic_parse` supports plain-text formats out of the box: source code, Markdown, JSON, TOML, YAML, HCL, CSV, and more. Binary formats (PDF, Word, Excel) require a custom `DocumentParser` implementation in Rust — not currently exposed in the TypeScript or Python SDKs.
296275

297276
## Direct Tool Execution
298277
