Commit c4b42c1

Merge pull request #2 from renbytes/fn-2
feat: GitHunter toolset and forensic trace system
2 parents 56b5e77 + 8c73c01 commit c4b42c1

54 files changed

Lines changed: 4186 additions & 9 deletions

.flow/epics/fn-2.json

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
{
  "branch_name": "fn-2",
  "created_at": "2026-01-24T16:19:15.152423Z",
  "depends_on_epics": [],
  "id": "fn-2",
  "next_task": 1,
  "plan_review_status": "unknown",
  "plan_reviewed_at": null,
  "spec_path": ".flow/specs/fn-2.md",
  "status": "open",
  "title": "GitHunter Toolset Completion",
  "updated_at": "2026-01-24T16:19:49.637145Z"
}

.flow/epics/fn-3.json

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
{
  "branch_name": "fn-3",
  "created_at": "2026-01-24T16:21:18.066966Z",
  "depends_on_epics": [],
  "id": "fn-3",
  "next_task": 1,
  "plan_review_status": "unknown",
  "plan_reviewed_at": null,
  "spec_path": ".flow/specs/fn-3.md",
  "status": "open",
  "title": "Forensic Features: Trace Persistence and Replay",
  "updated_at": "2026-01-24T16:22:03.691848Z"
}

.flow/specs/fn-2.md

Lines changed: 87 additions & 0 deletions
@@ -0,0 +1,87 @@
# GitHunter Toolset Completion

## Overview

Complete the GitHunter toolset by adding `tools.py` that wraps `GitHunterAdapter` methods as PydanticAI `Tool` objects. This follows the existing patterns established in `memory/tools.py` and `schema/tools.py`.

## Scope

### In Scope
- Create request/response Pydantic models for tool inputs
- Wrap 3 adapter methods as PydanticAI tools: `blame_line`, `find_pr_discussion`, `get_expert_for_file`
- Export toolset as `githunter_toolset: list[Tool[GitHunterProtocol]]`
- Add comprehensive tests following existing test patterns
- Update `__init__.py` exports
- Add documentation to API reference

### Out of Scope
- GitLab/Bitbucket support (GitHub only)
- `enrich_author` as a separate tool (internal use only)
- Caching layer for repeated calls
- New adapter functionality

## Approach

Follow the established toolset pattern:

1. **Request Models** (`_models.py`): Create Pydantic models for tool inputs
   - `BlameLineRequest(repo_path: str, file_path: str, line_no: int)`
   - `FindPRDiscussionRequest(repo_path: str, commit_hash: str)`
   - `GetExpertsRequest(repo_path: str, file_path: str, window_days: int = 90, limit: int = 3)`

2. **Tool Functions** (`tools.py`): Async functions with `RunContext[GitHunterProtocol]`
   - Convert string repo_path to Path internally
   - Catch adapter exceptions and convert to Error model returns
   - Include "Agent Usage" docstrings

3. **Toolset Export**: `githunter_toolset: list[Tool[GitHunterProtocol]]`
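
The tool-function step above (convert the string path, call the adapter, return an error value instead of raising) could look roughly like the following. This is a dependency-free sketch, not the committed `tools.py`: `FakeHunter`, the fields of `BlameResult` and `Error`, and the direct `hunter` argument are all assumptions for illustration. The real function would receive a `RunContext[GitHunterProtocol]` and use the Pydantic models.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class BlameResult:
    """Hypothetical stand-in for the result type in `_types.py`."""
    author: str
    commit_hash: str


@dataclass(frozen=True)
class Error:
    """Hypothetical stand-in for the Error return model."""
    description: str


class FakeHunter:
    """Hypothetical in-memory replacement for the adapter."""

    async def blame_line(self, repo_path: Path, file_path: str, line_no: int) -> BlameResult:
        if not repo_path.exists():
            raise FileNotFoundError(f"no repository at {repo_path}")
        return BlameResult(author="alice", commit_hash="cdc4a2f")


async def blame_line(
    hunter: FakeHunter, repo_path: str, file_path: str, line_no: int
) -> BlameResult | Error:
    """Convert the string path, call the adapter, and return Error instead of raising."""
    try:
        return await hunter.blame_line(Path(repo_path), file_path, line_no)
    except Exception as exc:  # broad on purpose: the tool must not raise
        return Error(description=f"blame_line failed: {exc}")


ok = asyncio.run(blame_line(FakeHunter(), ".", "README.md", 1))
failed = asyncio.run(blame_line(FakeHunter(), "/definitely/not/a/repo", "README.md", 1))
```

Returning `Error` as a value keeps the failure visible to the agent as ordinary tool output, which is what the acceptance criterion "return Error, don't raise" asks for.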

## Key Files

| File | Purpose |
|------|---------|
| `src/bond/tools/githunter/_models.py` | NEW - Request models |
| `src/bond/tools/githunter/tools.py` | NEW - Tool functions + toolset export |
| `src/bond/tools/githunter/__init__.py` | Update exports |
| `tests/unit/tools/githunter/test_tools.py` | NEW - Tool tests |
| `docs/api/tools.md` | Update API docs |

## Reuse Points

- **Pattern**: `src/bond/tools/memory/tools.py` (lines 45-144) - Tool function structure
- **Pattern**: `src/bond/tools/schema/tools.py` - Simpler toolset example
- **Error model**: `src/bond/tools/memory/_models.py:177-187` - Error return type
- **Test pattern**: `tests/unit/tools/schema/test_tools.py` - Mock protocol pattern
- **Protocol**: `src/bond/tools/githunter/_protocols.py` - Already complete
- **Adapter**: `src/bond/tools/githunter/_adapter.py` - Already complete

## Quick Commands

```bash
# Run GitHunter tests
uv run pytest tests/unit/tools/githunter/ -v

# Type check
uv run mypy src/bond/tools/githunter/

# Lint
uv run ruff check src/bond/tools/githunter/
```

## Acceptance

- [ ] `_models.py` contains 3 request models with validation
- [ ] `tools.py` exports `githunter_toolset` with 3 tools
- [ ] All tools handle adapter exceptions gracefully (return Error, don't raise)
- [ ] Tests pass with MockGitHunter protocol implementation
- [ ] `mypy` and `ruff` pass without errors
- [ ] API docs updated with GitHunter toolset section
- [ ] Exports available: `from bond.tools.githunter import githunter_toolset`

## References

- Memory toolset pattern: `src/bond/tools/memory/tools.py`
- Schema toolset pattern: `src/bond/tools/schema/tools.py`
- GitHunter protocol: `src/bond/tools/githunter/_protocols.py:14-91`
- GitHunter adapter: `src/bond/tools/githunter/_adapter.py`
- PydanticAI Tool docs: https://ai.pydantic.dev/tools/

.flow/specs/fn-3.md

Lines changed: 174 additions & 0 deletions
@@ -0,0 +1,174 @@
# Forensic Features: Trace Persistence and Replay

## Overview

Extend Bond's "Forensic Runtime" capabilities beyond real-time streaming to include trace persistence and replay. This enables:
- **Audit**: Review what an agent did hours/days ago
- **Debug**: Replay failed runs step-by-step
- **Compare**: Analyze different executions side-by-side

## Scope

### In Scope
- **Trace Capture**: Record all 8 StreamHandlers callback events with metadata
- **Storage Backend**: Pluggable backend interface with JSON file implementation
- **Replay API**: SDK method to iterate through stored events
- **Handler Factory**: `create_capture_handlers()` for easy capture setup

### Out of Scope (Future)
- Protobuf serialization (start with JSON for debugging)
- Remote storage backends (S3, database)
- Cross-trace querying and analytics
- Real-time trace streaming to external systems
- Automatic cleanup/retention policies
- UI replay interface (API only in this phase)

## Approach

### Phase 1: Event Model

Define a unified event structure that normalizes all 8 callback types:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass(frozen=True)
class TraceEvent:
    trace_id: str  # UUID for this trace
    sequence: int  # Ordering within trace
    timestamp: float  # time.monotonic() for ordering
    wall_time: datetime  # Human-readable timestamp
    event_type: str  # "block_start", "text_delta", etc.
    payload: dict[str, Any]  # Event-specific data
```

Event types map to StreamHandlers:

| Callback | event_type | payload keys |
|----------|------------|--------------|
| on_block_start | "block_start" | kind, index |
| on_block_end | "block_end" | kind, index |
| on_text_delta | "text_delta" | text |
| on_thinking_delta | "thinking_delta" | text |
| on_tool_call_delta | "tool_call_delta" | name, args |
| on_tool_execute | "tool_execute" | id, name, args |
| on_tool_result | "tool_result" | id, name, result |
| on_complete | "complete" | data |
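
To make the mapping table concrete, here is a minimal sketch of how events might be stamped with a per-trace sequence number and checked against the payload keys listed above. `EventBuilder` and `PAYLOAD_KEYS` are hypothetical names introduced for illustration. The spec only defines `TraceEvent` itself, which is repeated here so the block runs on its own.

```python
from __future__ import annotations

import time
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any
from uuid import uuid4


@dataclass(frozen=True)
class TraceEvent:
    trace_id: str
    sequence: int
    timestamp: float
    wall_time: datetime
    event_type: str
    payload: dict[str, Any]


# Payload keys per event type, copied from the mapping table.
PAYLOAD_KEYS: dict[str, tuple[str, ...]] = {
    "block_start": ("kind", "index"),
    "block_end": ("kind", "index"),
    "text_delta": ("text",),
    "thinking_delta": ("text",),
    "tool_call_delta": ("name", "args"),
    "tool_execute": ("id", "name", "args"),
    "tool_result": ("id", "name", "result"),
    "complete": ("data",),
}


class EventBuilder:
    """Hypothetical helper: stamps events with a per-trace sequence number."""

    def __init__(self, trace_id: str | None = None) -> None:
        self.trace_id = trace_id or str(uuid4())
        self._next_sequence = 0

    def build(self, event_type: str, **payload: Any) -> TraceEvent:
        expected = PAYLOAD_KEYS[event_type]  # KeyError for unknown event types
        if set(payload) != set(expected):
            raise ValueError(f"{event_type} expects keys {expected}, got {sorted(payload)}")
        event = TraceEvent(
            trace_id=self.trace_id,
            sequence=self._next_sequence,
            timestamp=time.monotonic(),
            wall_time=datetime.now(timezone.utc),
            event_type=event_type,
            payload=payload,
        )
        self._next_sequence += 1
        return event


builder = EventBuilder()
first = builder.build("block_start", kind="text", index=0)
second = builder.build("text_delta", text="hello")
```

The sequence number gives a total order within one trace even if two events share a clock reading. `time.monotonic()` values are only comparable within a single process, so `wall_time` remains the field to use when comparing across runs.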

### Phase 2: Storage Backend Protocol

```python
from collections.abc import AsyncIterator
from typing import Protocol, runtime_checkable

# TraceEvent and TraceMeta are the models planned for `_models.py` (see Key Files).


@runtime_checkable
class TraceStorageProtocol(Protocol):
    async def save_event(self, event: TraceEvent) -> None:
        """Append event to trace."""
        ...

    async def finalize_trace(self, trace_id: str) -> None:
        """Mark trace as complete."""
        ...

    async def load_trace(self, trace_id: str) -> AsyncIterator[TraceEvent]:
        """Load events for replay."""
        ...

    async def list_traces(self, limit: int = 100) -> list[TraceMeta]:
        """List available traces."""
        ...
```

Initial implementation: `JSONFileTraceStore` writing to `.bond/traces/{trace_id}.json`
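
A file-backed store along these lines might append one JSON object per line, so saving an event never requires rewriting the whole file. The sketch below is an assumption about one possible layout, not the committed `JSONFileTraceStore`: the class name `JSONLinesTraceStore` is hypothetical, it covers only `save_event` and `load_trace`, and it uses blocking file I/O for brevity. `load_trace` is written as an async generator, so it is consumed with `async for` and no `await`.

```python
from __future__ import annotations

import asyncio
import json
import tempfile
from collections.abc import AsyncIterator
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Any


@dataclass(frozen=True)
class TraceEvent:
    trace_id: str
    sequence: int
    timestamp: float
    wall_time: datetime
    event_type: str
    payload: dict[str, Any]


class JSONLinesTraceStore:
    """Hypothetical sketch: appends one JSON object per line to {trace_id}.json."""

    def __init__(self, base_dir: Path) -> None:
        self.base_dir = base_dir
        self.base_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, trace_id: str) -> Path:
        return self.base_dir / f"{trace_id}.json"

    async def save_event(self, event: TraceEvent) -> None:
        record = asdict(event)
        record["wall_time"] = event.wall_time.isoformat()  # datetime is not JSON-serializable
        # Blocking I/O for brevity; a real backend might offload this to a thread.
        with self._path(event.trace_id).open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")

    async def load_trace(self, trace_id: str) -> AsyncIterator[TraceEvent]:
        with self._path(trace_id).open(encoding="utf-8") as fh:
            for line in fh:
                record = json.loads(line)
                record["wall_time"] = datetime.fromisoformat(record["wall_time"])
                yield TraceEvent(**record)


async def demo() -> list[TraceEvent]:
    with tempfile.TemporaryDirectory() as tmp:
        store = JSONLinesTraceStore(Path(tmp) / "traces")
        for seq, text in enumerate(["he", "llo"]):
            now = datetime.now(timezone.utc)
            await store.save_event(TraceEvent("t1", seq, float(seq), now, "text_delta", {"text": text}))
        return [event async for event in store.load_trace("t1")]


loaded = asyncio.run(demo())
```

An append-only layout also leaves a readable prefix behind if the process crashes mid-trace, which is relevant to the crash-handling question raised later in the spec.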

### Phase 3: Capture Handler Factory

```python
def create_capture_handlers(
    storage: TraceStorageProtocol,
    trace_id: str | None = None,  # Auto-generate if None
) -> tuple[StreamHandlers, str]:
    """Create handlers that capture events to storage.

    Returns:
        (handlers, trace_id) - handlers for agent.ask(), and trace ID for later replay
    """
```
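
The factory's shape can be illustrated with closures: one handler per callback, all sharing a trace ID and a sequence counter. Everything below is a simplified assumption. The real `StreamHandlers` dataclass and its callback signatures are not shown in this spec, so the sketch returns a plain dict of keyword-only callables and writes to an in-memory list instead of an async storage backend.

```python
from __future__ import annotations

from itertools import count
from typing import Any, Callable
from uuid import uuid4

# Callback name -> event_type, following the mapping table in Phase 1.
CALLBACK_EVENT_TYPES = {
    "on_block_start": "block_start",
    "on_block_end": "block_end",
    "on_text_delta": "text_delta",
    "on_thinking_delta": "thinking_delta",
    "on_tool_call_delta": "tool_call_delta",
    "on_tool_execute": "tool_execute",
    "on_tool_result": "tool_result",
    "on_complete": "complete",
}


def create_capture_handlers_sketch(
    sink: list[dict[str, Any]],
    trace_id: str | None = None,
) -> tuple[dict[str, Callable[..., None]], str]:
    """Hypothetical stand-in: returns (handlers, trace_id) like the planned factory."""
    resolved_id = trace_id if trace_id is not None else str(uuid4())
    sequence = count()  # shared by all handlers so ordering spans event types

    def make_handler(event_type: str) -> Callable[..., None]:
        def handler(**payload: Any) -> None:
            sink.append(
                {
                    "trace_id": resolved_id,
                    "sequence": next(sequence),
                    "event_type": event_type,
                    "payload": payload,
                }
            )

        return handler

    handlers = {name: make_handler(etype) for name, etype in CALLBACK_EVENT_TYPES.items()}
    return handlers, resolved_id


events: list[dict[str, Any]] = []
handlers, trace_id = create_capture_handlers_sketch(events, trace_id="abc")
handlers["on_text_delta"](text="hi")
handlers["on_complete"](data=None)
_, generated_id = create_capture_handlers_sketch([])
```

If the real callbacks are synchronous while `save_event` is async, the implementation also needs a bridge between the two (for example a queue drained by a background task). That design choice is outside this sketch.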

### Phase 4: Replay API

```python
class TraceReplayer:
    def __init__(self, storage: TraceStorageProtocol, trace_id: str):
        ...

    async def __aiter__(self) -> AsyncIterator[TraceEvent]:
        """Iterate through all events."""
        ...

    async def step(self) -> TraceEvent | None:
        """Get next event (for manual stepping)."""
        ...

    @property
    def current_position(self) -> int:
        """Current event index."""
        ...
```
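
The interplay between manual stepping and iteration can be shown with a replayer over events that are already in memory. This is a sketch under assumptions: `ListReplayer` is a hypothetical name, events are plain strings rather than `TraceEvent` objects, and no storage backend is involved. `__aiter__` is an async generator here (its body contains `yield`), which is what lets `async for` consume it directly.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator


class ListReplayer:
    """Hypothetical replayer over events already loaded into memory."""

    def __init__(self, events: list[str]) -> None:
        self._events = events
        self._position = 0

    async def step(self) -> str | None:
        """Return the next event, or None once the trace is exhausted."""
        if self._position >= len(self._events):
            return None
        event = self._events[self._position]
        self._position += 1
        return event

    async def __aiter__(self) -> AsyncIterator[str]:
        """Async generator: yields the remaining events via step()."""
        while (event := await self.step()) is not None:
            yield event

    @property
    def current_position(self) -> int:
        """Index of the next event that step() would return."""
        return self._position


async def demo() -> tuple[str | None, int, list[str]]:
    replayer = ListReplayer(["block_start", "text_delta", "complete"])
    first = await replayer.step()
    position = replayer.current_position
    rest = [event async for event in replayer]
    return first, position, rest


first, position, rest = asyncio.run(demo())
```

Because iteration goes through `step()`, a caller can step a few events by hand and then iterate over the remainder without seeing any event twice.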

## Key Files

| File | Purpose |
|------|---------|
| `src/bond/trace/__init__.py` | NEW - Module exports |
| `src/bond/trace/_models.py` | NEW - TraceEvent, TraceMeta models |
| `src/bond/trace/_protocols.py` | NEW - TraceStorageProtocol |
| `src/bond/trace/backends/json_file.py` | NEW - JSON file storage |
| `src/bond/trace/capture.py` | NEW - create_capture_handlers |
| `src/bond/trace/replay.py` | NEW - TraceReplayer class |
| `src/bond/utils.py` | UPDATE - Add capture handler factory |
| `tests/unit/trace/` | NEW - Test directory |

## Reuse Points

- **Event structure**: Inspired by `create_websocket_handlers()` JSON format (`src/bond/utils.py:34-86`)
- **Protocol pattern**: Follow `src/bond/tools/memory/_protocols.py` style
- **Storage pattern**: Similar to `AgentMemoryProtocol` but for events

## Quick Commands

```bash
# Run trace tests
uv run pytest tests/unit/trace/ -v

# Type check
uv run mypy src/bond/trace/

# Example usage (after implementation)
python -c "
from bond.trace import JSONFileTraceStore, create_capture_handlers, TraceReplayer
store = JSONFileTraceStore()
handlers, trace_id = create_capture_handlers(store)
print(f'Trace ID: {trace_id}')
"
```

## Acceptance

- [ ] `TraceEvent` model captures all 8 callback types
- [ ] `TraceStorageProtocol` defines storage interface
- [ ] `JSONFileTraceStore` implements protocol with file-based storage
- [ ] `create_capture_handlers()` returns working StreamHandlers
- [ ] `TraceReplayer` can iterate through stored traces
- [ ] All tests pass with >80% coverage on trace module
- [ ] `mypy` and `ruff` pass
- [ ] Documentation added to architecture page

## Open Questions

1. **Trace directory**: Use `.bond/traces/` or configurable path?
2. **Large tool results**: Truncate at what size? 1MB? 10MB?
3. **Crash handling**: How to mark incomplete traces? Separate "status" field?
4. **Event ordering**: Use monotonic clock + sequence number for guaranteed order?
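
For the large-tool-result question, one option (purely illustrative, since the spec leaves the policy open) is to cap the serialized size and record that truncation happened. The function name, the 1 MB default, and the `truncated` / `original_bytes` keys below are all assumptions.

```python
from __future__ import annotations

import json
from typing import Any

MAX_RESULT_BYTES = 1_000_000  # assumed limit; the spec leaves the value open


def truncate_result(result: Any, max_bytes: int = MAX_RESULT_BYTES) -> dict[str, Any]:
    """Return a payload fragment that never exceeds max_bytes of serialized result."""
    encoded = json.dumps(result, default=str).encode("utf-8")
    if len(encoded) <= max_bytes:
        return {"result": result, "truncated": False}
    # Keep a prefix of the serialized form; drop any partial multi-byte character.
    preview = encoded[:max_bytes].decode("utf-8", errors="ignore")
    return {"result": preview, "truncated": True, "original_bytes": len(encoded)}


small = truncate_result({"a": 1})
big = truncate_result("x" * 50, max_bytes=10)
```

Recording `original_bytes` alongside the preview lets a later replay show that data was dropped and how much.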

## References

- WebSocket handler pattern: `src/bond/utils.py:20-118`
- StreamHandlers dataclass: `src/bond/agent.py:28-73`
- Event sourcing StoredEvent: https://eventsourcing.readthedocs.io/
- OTel trace format: https://opentelemetry.io/docs/specs/semconv/gen-ai/

.flow/tasks/fn-2.1.json

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
{
  "assignee": "bordumbb@gmail.com",
  "claim_note": "",
  "claimed_at": "2026-01-24T16:43:07.860701Z",
  "created_at": "2026-01-24T16:19:55.586842Z",
  "depends_on": [],
  "epic": "fn-2",
  "evidence": {
    "commits": [
      "cdc4a2f"
    ],
    "prs": [],
    "tests": []
  },
  "id": "fn-2.1",
  "priority": null,
  "spec_path": ".flow/tasks/fn-2.1.md",
  "status": "done",
  "title": "Create GitHunter request models",
  "updated_at": "2026-01-24T16:45:23.392733Z"
}

.flow/tasks/fn-2.1.md

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
# fn-2.1 Create GitHunter request models

## Description
Create Pydantic request models for GitHunter tools in `src/bond/tools/githunter/_models.py`.

### Models to Create

```python
from pydantic import BaseModel, Field


class BlameLineRequest(BaseModel):
    """Request for blame_line tool."""
    repo_path: str  # String, converted to Path in tool
    file_path: str
    line_no: int = Field(ge=1, description="Line number (1-indexed)")


class FindPRDiscussionRequest(BaseModel):
    """Request for find_pr_discussion tool."""
    repo_path: str
    commit_hash: str = Field(min_length=7, description="Full or abbreviated SHA")


class GetExpertsRequest(BaseModel):
    """Request for get_expert_for_file tool."""
    repo_path: str
    file_path: str
    window_days: int = Field(default=90, ge=0, description="Days of history (0=all time)")
    limit: int = Field(default=3, ge=1, le=10, description="Max experts to return")
```

### Also Add

- `Error` model following `memory/_models.py:177-187` pattern
- Union types for return values: `BlameResult | Error`, etc.
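
On the consuming side, a union return such as `list[FileExpert] | Error` is typically narrowed with `isinstance` before use. The sketch below uses plain dataclasses as stand-ins: the fields of `FileExpert` and `Error`, and the `summarize` helper, are hypothetical and only illustrate the narrowing pattern.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FileExpert:
    """Hypothetical stand-in for the type in `_types.py`."""
    name: str
    commit_count: int


@dataclass(frozen=True)
class Error:
    """Hypothetical stand-in for the Error return model."""
    description: str


def summarize(result: list[FileExpert] | Error) -> str:
    """Narrow the union first, then use the success value."""
    if isinstance(result, Error):
        return f"lookup failed: {result.description}"
    names = ", ".join(expert.name for expert in result)
    return f"experts: {names}"


failure_text = summarize(Error("repo not found"))
success_text = summarize([FileExpert("alice", 12), FileExpert("bob", 3)])
```

A type checker such as `mypy` follows the `isinstance` check, so the success branch is typed as `list[FileExpert]` without a cast.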

### Reference Files

- Pattern: `src/bond/tools/memory/_models.py`
- Types: `src/bond/tools/githunter/_types.py` (BlameResult, PRDiscussion, FileExpert)
## Acceptance
- [ ] `_models.py` exists with BlameLineRequest, FindPRDiscussionRequest, GetExpertsRequest
- [ ] All models have Field validators (ge, min_length, etc.)
- [ ] Error model exists for union return types
- [ ] `mypy src/bond/tools/githunter/_models.py` passes
- [ ] `ruff check src/bond/tools/githunter/_models.py` passes
## Done summary
Created _models.py with GitHunter request models:
- BlameLineRequest (repo_path, file_path, line_no with ge=1 validator)
- FindPRDiscussionRequest (repo_path, commit_hash with min_length=7 validator)
- GetExpertsRequest (repo_path, file_path, window_days=90 default, limit=3 default)
- Error model for union return types in tool responses

All models follow the Annotated[..., Field(...)] pattern from memory/_models.py.
Passed mypy and ruff checks.
## Evidence
- Commits: cdc4a2f
- Tests:
- PRs:

.flow/tasks/fn-2.2.json

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
{
  "assignee": "bordumbb@gmail.com",
  "claim_note": "",
  "claimed_at": "2026-01-24T16:45:49.269695Z",
  "created_at": "2026-01-24T16:20:03.882583Z",
  "depends_on": [
    "fn-2.1"
  ],
  "epic": "fn-2",
  "evidence": {
    "commits": [
      "2d87f6a"
    ],
    "prs": [],
    "tests": []
  },
  "id": "fn-2.2",
  "priority": null,
  "spec_path": ".flow/tasks/fn-2.2.md",
  "status": "done",
  "title": "Implement GitHunter tool functions",
  "updated_at": "2026-01-24T16:46:36.650352Z"
}
