Commit 84861d1

author
Cody McCodePants
committed
Remove trace-aware LLM client helper
1 parent 760d4ee commit 84861d1

8 files changed

Lines changed: 10 additions & 119 deletions

docs/content/docs/build-your-agent/custom-agents.mdx

Lines changed: 2 additions & 2 deletions
@@ -68,9 +68,9 @@ The reference LLM agent builds a compact prompt from topology size, symptom keys
 
 Return structured fields rather than only natural language. Free-form explanations are useful for review, but scoring depends on verdict, fault type, and location fields.
 
-For reproducibility, NetOpsBench saves a per-case runtime trace beside the raw scenario result. Agents that need private LLM message traces should use `context.trace.llm_client(...)` or a NetOpsBench-provided framework callback such as `context.trace.langchain_callback()`. The harness writes ATIF v1.7 `trajectory.atif.json` artifacts for Harbor-style inspection while keeping ground truth out of the agent trajectory; scoring details are linked separately through `traces/results.jsonl`. Use `netopsbench trace view` to sync trace-enabled runs into the local Harbor viewer cache, or `netopsbench trace view <run_id>` to ensure a specific saved run is available in the viewer.
+For reproducibility, NetOpsBench saves a per-case runtime trace beside the raw scenario result. The bundled reference agent captures private LLM and tool events by attaching `context.trace.langchain_callback()` to its LangChain-compatible runtime. Custom non-LangChain agents can use advanced manual recorder methods such as `context.trace.record_llm_request(...)` and `context.trace.record_llm_response(...)` when they need private model calls in the trace. The harness writes ATIF v1.7 `trajectory.atif.json` artifacts for Harbor-style inspection while keeping ground truth out of the agent trajectory; scoring details are linked separately through `traces/results.jsonl`. Use `netopsbench trace view` to sync trace-enabled runs into the local Harbor viewer cache, or `netopsbench trace view <run_id>` to ensure a specific saved run is available in the viewer.
 
-Trace storage preserves visible agent-environment interactions with secret redaction and per-field size limits. NetOpsBench does not monkeypatch arbitrary LLM SDKs, so fully private model prompts and responses are captured only when the agent uses the trace-aware client or callback.
+Trace storage preserves visible agent-environment interactions with secret redaction and per-field size limits. NetOpsBench does not monkeypatch arbitrary LLM SDKs, so fully private model prompts and responses are captured only when the agent uses a supported framework callback or the manual recorder methods.
 
 ## Reference agent
 

docs/content/docs/build-your-agent/python-api-guide.mdx

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@ The `scenario_summaries[*].raw_result_path` fields point to raw JSON files for c
 
 Agent traces are saved by default and can be disabled for a run with `trace=False` or by setting `NETOPSBENCH_TRACE=0`. Disabling trace prevents private runtime trace collection and sidecar artifact creation. Ground truth and score details are written to `traces/results.jsonl`, not into the agent trajectory.
 
-NetOpsBench stores visible prompts, model messages, tool calls, and observations with secret redaction and per-field truncation. For private LLM message capture, custom agents should call models through `context.trace.llm_client(...)` or attach `context.trace.langchain_callback()` to LangChain-compatible runtimes. Set `NETOPSBENCH_TRACE_MAX_FIELD_CHARS` to tune truncation.
+NetOpsBench stores visible prompts, model messages, tool calls, and observations with secret redaction and per-field truncation. The bundled `MinimalDeepAgent` attaches `context.trace.langchain_callback()` to its LangChain-compatible runtime so private LLM messages and tool events flow into the same recorder. Non-LangChain agents can use the advanced manual recorder methods, such as `context.trace.record_llm_request(...)` and `context.trace.record_llm_response(...)`, when they need to capture private model calls. Set `NETOPSBENCH_TRACE_MAX_FIELD_CHARS` to tune truncation.
 
 Open a completed run directly in the Harbor viewer:
 

docs/content/docs/quickstart.mdx

Lines changed: 1 addition & 1 deletion
@@ -69,7 +69,7 @@ Supported provider presets:
 
 | `--vendor` | Model | Environment variable |
 |---|---|---|
-| `openai` | gpt-5.4 | `OPENAI_API_KEY` |
+| `openai` | gpt-5.5 | `OPENAI_API_KEY` |
 | `minimax` | MiniMax-M3 | `MINIMAX_API_KEY` |
 | `deepseek` | deepseek-v4-pro | `DEEPSEEK_API_KEY` |
 | `zhipu` | glm-5.1 | `ZHIPU_API_KEY` |

examples/agents/README.md

Lines changed: 1 addition & 1 deletion
@@ -47,7 +47,7 @@ To switch provider and endpoint explicitly:
 ```python
 agent = MinimalDeepAgent(
     vendor="openai",
-    model="gpt-5.4",
+    model="gpt-5.5",
     base_url="https://api.openai.com/v1",
 )
 ```

examples/agents/minimal_deepagent/agent.py

Lines changed: 3 additions & 3 deletions
@@ -9,7 +9,7 @@
 - ``providers/minimax.py`` — MiniMax-M3
 - ``providers/glm.py`` — ZhipuAI GLM-5.1
 - ``providers/deepseek.py`` — DeepSeek deepseek-v4-pro (thinking mode disabled)
-- ``providers/openai.py`` — OpenAI-compatible endpoint (gpt-5.4)
+- ``providers/openai.py`` — OpenAI-compatible endpoint (gpt-5.5)
 
 The shared output schema (``DiagnosisOutput``) lives in ``schema.py``.
 Shared runtime and result helpers live in ``providers/runtime.py`` and
@@ -20,7 +20,7 @@
 - ``minimax`` — MiniMax-M3 (default)
 - ``zhipu`` — GLM-5.1 (ZhipuAI)
 - ``deepseek`` — deepseek-v4-pro (DeepSeek)
-- ``openai`` — gpt-5.4 via OpenAI-compatible endpoint
+- ``openai`` — gpt-5.5 via OpenAI-compatible endpoint
 
 Dependencies (install with ``pip install deepagents langchain-openai langchain-mcp-adapters``):
 - deepagents
@@ -69,7 +69,7 @@ class MinimalDeepAgent:
 agent = MinimalDeepAgent(vendor="minimax") # MiniMax-M3 (default)
 agent = MinimalDeepAgent(vendor="zhipu") # GLM-5.1
 agent = MinimalDeepAgent(vendor="deepseek") # deepseek-v4-pro
-agent = MinimalDeepAgent(vendor="openai") # gpt-5.4
+agent = MinimalDeepAgent(vendor="openai") # gpt-5.5
 
 Explicit ``model``, ``base_url``, or ``api_key`` kwargs override the
 vendor preset.
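
Reviewer note: the override rule stated in this docstring can be pictured as a preset lookup followed by keyword overrides. The sketch below is illustrative only and makes several assumptions: `PRESETS`, `resolve_settings`, and the dictionary layout are hypothetical names, not the structure of the `providers/` package. Only the OpenAI model name, base URL, and environment variable come from this commit's docs and tests.

```python
import os

# Hypothetical preset table; only the "openai" values are taken from this commit.
PRESETS = {
    "openai": {
        "model": "gpt-5.5",
        "base_url": "https://api.openai.com/v1",
        "api_key_env": "OPENAI_API_KEY",
    },
}


def resolve_settings(vendor, *, model=None, base_url=None, api_key=None):
    """Return effective settings; explicit kwargs win over the vendor preset."""
    preset = PRESETS[vendor]
    return {
        "model": model or preset["model"],
        "base_url": base_url or preset["base_url"],
        # Fall back to the preset's environment variable when no key is passed.
        "api_key": api_key or os.environ.get(preset["api_key_env"]),
    }
```

With this shape, `resolve_settings("openai")` yields the preset defaults, while `resolve_settings("openai", model="my-model")` changes only the model and keeps the preset endpoint.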

netopsbench/agents/tracing.py

Lines changed: 1 addition & 72 deletions
@@ -6,7 +6,7 @@
 import threading
 import uuid
 from datetime import UTC, datetime
-from typing import Any, cast
+from typing import Any
 
 from netopsbench.agents._trace_utils import jsonable as _jsonable
 
@@ -34,26 +34,6 @@ def disabled(cls) -> AgentTraceRecorder:
 
         return cls(enabled=False)
 
-    def llm_client(
-        self,
-        provider: str = "openai",
-        *,
-        model: str,
-        api_key: str | None = None,
-        base_url: str | None = None,
-        **client_kwargs: Any,
-    ) -> TraceAwareLLMClient:
-        """Return an OpenAI-compatible chat client that records visible messages."""
-
-        return TraceAwareLLMClient(
-            recorder=self,
-            provider=provider,
-            model=model,
-            api_key=api_key,
-            base_url=base_url,
-            client_kwargs=client_kwargs,
-        )
-
     def langchain_callback(self) -> Any | None:
         """Return a LangChain callback handler that writes into this recorder."""
 
@@ -371,57 +351,6 @@ def _accumulate_usage(self, usage: dict[str, int]) -> None:
         self._metrics["total_tokens"] += usage["total_tokens"]
         self._metrics["llm_call_count"] += usage["has_usage"] or 1
 
-
-class TraceAwareLLMClient:
-    """Small OpenAI-compatible chat client wrapper that records requests and responses."""
-
-    def __init__(
-        self,
-        *,
-        recorder: AgentTraceRecorder,
-        provider: str,
-        model: str,
-        api_key: str | None,
-        base_url: str | None,
-        client_kwargs: dict[str, Any],
-    ):
-        self.recorder = recorder
-        self.provider = provider
-        self.model = model
-        self.api_key = api_key
-        self.base_url = base_url
-        self.client_kwargs = dict(client_kwargs)
-
-    async def chat(self, messages: list[dict[str, Any]], **kwargs: Any) -> Any:
-        try:
-            from openai import AsyncOpenAI
-        except Exception as exc:  # pragma: no cover - depends on optional agent deps
-            raise RuntimeError("openai is required for context.trace.llm_client().chat()") from exc
-
-        run_id = self.recorder.record_llm_request(
-            messages,
-            model=self.model,
-            provider=self.provider,
-        )
-        client = AsyncOpenAI(api_key=self.api_key, base_url=self.base_url, **self.client_kwargs)
-        try:
-            response = await client.chat.completions.create(
-                model=self.model,
-                messages=cast(Any, messages),
-                **kwargs,
-            )
-        except Exception as exc:
-            self.recorder.record_error(stage="llm", error=exc, run_id=run_id)
-            raise
-        self.recorder.record_llm_response(
-            response,
-            run_id=run_id,
-            model=self.model,
-            provider=self.provider,
-        )
-        return response
-
-
 def _message_payload(message: Any, *, index: int) -> dict[str, Any]:
     payload: dict[str, Any] = {
         "index": index,

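Migration note: agents that relied on the removed `context.trace.llm_client(...)` can likely reproduce its behavior with the recorder methods that the deleted `TraceAwareLLMClient.chat` body called. The following is a minimal sketch under stated assumptions, not the project's recommended API: `traced_chat` and `call_model` are hypothetical names, and the recorder calls copy the argument shapes visible in the removed code (`record_llm_request`, `record_error`, `record_llm_response`).

```python
from typing import Any, Awaitable, Callable


async def traced_chat(
    recorder: Any,
    call_model: Callable[[list[dict[str, Any]]], Awaitable[Any]],
    messages: list[dict[str, Any]],
    *,
    model: str,
    provider: str = "openai",
) -> Any:
    """Hypothetical helper: wrap one private model call in recorder events."""
    # Record the outgoing messages; the returned id links the later
    # response or error event to this request.
    run_id = recorder.record_llm_request(messages, model=model, provider=provider)
    try:
        response = await call_model(messages)
    except Exception as exc:
        # Record the failure, then re-raise so the caller still sees it.
        recorder.record_error(stage="llm", error=exc, run_id=run_id)
        raise
    recorder.record_llm_response(response, run_id=run_id, model=model, provider=provider)
    return response
```

Inside an agent, `recorder` would be `context.trace` and `call_model` would wrap whatever SDK the agent already uses. Keeping the SDK call in a caller-supplied coroutine avoids tying the helper to the `openai` package, which was an optional dependency of the removed client.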
tests/test_example_agents.py

Lines changed: 1 addition & 1 deletion
@@ -199,7 +199,7 @@ def test_minimal_deepagent_openai_defaults_and_openai_key(monkeypatch):
     agent = MinimalDeepAgent(vendor="openai")
 
     assert agent.api_key == "openai-shell-key"
-    assert agent.model == "gpt-5.4"
+    assert agent.model == "gpt-5.5"
     assert agent.base_url == "https://api.openai.com/v1"
 
 

tests/test_session_tracing.py

Lines changed: 0 additions & 38 deletions
@@ -1,54 +1,16 @@
 from __future__ import annotations
 
 import json
-import sys
 from datetime import UTC, datetime
 from types import SimpleNamespace
 
-import pytest
 from harbor.viewer.scanner import JobScanner
 
 from netopsbench.agents.base import DiagnosticContext
 from netopsbench.agents.tracing import AgentTraceRecorder
 from netopsbench.platform.session.tracing import TraceWriter, export_traces, load_trace_index
 
 
-@pytest.mark.asyncio
-async def test_trace_aware_llm_client_records_private_messages(monkeypatch):
-    captured = {}
-
-    class FakeCompletions:
-        async def create(self, **kwargs):
-            captured.update(kwargs)
-            return SimpleNamespace(
-                choices=[SimpleNamespace(message=SimpleNamespace(type="ai", content="diagnosis draft"))],
-                usage=SimpleNamespace(prompt_tokens=7, completion_tokens=3, total_tokens=10),
-            )
-
-    class FakeAsyncOpenAI:
-        def __init__(self, **kwargs):
-            captured["client"] = kwargs
-            self.chat = SimpleNamespace(completions=FakeCompletions())
-
-    monkeypatch.setitem(sys.modules, "openai", SimpleNamespace(AsyncOpenAI=FakeAsyncOpenAI))
-    recorder = AgentTraceRecorder()
-
-    response = await recorder.llm_client("openai", model="gpt-test", api_key="test-key").chat(
-        [{"role": "user", "content": "diagnose"}],
-        temperature=0,
-    )
-
-    assert response.choices[0].message.content == "diagnosis draft"
-    assert captured["model"] == "gpt-test"
-    assert captured["messages"] == [{"role": "user", "content": "diagnose"}]
-    assert recorder.metrics()["input_tokens"] == 7
-    assert recorder.metrics()["output_tokens"] == 3
-    steps = recorder.to_steps()
-    assert [step["message"] for step in steps] == ["diagnosis draft"]
-    assert steps[0]["duration_seconds"] is not None
-    assert steps[0]["extra"]["llm_request"]["messages"][0]["content"] == "diagnose"
-
-
 def test_disabled_trace_recorder_preserves_api_without_collecting():
     recorder = AgentTraceRecorder.disabled()
     run_id = recorder.record_llm_request([{"role": "user", "content": "diagnose"}], model="gpt-test")
