Commit b68b629

feat: bt setup install evals, either in the background or in the TUI chosen
1 parent 8d86d06 commit b68b629

5 files changed

Lines changed: 455 additions & 74 deletions

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+# Braintrust URL Formats
+
+## App Links (Current Format)
+
+### Experiments
+
+`https://www.braintrust.dev/app/{org}/p/{project}/experiments/{experiment_name}?r={root_span_id}&s={span_id}`
+
+### Datasets
+
+`https://www.braintrust.dev/app/{org}/p/{project}/datasets/{dataset_name}?r={root_span_id}`
+
+### Project Logs
+
+`https://www.braintrust.dev/app/{org}/p/{project}/logs?r={root_span_id}&s={span_id}`
+
+## Legacy Object URLs
+
+`https://www.braintrust.dev/app/object?object_type=...&object_id=...&id=...`
+
+## URL Parameters
+
+| Parameter | Description |
+| --------- | --------------------------------------------------------- |
+| r | The root_span_id - identifies a trace |
+| s | The span_id - identifies a specific span within the trace |
+| id | Legacy parameter for root_span_id in object URLs |
+
+## Notes
+
+- The `r=` parameter is always the root_span_id
+- For logs and experiments, use `s=` to reference a specific span within a trace
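
The parameter table above can be read back out of a link with standard URL parsing. The following is a minimal Python sketch, not part of the commit; the function name `extract_span_ids` is hypothetical, and it assumes only what the table states: `r` carries the root_span_id, `s` the span_id, and legacy object URLs carry the root_span_id in `id`.

```python
from urllib.parse import urlparse, parse_qs


def extract_span_ids(url: str) -> dict:
    """Return the root_span_id and span_id found in a Braintrust app URL.

    Hypothetical helper for illustration; it relies only on the query
    parameters documented in the table (r, s, and legacy id).
    """
    query = parse_qs(urlparse(url).query)
    # `r` is the root_span_id; legacy object URLs use `id` instead.
    root = query.get("r") or query.get("id")
    span = query.get("s")
    return {
        "root_span_id": root[0] if root else None,
        "span_id": span[0] if span else None,
    }
```

A link with no `s=` parameter yields `span_id` of `None`, which matches the note that `s=` is only used to point at a specific span within a trace.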

skills/sdk-install/csharp.md

Lines changed: 1 addition & 1 deletion
@@ -152,4 +152,4 @@ if (braintrust != null && activitySource != null)
 
 The final assistant response must include the printed URL.
 
-If the SDK-generated URL is not available, fall back to the generic MCP permalink workflow described in the agent task (Step 5).
+If the SDK-generated URL is not available, construct the permalink manually using the URL format documented in `braintrust-url-formats.md` as described in the agent task (Step 5).

skills/sdk-install/instrument-task.md

Lines changed: 18 additions & 21 deletions
@@ -2,6 +2,8 @@
 
 ## Hard Rules
 
+{RUN_MODE_CONTEXT}
+
 - **Only add Braintrust code.** Do not refactor or modify unrelated code.
 - **Pin exact versions.** Never use `latest`.
 - **Set the project name in code.** Do NOT configure project name via env vars.
@@ -55,7 +57,7 @@ If the language is not obvious from standard build/dependency files:
 
 - infer it from concrete repo evidence (e.g., entrypoint file extensions, build scripts, framework config)
 - State the single strongest piece of evidence you used
-- If still ambiguous (polyglot/monorepo), ask the user which service/app to instrument
+- If still ambiguous (polyglot/monorepo), ask the user which service/app to instrument and wait for the response before proceeding
 - If the inferred language is not in the supported list, **abort the install**.
 
 If none match, **abort installation**.
@@ -66,8 +68,8 @@ If none match, **abort installation**.
 
 Read the install guide for the detected language from the local docs:
 
-| Language | Local doc |
-| ---------- | --------------------------------------- |
+| Language | Local doc |
+| ---------- | --------------------------------- |
 | Java | `{SDK_INSTALL_DIR}/java.md` |
 | TypeScript | `{SDK_INSTALL_DIR}/typescript.md` |
 | Python | `{SDK_INSTALL_DIR}/python.md` |
@@ -90,34 +92,29 @@ Requirements:
 - Confirm no runtime errors.
 - Confirm the app still runs if `BRAINTRUST_API_KEY` is unset.
 
-If you do not know how to run the app, ask the user.
+If you do not know how to run the app, ask the user and wait for the response before proceeding.
 
 ---
 
 ### 5. Verify in Braintrust (CRITICAL)
 
-Using the Braintrust MCP (preferred):
+The permalink must be included in the final output. This confirms the full installation succeeded.
 
-1. Query for the emitted logs/traces.
-2. Generate a **permalink to the data**.
-3. Print the permalink clearly.
+The project must be set in code during installation — do not guess the project name from context.
 
-The permalink must be included in the final output.
-This confirms the full installation succeeded.
+**How to obtain the permalink:**
 
-Notes:
+Most language SDKs print a direct URL to the emitted trace after the app runs. Capture that URL and print it.
 
-- The agent must not "guess" the project from Braintrust UI. The project must be set in code during installation.
-- If a language SDK provides a deterministic URL to the emitted trace/log (e.g. a `/logs?r=<traceId>&s=<spanId>` link), it is acceptable to print that as the permalink, but it still must point to the specific emitted data.
+If the SDK does not print a URL, construct one manually using the URL format documented in `{SDK_INSTALL_DIR}/braintrust-url-formats.md`:
 
-Minimal MCP workflow to generate a permalink (use this if the SDK does not provide a deterministic URL):
+```
+https://www.braintrust.dev/app/{org}/p/{project_name}/logs?r={root_span_id}
+```
 
-1. Resolve the project ID using the project name that was configured in code:
-- Call `resolve_object` with `object_type="project_logs"` and `project_name=<your project name>`
-2. Find the newest emitted row in that project:
-- Call `sql_query` with `object_type="project_logs"`, `object_ids=[<project id>]`, and a time filter, e.g. `created > now() - interval 1 hour`, ordered by `created DESC`, `limit 1`
-3. Generate a permalink to that row:
-- Call `generate_permalink` with `object_type="project_logs"`, `object_id=<project id>`, `row_id=<row id from sql_query>`
+- `org`: your Braintrust organization slug
+- `project_name`: the project name set in code
+- `root_span_id`: the trace/span ID returned or logged by the SDK
 
 ---
 
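
The Step 5 text in this hunk describes a two-stage rule: prefer a URL the SDK printed, otherwise build one from the documented logs format. A rough Python sketch of that rule follows; it is illustrative only and not part of the commit. The helper names, the regular expression, and the choice to percent-encode the org and project name are all assumptions, as is the idea that the SDK-printed URL appears somewhere in the captured app output.

```python
import re
from typing import Optional
from urllib.parse import quote, urlencode

APP_BASE = "https://www.braintrust.dev/app"
# Assumption: an SDK-printed link starts with the app base URL.
URL_PATTERN = re.compile(r"https://www\.braintrust\.dev/app/\S+")


def build_logs_permalink(org: str, project_name: str, root_span_id: str,
                         span_id: Optional[str] = None) -> str:
    """Build a project-logs link in the documented format (hypothetical helper)."""
    params = {"r": root_span_id}
    if span_id:
        params["s"] = span_id
    # Assumption: org and project name are percent-encoded as path segments.
    path = f"{quote(org, safe='')}/p/{quote(project_name, safe='')}/logs"
    return f"{APP_BASE}/{path}?{urlencode(params)}"


def permalink_from_output(output: str, org: str, project_name: str,
                          root_span_id: str) -> str:
    """Prefer a URL found in the app output; otherwise construct one."""
    match = URL_PATTERN.search(output)
    if match:
        # Drop trailing punctuation that often follows a URL in log text.
        return match.group(0).rstrip(".,)")
    return build_logs_permalink(org, project_name, root_span_id)
```

For example, `permalink_from_output(app_stdout, "acme", "demo", root_span_id)` returns the first Braintrust app link found in `app_stdout`, and only falls back to the constructed logs URL when none is present.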

@@ -130,4 +127,4 @@ Summarize:
 - What logs/traces were emitted
 - The Braintrust permalink (required)
 
-{WORKFLOW_CONTEXT}
+{WORKFLOW_CONTEXT}
