feat(runbooks): local runbook store with runbook-aware diagnosis grounding (#1073 phase 2a)#2029
Conversation
Greptile code reviewThis repo uses Greptile for automated review. Before merge, aim for Confidence Score: 5/5 with zero unresolved review threads — see CONTRIBUTING.md. Run a review — add a PR comment with: Give it ~5-10 minutes (sometimes longer) for results, then fix feedback and re-trigger until you reach Confidence Score: 5/5. Optional: automate with the greploop skill. |
Greptile SummaryThis PR adds a local markdown runbook store and runbook-aware reasoning to the investigation pipeline (Phase 2a of #1073). All previously flagged issues from earlier review rounds have been addressed in this revision.
Confidence Score: 5/5Safe to merge — all previously identified defects are resolved and the new runbook path is wrapped to never block an investigation. Every previously flagged issue has been corrected: the len(w) >= 3 keyword filter is consistent across pipeline, test suite, and docs; multi-word triggers are matched with the walrus-operator all() approach; commonLabels: null is handled by No files require special attention. Important Files Changed
Sequence DiagramsequenceDiagram
participant P as pipeline.py
participant RS as runbooks/store.py
participant RR as runbooks/retrieval.py
participant PR as prompt.py
participant RC as report_context.py
participant RF as report.py (Slack)
P->>RS: load_all() → list[Runbook]
P->>RR: retrieve_matching_runbook(runbooks, keywords, service, pipeline_name)
RR-->>P: "Runbook | None"
P->>P: matched.to_dict() → dict
Note over P: _merge(state, {matched_runbook: dict})
P->>PR: format_alert_context(state)
PR->>PR: _build_runbook_section(state[matched_runbook])
PR-->>P: alert context + runbook block
P->>RC: build_report_context(state)
RC->>RC: extract runbook_provenance from matched_runbook
RC-->>P: ReportContext with runbook_provenance
P->>RF: format_slack_message(ctx) / build_slack_blocks(ctx)
RF->>RF: "append _Source: runbooks/<slug>.md_ line/context block"
RF-->>P: Slack message with runbook provenance
Reviews (9): Last reviewed commit: "fix(runbooks): handle null commonLabels ..." | Re-trigger Greptile |
|
@greptile review |
|
@greptile review |
…lic RUNBOOK_DIR, fix docs
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
cf6595d to
6ab2a18
Compare
|
@greptile review |
…d double parse in save()
|
@greptile review |
VibhorGautam
left a comment
There was a problem hiding this comment.
this is a good shape overall - store, CLI, prompt section, and report provenance all tied together
two prompt-section nits:
_build_runbook_section cuts the body at 2000 chars, so it can split a sentence or code block. id look for the last newline before the cutoff and add a truncated marker so the agent doesnt treat a partial runbook as complete
_build_runbook_section falls back to unknown for slug, but report_context.py uses an empty string in provenance. small thing, but the report could look weird if a runbook is missing a slug
i couldnt tell from this diff where _retrieve_runbook decides which runbook matches. is that already in main, or coming in another PR? thats the part id want to sanity-check since the prompt asks the agent to prefer runbook actions
tmp_path + monkeypatch coverage looks right
Hi @VibhorGautam
|
…ard missing slug in prompt
|
ah missed that, thanks for the pointer - looks like the matching logic is solid then |
|
@greptile review |
…nstead of traceback
|
@greptile review |
|
good fix on the ValueError catch, clean and minimal greptile's right about the commonLabels null case too. worth scanning the rest of |
|
@greptile review |
|
hey @devankitjuneja thank you for the PR, we will need some sort of confirmation from @VaibhavUpreti and @davincios before going with the merge and approval and reviews. |
Sure :) |
|
Hi @VaibhavUpreti |
| @runbook.command("add") | ||
| @click.argument("path", type=click.Path(exists=True, dir_okay=False, path_type=Path)) | ||
| def runbook_add(path: Path) -> None: | ||
| """Copy a markdown runbook into ~/.config/opensre/runbooks/.""" |
There was a problem hiding this comment.
Let's update all the runbook DIR to .opensre/runbooks
VaibhavUpreti
left a comment
There was a problem hiding this comment.
@devankitjuneja great work so far, could you please add a demo video of a grafana alert by running opensre investigate -i <file_path>, before and after you added the runbook.
After loading integration the first step should be to load the runbook in the planning step.
Sure @VaibhavUpreti |
Relates to #1073
Describe the changes you have made in this PR -
Adds a local markdown runbook store and runbook-aware reasoning to the investigation pipeline (Phase 2a).
What's included:
Runbook format:
Demo/Screenshot for feature changes and bug fixes -
Code Understanding and AI Usage
Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?
If you used AI assistance:
Explain your implementation approach:
Retrieval is deterministic (no LLM, no vector DB) — score = service name match + keyword overlap against `triggers:` frontmatter. Top-1 result is written to state before the ReAct loop so `format_alert_context()` can append the runbook body to the user message. Injection is in the alert context (not system prompt) because `build_system_prompt` is stateless. Template fallback from Phase 1 stays active when no runbook matches.
Checklist before requesting a review