vinodhalaharvi/provenance

Provenance

An autonomous incident-response coordinator where every finding traces back to the specific arrow that produced it. Submission to the Find Evil! hackathon.

Built on weft for typed arrow composition and the SANS SIFT Workstation for the underlying forensics tooling.

The thesis

Existing AI-assisted DFIR tooling (Protocol SIFT and similar) gives an LLM agent a generic execute_shell_cmd plus a long system prompt and hopes it stays on the rails. The hackathon brief flags this as the source of the autonomy gap: hallucination, evidence spoliation risk, loops that don't terminate.

Provenance replaces that surface with a typed action space and a critic-mediated policy:

  • Typed arrows wrap each SIFT tool. Inputs and outputs are Go structs, not raw shell text. An agent cannot construct invalid command lines, scan outside the case directory, or run a Windows plugin on a Linux image — those misconfigurations are not expressible.
  • Multi-agent loop with specialist proposers (memory / disk / network) and an independent critic that scores proposals before any arrow runs. Specialists are stateless across iterations; the evidence bag is the only memory.
  • Deterministic coordinator in Go (no LLM in the loop body) enforces termination, audit-trail emission, and proposal selection.
  • Functional seams for every behavioral choice. Selection policy, critic invocation strategy, proposal collection, and termination conditions are all swappable closures.

Quick start

make all          # tidy, build, test
make test         # ~25 unit tests, no SIFT install required
make test-verbose # show every test by name
make cover        # coverage report
make run          # run the mcp-server binary

The tests require nothing beyond Go and this repository — every SIFT tool invocation is unit-tested via a fake runner.RunFunc against captured fixture output. The production binary (bin/mcp-server) expects the real SIFT tools (vol, yara, ...) on PATH and is intended to be deployed inside the SIFT Workstation VM.

Requires Go 1.23 or newer.

Layout

provenance/
├── runner/                 functional seam for subprocess execution
│   └── runner.go             RunFunc, RealRun, NewFake
├── sift/                   typed SIFT arrows wrapping forensics tools
│   ├── types.go              Process, Timeline, YaraReport, ...
│   ├── pslist.go             Volatility 3 pslist (Win/Linux dispatch)
│   ├── yara.go               YARA scan with case-root path validation
│   ├── triage.go             composed Par(memory, disk) -> synthesis arrow
│   └── fixtures/             captured tool outputs for tests
├── coord/                  the multi-agent coordinator
│   ├── types.go              EvidenceBag, Proposal, Verdict, enums
│   ├── predicate.go          DSL parser + field registry + evaluator
│   ├── seams.go              four behavioral seams + defaults
│   └── coordinator.go        the loop, exposed as a weft.Arrow
├── mcpadapter/             generic weft.Arrow -> MCP tool wrapper
│   └── register.go           one function: RegisterArrow[In, Out]
└── cmd/mcp-server/         production binary serving MCP over stdio

The functional seams (one principle, applied everywhere)

Every test/production boundary in this repo is a function type, not an interface. Function types cannot be type-asserted back to a concrete implementation — there is no concrete type behind the seam, only a closure. This makes the boundary opaque and prevents the class of bug where someone reaches through the seam to bypass it.

The seams:

| Seam | Type | Where |
| --- | --- | --- |
| Subprocess runner | RunFunc | runner/ |
| Trace emission | TraceTap | coord/types.go |
| Proposal collection | CollectProposalsFn | coord/seams.go |
| Critic invocation | InvokeCriticFn | coord/seams.go |
| Selection policy | SelectFn | coord/seams.go |
| Termination check | TerminationCondition | coord/seams.go |
| Arrow execution | ArrowExecutor | coord/types.go |

Each ships with a sensible default; each can be replaced field-by-field on the Coordinator struct before calling Run().
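A minimal sketch of what field-by-field replacement might look like. The seam names `CollectProposalsFn`, `SelectFn`, and `TerminationCondition` appear in the table above, but every signature, the `Proposal` shape, the field names on `Coordinator`, and the helper functions are assumptions made for this example. Critic scoring is folded into the collection stub to keep it short.

```go
package main

import "fmt"

// Proposal is a simplified, hypothetical shape: an arrow name plus a critic score.
type Proposal struct {
	Arrow string
	Score float64
}

// The seams are plain function types. Their signatures here are assumptions.
type (
	CollectProposalsFn   func(iteration int) []Proposal
	SelectFn             func(scored []Proposal) (Proposal, bool)
	TerminationCondition func(iteration int, executed []string) bool
)

// Coordinator holds one exported field per seam so each can be replaced before Run.
type Coordinator struct {
	Collect CollectProposalsFn
	Select  SelectFn
	Done    TerminationCondition
}

// highestScore is the default selection policy in this sketch.
func highestScore(ps []Proposal) (Proposal, bool) {
	if len(ps) == 0 {
		return Proposal{}, false
	}
	best := ps[0]
	for _, p := range ps[1:] {
		if p.Score > best.Score {
			best = p
		}
	}
	return best, true
}

// maxIterations builds a termination condition that stops after n iterations.
func maxIterations(n int) TerminationCondition {
	return func(iteration int, _ []string) bool { return iteration >= n }
}

// NewCoordinator wires the defaults; callers override fields afterwards.
func NewCoordinator(collect CollectProposalsFn) *Coordinator {
	return &Coordinator{Collect: collect, Select: highestScore, Done: maxIterations(3)}
}

// Run is the deterministic loop: check termination, collect, select, record.
func (c *Coordinator) Run() []string {
	var executed []string
	for i := 0; !c.Done(i, executed); i++ {
		p, ok := c.Select(c.Collect(i))
		if !ok {
			break
		}
		executed = append(executed, p.Arrow)
	}
	return executed
}

// stubCollect is a deterministic stand-in for specialists plus critic scoring.
func stubCollect(int) []Proposal {
	return []Proposal{{Arrow: "yara", Score: 0.4}, {Arrow: "pslist", Score: 0.9}}
}

func main() {
	c := NewCoordinator(stubCollect)
	fmt.Println(c.Run()) // defaults: highest score, three iterations

	// Replace two seams field-by-field, as a test might.
	c.Select = func(ps []Proposal) (Proposal, bool) {
		if len(ps) == 0 {
			return Proposal{}, false
		}
		return ps[0], true
	}
	c.Done = maxIterations(1)
	fmt.Println(c.Run())
}
```

Since each field holds a closure rather than an interface value, there is nothing for a caller to type-assert against; the only way to change behavior is to assign a different function.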

Architectural guardrails

These read-only and integrity properties are enforced before any subprocess is spawned or any LLM is called:

  1. Volatility plugin dispatch is profile-driven Go code. A MemoryImage.Profile starting with "Win" selects windows.pslist; anything else selects linux.pslist. No agent-controlled string could pick the wrong plugin.
  2. YARA target paths must resolve inside CaseRoot after cleaning and filepath.Rel normalization. ../../etc/passwd is rejected before exec.Command is called. CaseRoot must be absolute.
  3. Predicate DSL parses at proposal-receipt time. Malformed predicates and references to unknown fields are caught before the critic or the executor sees them.
  4. The coordinator loop is deterministic Go. Termination conditions, proposal validation, and audit-trail emission never run through an LLM, so prompt injection cannot coax any of them off the rails.
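Guardrails 1 and 2 can be sketched in a few lines of standard-library Go. This is an illustration under assumptions, not the repository's code: `pslistPlugin` and `resolveInCaseRoot` are hypothetical function names, the `MemoryImage` fields beyond `Profile` are guesses, and the example paths assume a Unix-style filesystem.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// MemoryImage is a simplified input struct; only Profile is named in the README.
type MemoryImage struct {
	Path    string
	Profile string
}

// pslistPlugin picks the Volatility 3 plugin from the profile alone.
// No caller-supplied plugin string is ever consulted.
func pslistPlugin(img MemoryImage) string {
	if strings.HasPrefix(img.Profile, "Win") {
		return "windows.pslist"
	}
	return "linux.pslist"
}

// resolveInCaseRoot returns the cleaned absolute path for target, or an error
// if the case root is relative or the target falls outside it.
func resolveInCaseRoot(caseRoot, target string) (string, error) {
	if !filepath.IsAbs(caseRoot) {
		return "", errors.New("case root must be absolute")
	}
	root := filepath.Clean(caseRoot)

	p := target
	if !filepath.IsAbs(p) {
		p = filepath.Join(root, p) // Join also cleans the result
	}
	p = filepath.Clean(p)

	rel, err := filepath.Rel(root, p)
	if err != nil {
		return "", err
	}
	// A relative path that is ".." or starts with "../" climbs out of the root.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", fmt.Errorf("path %q escapes case root %q", target, root)
	}
	return p, nil
}

func main() {
	fmt.Println(pslistPlugin(MemoryImage{Profile: "Win10x64_19041"}))
	fmt.Println(pslistPlugin(MemoryImage{Profile: "LinuxUbuntu"}))

	fmt.Println(resolveInCaseRoot("/cases/001", "mem/dump.raw"))
	fmt.Println(resolveInCaseRoot("/cases/001", "../../etc/passwd"))
}
```

The check in this sketch is purely lexical. It does not follow symlinks, so a deployment that needs that guarantee would have to resolve them (for example with `filepath.EvalSymlinks`) before comparing.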

The audit trail (where the name comes from)

Every ExecutionStep records the proposing specialist, the arrow that ran, the rationale and predicate that were supplied, which bag fields the step populated, and timing data. Findings carry an Evidence slice naming the arrows that contributed their supporting data. A judge running a demo can trace any "high severity" claim through findings -> arrows -> trace entries -> raw tool invocations.
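The trace-back described above might look like the following sketch. `ExecutionStep`, `Finding`, and `Evidence` are named in this README, but all field names and types here are assumptions, `traceFinding` is a hypothetical helper, and the predicate string is a placeholder rather than the repository's DSL syntax.

```go
package main

import (
	"fmt"
	"time"
)

// ExecutionStep mirrors the fields listed in the README; names are assumptions.
type ExecutionStep struct {
	Specialist string        // who proposed the step
	Arrow      string        // which arrow ran
	Rationale  string        // why it was proposed
	Predicate  string        // expectation supplied with the proposal
	Populated  []string      // bag fields this step filled in
	Duration   time.Duration // timing data
}

// Finding carries the names of the arrows that supplied its supporting data.
type Finding struct {
	Severity string
	Summary  string
	Evidence []string
}

// traceFinding returns, in execution order, the steps whose arrows back the finding.
func traceFinding(f Finding, trace []ExecutionStep) []ExecutionStep {
	wanted := make(map[string]bool, len(f.Evidence))
	for _, arrow := range f.Evidence {
		wanted[arrow] = true
	}
	var steps []ExecutionStep
	for _, s := range trace {
		if wanted[s.Arrow] {
			steps = append(steps, s)
		}
	}
	return steps
}

func main() {
	trace := []ExecutionStep{
		{
			Specialist: "memory",
			Arrow:      "pslist",
			Rationale:  "enumerate running processes",
			Predicate:  "placeholder predicate", // not the real DSL syntax
			Populated:  []string{"processes"},
			Duration:   120 * time.Millisecond,
		},
		{
			Specialist: "disk",
			Arrow:      "yara",
			Rationale:  "scan extracted binaries",
			Predicate:  "placeholder predicate",
			Populated:  []string{"yara_report"},
			Duration:   2 * time.Second,
		},
	}
	finding := Finding{Severity: "high", Summary: "suspicious process", Evidence: []string{"pslist"}}

	for _, s := range traceFinding(finding, trace) {
		fmt.Printf("%s proposed %s (%s), populated %v in %s\n",
			s.Specialist, s.Arrow, s.Rationale, s.Populated, s.Duration)
	}
}
```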

What's wired vs. what's stubbed

| Layer | State |
| --- | --- |
| Typed arrows (sift/) | Real |
| Subprocess seam | Real |
| MCP tool exposure | Real |
| Coordinator loop | Real |
| Predicate DSL | Real |
| Termination conditions | Real |
| Specialist agents | Stubs |
| Critic agent | Stub |
| Arrow executor | Stub |

The stubs are deterministic Go that exercise every code path in the loop. Phase 2 of the build replaces them with LLM-driven specialists and critic, plus a real executor that dispatches ArrowName to the concrete sift.* arrows.

Production roadmap

The hackathon build is plain Go. The longer-term shape:

  • Temporal would handle durable execution, retry-aware activities, workflow replay as native audit trail, and signals/queries for human-in-the-loop interaction. The coordinator's ArrowExecutor seam is the natural lift point — each arrow execution becomes a Temporal activity, the coordinator loop becomes the workflow body.
  • Learned coordinator policy. The proposal/verdict/outcome triples produced by every run are training data. A future version replaces the heuristic critic with a model trained on the trace corpus to predict the next-best arrow given a bag state.
  • Cross-case calibration. Specialists carry forward calibration scores between cases — agents whose expectations match reality more often get their proposals weighted higher in selection.

License

MIT. See LICENSE.
