AshtonVaughan/somnus

Somnus

Local-first reverse engineering orchestrator. Drives Ghidra, angr, Frida, rizin, QEMU, AFL++ and pwntools through a small-model LLM loop to triage binaries, identify likely bugs, and persist structured findings.

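No API keys. No calls to external services: the only network traffic goes to the LLM server you run yourself (localhost by default). Everything runs on your machine.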

What it does

Given a binary target, Somnus:

  1. Runs fast triage (protections, imports, dangerous calls, function inventory).
  2. Decompiles every function with Ghidra (cached per target).
  3. Pattern-matches the decompiled C for classic bug signatures (stack BOF, format string, command injection). Integer overflows are not covered yet; see Limits.
  4. Lets a local LLM reason over the compacted preview and call follow-up tools (zoom into one function, check reachability, propose a PoC shape).
  5. Persists findings and artifacts to SQLite — resumable, queryable.
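As a rough illustration of the persistence step, here is a minimal sketch of a resumable finding store built on Python's standard sqlite3 module. The table layout, the column names, the `open_store` / `save_finding` / `list_findings` helpers, and the `VERIFIED` status are all assumptions made for this example; Somnus's actual schema is not described in this README.

```python
import sqlite3

# Hypothetical schema: the real Somnus tables are not documented in this README.
SCHEMA = """
CREATE TABLE IF NOT EXISTS findings (
    id       INTEGER PRIMARY KEY,
    target   TEXT NOT NULL,
    function TEXT NOT NULL,
    kind     TEXT NOT NULL,
    status   TEXT NOT NULL,
    UNIQUE (target, function, kind)
)
"""


def open_store(path=":memory:"):
    """Open (or create) the store; pass a file path to keep it across runs."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_finding(conn, target, function, kind, status="SUSPECTED"):
    """Insert a finding, or update its status if the same finding already exists."""
    conn.execute(
        "INSERT INTO findings (target, function, kind, status) "
        "VALUES (?, ?, ?, ?) "
        "ON CONFLICT (target, function, kind) "
        "DO UPDATE SET status = excluded.status",
        (target, function, kind, status),
    )
    conn.commit()


def list_findings(conn, target):
    """Return (function, kind, status) rows for one target, oldest first."""
    rows = conn.execute(
        "SELECT function, kind, status FROM findings "
        "WHERE target = ? ORDER BY id",
        (target,),
    )
    return rows.fetchall()
```

The upsert is what makes a resumed run safe in this sketch: re-recording the same finding updates its row instead of adding a duplicate.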

Status

Working prototype. Verified end-to-end on ROP Emporium ret2win (x86_64): correctly identifies the read() overflow in pwnme, the ret2win gadget address, and computes the 40-byte overflow offset. Generalization beyond simple CTF stack-BOF challenges is not yet tested.
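The 40-byte figure can be reproduced by hand. The sketch below assumes the usual layout for this challenge, a 32-byte stack buffer followed by the 8-byte saved RBP, and uses a placeholder address rather than the real ret2win address. It shows the arithmetic and the resulting payload shape only; it is not output from Somnus.

```python
import struct

BUFFER_SIZE = 32     # assumed size of the stack buffer in pwnme
SAVED_RBP_SIZE = 8   # saved frame pointer on x86_64
OFFSET = BUFFER_SIZE + SAVED_RBP_SIZE  # bytes written before the return address

# Placeholder only: substitute the ret2win address reported for your binary.
RET2WIN_ADDR = 0x401000


def build_payload(offset, target_addr):
    """Padding up to the saved return address, then a little-endian 64-bit address."""
    return b"A" * offset + struct.pack("<Q", target_addr)


payload = build_payload(OFFSET, RET2WIN_ADDR)
```

With these assumptions `OFFSET` is 40 and the payload is 48 bytes long: 40 bytes of padding plus the 8-byte address that overwrites the saved return pointer.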

Requirements

  • Python 3.11+
  • Ollama for the local LLM (or any OpenAI-compatible server: llama.cpp, vLLM, LM Studio)
  • A model with tool-use support (qwen3:8b is the recommended default)
  • Ghidra 11+ with JDK 21 (required for decompilation)
  • Optional: rizin on PATH (faster triage), QEMU (crash verification), AFL++ (fuzzing)

Install

git clone https://github.com/AshtonVaughan/somnus.git
cd somnus
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
pip install -e ".[tui]"

Pull a local model:

ollama pull qwen3:8b

Point Somnus at Ghidra (once, permanent):

# Windows PowerShell
[System.Environment]::SetEnvironmentVariable("GHIDRA_INSTALL_DIR", "C:\tools\ghidra_12.0.4_PUBLIC", "User")

# macOS/Linux (add this line to ~/.bashrc or ~/.zshrc so it persists across sessions)
export GHIDRA_INSTALL_DIR="$HOME/tools/ghidra_12.0.4_PUBLIC"

Usage

# check which adapters are available
somnus adapters

# binary metadata
somnus info targets/mybinary

# run the full agent loop (CLI, streaming progress)
somnus hunt targets/mybinary --goal "Triage and identify any memory-safety bugs."

# run with the live TUI (three panes: transcript, findings, tool output)
somnus tui targets/mybinary

# list persisted findings
somnus findings targets/mybinary

Selecting a model

# Ollama (default)
somnus hunt targets/bin --model qwen3:8b
somnus hunt targets/bin --model qwen3-coder:30b-a3b   # if you have the VRAM

# llama.cpp / vLLM / LM Studio
somnus hunt targets/bin --provider openai \
                        --host http://localhost:8080/v1 \
                        --model my-local-model

Environment variables (see .env.example):

| Variable | Default | Purpose |
| --- | --- | --- |
| SOMNUS_LLM | ollama | Backend: ollama or openai |
| SOMNUS_OLLAMA_MODEL | qwen3:8b | Ollama model tag |
| OLLAMA_HOST | http://localhost:11434 | Ollama daemon URL |
| SOMNUS_OPENAI_BASE | http://localhost:8080/v1 | OpenAI-compat server URL |
| SOMNUS_THINK | off | Set to 1 to enable Qwen3 thinking mode |
| GHIDRA_INSTALL_DIR | unset | Path to your Ghidra install |
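One way to read these variables with the documented defaults is sketched below. The `Settings` dataclass and `load_settings` function are hypothetical names for this example, and treating anything other than `1` as "thinking off" is an assumption; the real loader in Somnus may behave differently.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    llm: str
    ollama_model: str
    ollama_host: str
    openai_base: str
    think: bool
    ghidra_install_dir: Optional[str]


def load_settings(env=None):
    """Build Settings from a mapping (os.environ by default), applying the table's defaults."""
    env = os.environ if env is None else env
    return Settings(
        llm=env.get("SOMNUS_LLM", "ollama"),
        ollama_model=env.get("SOMNUS_OLLAMA_MODEL", "qwen3:8b"),
        ollama_host=env.get("OLLAMA_HOST", "http://localhost:11434"),
        openai_base=env.get("SOMNUS_OPENAI_BASE", "http://localhost:8080/v1"),
        # Assumption: only the literal value "1" turns thinking mode on.
        think=env.get("SOMNUS_THINK", "") == "1",
        # An empty string is treated the same as unset.
        ghidra_install_dir=env.get("GHIDRA_INSTALL_DIR") or None,
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to test, for example `load_settings({"SOMNUS_LLM": "openai"})`.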

Architecture

   target -> Triage (rizin, pwntools)
          -> Decompile (Ghidra headless, cached)
          -> Pattern matcher emits SUSPECTED findings
          -> Agent loop:
               LLM picks next tool
               adapter runs, normalizes output
               finding store updated
               repeat until budget or goal met
          -> Verifier (angr reachability / crash PoC)
          -> Report

See docs/architecture.md for full diagram and design principles.
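The agent-loop stage of the pipeline can be sketched in a few lines. This is a simplified model, not the code in the repository: `run_agent_loop`, the `pick_tool` callback (standing in for the LLM), and the convention that an adapter is a callable returning a list of findings are all assumptions made for illustration.

```python
def run_agent_loop(pick_tool, adapters, findings, budget):
    """Run tools chosen by pick_tool until the goal is met or the budget runs out.

    pick_tool(findings) returns (tool_name, args), or None once the goal is met.
    adapters maps tool names to callables that return a list of new findings.
    Returns the number of steps consumed.
    """
    steps = 0
    while steps < budget:
        choice = pick_tool(findings)
        if choice is None:
            break  # the model considers the goal met
        name, args = choice
        steps += 1  # every attempt costs budget, including invalid ones
        adapter = adapters.get(name)
        if adapter is None:
            continue  # hallucinated tool name: skip it and ask again
        findings.extend(adapter(**args))  # the finding store is updated in place
    return steps
```

Charging budget for invalid tool names as well is a deliberate choice in this sketch: it guarantees termination even when the model keeps asking for tools that do not exist.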

Adapters

| Adapter | Tool | Status |
| --- | --- | --- |
| rizin | rizin / radare2 via r2pipe | triage |
| pwntools | pwntools | ELF protections + GOT/PLT |
| ghidra | Ghidra headless + post-script | decompilation + pattern matcher |
| angr | angr | CFG, symbolic reachability |
| qemu | QEMU user-mode | crash PoC verification |
| frida | Frida | runtime hooks |
| aflpp | AFL++ | coverage-guided fuzzing |

Every adapter returns (Artifact[], Finding[]). The LLM never sees raw tool output — only the normalized preview built by somnus/agent/preview.py.
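To make the adapter contract concrete, here is a minimal sketch of what an `(Artifact[], Finding[])` return could look like. Only the `Artifact` and `Finding` names and the `SUSPECTED` status come from this README; the fields, the `Adapter` protocol, the toy `ImportsAdapter`, and this `build_preview` are hypothetical and are not the implementation in somnus/agent/preview.py.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Artifact:
    kind: str   # e.g. "import-list" (hypothetical field)
    path: str   # where the raw tool output was saved (hypothetical field)


@dataclass
class Finding:
    function: str
    kind: str
    status: str = "SUSPECTED"


class Adapter(Protocol):
    name: str

    def run(self, target: str) -> tuple[list[Artifact], list[Finding]]: ...


class ImportsAdapter:
    """Toy adapter: flags dangerous names in an import list supplied up front."""

    name = "imports"
    DANGEROUS = {"gets", "strcpy", "sprintf", "system"}

    def __init__(self, imports):
        self.imports = imports

    def run(self, target):
        artifacts = [Artifact(kind="import-list", path=f"{target}.imports.txt")]
        findings = [
            Finding(function=name, kind="dangerous-import")
            for name in self.imports
            if name in self.DANGEROUS
        ]
        return artifacts, findings


def build_preview(artifacts, findings, limit=5):
    """Compact text summary: the only view of tool output the LLM would get."""
    lines = [f"{len(artifacts)} artifact(s), {len(findings)} finding(s)"]
    lines += [f"- {f.status} {f.kind}: {f.function}" for f in findings[:limit]]
    return "\n".join(lines)
```

Capping the preview at a fixed number of findings is one simple way to keep the prompt small for an 8B-class model; raw output stays on disk, referenced by the artifact path.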

Model selection

Local tool-use reliability is the bottleneck. Recommendations for a 2026 machine:

| VRAM / unified memory | Model |
| --- | --- |
| 6-8 GB | gemma4:e4b (4B, agent-trained) |
| 8-12 GB | qwen3:8b (default) |
| 16-24 GB | qwen3-coder:30b-a3b (MoE, 3B active, strong code reasoning) |
| 48 GB+ | qwen3-coder:next (80B-A3B) |

Sub-4B models emit malformed tool-call JSON and hallucinate function names; don't bother.
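Both failure modes can be caught before a tool is dispatched. The sketch below assumes a tool call arrives as a JSON object with `name` and `arguments` keys and that the tool names shown in the usage note exist; both are assumptions for illustration, not the wire format Somnus actually uses.

```python
import json


def parse_tool_call(raw, known_tools):
    """Return (name, args) for a well-formed call to a known tool.

    Raises ValueError for malformed JSON, a wrong shape, or an unknown tool name.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {exc.msg}") from exc
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("name")
    args = call.get("arguments", {})
    if not isinstance(name, str) or name not in known_tools:
        raise ValueError(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    return name, args
```

For example, with a hypothetical tool set `{"decompile_function", "check_reachability"}`, the string `{"name": "decompile_function", "arguments": {"function": "pwnme"}}` parses cleanly, while a truncated object or a call to `decompile_everything` raises `ValueError`, which the loop can feed back to the model as a retry hint.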

Limits (be realistic)

  • Verified on one textbook CTF binary. Novel-bug discovery on real codebases is unproven.
  • Pattern matcher catches gets/fgets/strcpy/sprintf/read/printf(arg)/system(arg). No UAF, no integer overflow, no logic bugs.
  • No runtime verification on Windows without WSL. Reachability via angr is the current gate.
  • Stripped binaries degrade function identification. Name-based priority lists (pwnme, ret2win, etc.) only fire with symbols present.
  • x86_64 is the primary target. ARM/MIPS work through Ghidra but address math and PoC shapes assume x86_64.
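To show what signature-based matching over decompiled C amounts to, and why it misses UAF and logic bugs, here is a minimal regex sketch covering the calls listed above. The bug labels, the `scan` function, and the rule that `printf` and `system` are flagged only when the first argument is not a string literal are assumptions for this example, not the matcher shipped in the ghidra adapter.

```python
import re

# Hypothetical mapping from call name to a coarse bug label.
SIGNATURES = {
    "gets": "stack-bof",
    "fgets": "stack-bof",
    "strcpy": "stack-bof",
    "sprintf": "stack-bof",
    "read": "stack-bof",
    "printf": "format-string",
    "system": "command-injection",
}

# Call name, opening parenthesis, then the first argument up to ',' or ')'.
# The \b prevents "read" from matching inside "fread", or "printf" inside "sprintf".
CALL_RE = re.compile(
    r"\b(gets|fgets|strcpy|sprintf|read|printf|system)\s*\(\s*([^,)]*)"
)


def scan(decompiled_c):
    """Return (line_number, call, label) for each suspicious call site."""
    hits = []
    for lineno, line in enumerate(decompiled_c.splitlines(), start=1):
        for match in CALL_RE.finditer(line):
            name, first_arg = match.group(1), match.group(2).strip()
            # A literal first argument to printf/system is not attacker-controlled.
            if name in ("printf", "system") and first_arg.startswith('"'):
                continue
            hits.append((lineno, name, SIGNATURES[name]))
    return hits
```

Hits from a scan like this are hints only: a bounded `read` or `fgets` is flagged just like an unbounded one, which is why such results are best treated as SUSPECTED until something checks buffer sizes and reachability.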

License

MIT. See LICENSE.

Contributing

Issues and PRs welcome. Keep adapters small and self-contained — the orchestrator only cares about Artifact / Finding.
