This document describes each of SecFlow's five analyzer microservices: their purpose, Docker service names, real API endpoints, tools used, and output contracts.
Each analyzer is an independent Docker microservice. The Orchestrator never imports analyzer code — it always calls via HTTP over the secflow-net Docker bridge. All containers listen on port 5000 internally.
| Analyzer | Docker service | Host port | Request format |
|---|---|---|---|
| Malware | `malware-analyzer` | 5001 | `multipart/form-data` file |
| Steganography | `steg-analyzer` | 5002 | `multipart/form-data` file (async) |
| Reconnaissance | `recon-analyzer` | 5003 | JSON `{"query": "..."}` |
| Web Vulnerability | `web-analyzer` | 5005 | JSON `{"url": "..."}` |
| Macro / Office | `macro-analyzer` | 5006 | `multipart/form-data` file |
Each service returns its own native JSON. An adapter inside the Orchestrator (orchestrator/app/adapters/<name>_adapter.py) translates that into the SecFlow contract:

```python
{
    "analyzer": str,         # "malware" | "steg" | "recon" | "web" | "macro"
    "pass": int,             # 1-indexed loop pass number
    "input": str,            # the exact value passed in
    "findings": list[dict],  # normalised finding objects
    "risk_score": float,     # aggregate risk for this pass, 0.0–10.0
    "raw_output": str        # full text output (AI reads this for IOC extraction)
}
```

Each finding object:

```python
{
    "type": str,      # finding type string (e.g. "malware_detection", "av_detection")
    "detail": str,    # human-readable description
    "severity": str,  # "info" | "low" | "medium" | "high" | "critical"
    "evidence": str,  # raw evidence, rendered intelligently in the HTML report
}
```

Analyzer services must never crash the Orchestrator. The adapter wraps all HTTP calls in try/except and returns an error-shaped finding dict if the service is unreachable.
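
The error-handling rule above could be sketched as follows. This is a minimal illustration, not the actual adapter code: the helper name `safe_call`, the injected `call` parameter, the `"error"` finding type with `"info"` severity, and a `risk_score` of 0.0 on failure are all assumptions made for the example.

```python
from typing import Any, Callable


def safe_call(analyzer: str, pass_number: int, input_value: str,
              call: Callable[[], dict[str, Any]]) -> dict[str, Any]:
    """Run `call`; on any exception return an error-shaped contract dict instead."""
    try:
        return call()
    except Exception as exc:  # assumed policy: every failure becomes a finding
        return {
            "analyzer": analyzer,
            "pass": pass_number,
            "input": input_value,
            "findings": [{
                "type": "error",     # assumed finding type for failures
                "detail": f"{analyzer} analyzer call failed",
                "severity": "info",  # assumed severity for failures
                "evidence": repr(exc),
            }],
            "risk_score": 0.0,       # assumed: a failed pass contributes no risk
            "raw_output": "",
        }


def _unreachable() -> dict[str, Any]:
    # Stand-in for an HTTP call to a service that is down.
    raise ConnectionError("service unreachable")


result = safe_call("recon", 1, "8.8.8.8", _unreachable)
print(result["findings"][0]["type"])
```

In a real adapter, `call` would wrap the `requests.post(...)` call and the translation of the native response, so that network errors and malformed responses are both caught in one place.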
Source: backend/Malware-Analyzer/
Docker service: malware-analyzer
Host port: 5001 → container port 5000
Base image: eclipse-temurin:21-jdk-jammy (JDK 21 required for Ghidra JVM)
Adapter: orchestrator/app/adapters/malware_adapter.py
| Method | Route | Timeout | Purpose |
|---|---|---|---|
| GET | `/api/malware-analyzer/health` | n/a | Health check |
| POST | `/api/malware-analyzer/file-analysis` | 60s | VirusTotal API v3 lookup |
| POST | `/api/malware-analyzer/decompile` | 180s | Ghidra decompile + `objdump -d` |
| POST | `/api/malware-analyzer/ai-summary` | n/a | Gemini narrative (internal, not used by orchestrator) |

There is no bare `POST /api/malware-analyzer/` route.

```python
import requests

# Call 1: VirusTotal threat intel
with open(path, "rb") as f:
    vt_resp = requests.post(f"{_MALWARE_BASE}/file-analysis", files={"file": f}, timeout=60)

# Call 2: Ghidra decompile (slow: JVM + full analysis)
with open(path, "rb") as f:
    decompile_resp = requests.post(f"{_MALWARE_BASE}/decompile", files={"file": f}, timeout=180)

# Both responses are merged before being handed to the adapter
raw = {"vt": vt_resp.json(), "decompile": decompile_resp.json()}
```

| Tool | Purpose |
|---|---|
| `pyghidra` + Ghidra 12.0.1 | Full decompilation, auto-analysis of all binary functions |
| `objdump -d` | Assembly-level disassembly |
| VirusTotal API v3 | 70+ AV engine detections, behavioral tags, file stats |
Supported file extensions: `exe`, `dll`, `so`, `elf`, `bin`, `o`, `out`. Any other extension returns HTTP 400.
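
The extension check described above might look like the sketch below. `ALLOWED_EXTENSIONS` mirrors the list in this section; the function name `is_supported` and the case-insensitive comparison are assumptions, and the real service may validate uploads differently.

```python
from pathlib import Path

# Extensions accepted by the malware analyzer, as listed in this section.
ALLOWED_EXTENSIONS = {"exe", "dll", "so", "elf", "bin", "o", "out"}


def is_supported(filename: str) -> bool:
    """Return True if the file's final extension is in the allow-list."""
    suffix = Path(filename).suffix  # e.g. ".exe", or "" when there is no extension
    return suffix[1:].lower() in ALLOWED_EXTENSIONS


print(is_supported("sample.EXE"))
print(is_supported("notes.txt"))
```

A request handler would return HTTP 400 whenever `is_supported` is false. Note that only the final suffix is inspected, so a versioned name such as `lib.so.1` would be rejected by this sketch.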
- `VIRUSTOTAL_API_KEY`: required for `/file-analysis`
- `GEMINI_API_KEY`: only needed for `/ai-summary` and `/diagram-generator`
| Finding type | Severity | Description |
|---|---|---|
| `malware_detection` | critical/high/info | VT detection stats |
| `av_detection` | high/medium | Individual AV engine results |
| `malware_clean` | info | No VT detections |
| `decompile_result` | medium/info | Ghidra decompiled code |
| `suspicious_string` | high | URL/IP/C2 found in decompile |
Source: backend/Steg-Analyzer/
Docker service: steg-analyzer
Host port: 5002 → container port 5000
Adapter: orchestrator/app/adapters/steg_adapter.py
| Method | Route | Purpose |
|---|---|---|
| POST | `/api/steg-analyzer/upload` | Submit file, returns `{hash}` |
| GET | `/api/steg-analyzer/status/{hash}` | Poll analysis status |
| GET | `/api/steg-analyzer/result/{hash}` | Fetch final results |
The steg analyzer is asynchronous — upload, then poll:

```python
import time

import requests

# Step 1: upload
with open(path, "rb") as f:
    r = requests.post(f"{_STEG_BASE}/upload", files={"file": f})
hash_ = r.json()["hash"]

# Step 2: poll until done
while True:
    r = requests.get(f"{_STEG_BASE}/status/{hash_}", timeout=10)
    if r.json()["status"] == "done":
        break
    time.sleep(2)

# Step 3: fetch results
r = requests.get(f"{_STEG_BASE}/result/{hash_}", timeout=30)
```

- LSB analysis (pixel-level encoding detection)
- `binwalk`: file carving and embedded file extraction
- `zsteg` (PNG): steganography detection
- `steghide` (JPEG/BMP): extraction
- `ExifTool`: metadata inspection
- `outguess`, `pngcheck`, `graphicsmagick`
The Steg Analyzer runs with PostgreSQL (steg-postgres) + Redis (steg-redis) + RQ worker (steg-worker) for async job queuing.
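
The upload-then-poll flow above loops until the status is `done`, so a job that never finishes would block the caller indefinitely. One way to bound the wait is sketched below. `poll_until_done`, its parameters, and the 120-second default are hypothetical; the status function is injected so the example runs without a live service.

```python
import time
from typing import Callable


def poll_until_done(get_status: Callable[[], str],
                    interval: float = 2.0,
                    max_wait: float = 120.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> bool:
    """Return True once get_status() reports "done", False if max_wait elapses first."""
    deadline = clock() + max_wait
    while True:
        if get_status() == "done":
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated status endpoint: "pending" twice, then "done".
_statuses = iter(["pending", "pending", "done"])
finished = poll_until_done(lambda: next(_statuses), sleep=lambda _seconds: None)
print(finished)
```

In the adapter, `get_status` would wrap the `GET /status/{hash}` request, and a `False` result would be turned into an error-shaped finding rather than an exception.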
Source: backend/Recon-Analyzer/src/
Docker service: recon-analyzer
Host port: 5003 → container port 5000
API prefix: /api/Recon-Analyzer (capital R and A — exact)
Adapter: orchestrator/app/adapters/recon_adapter.py
| Method | Route | Purpose | Input |
|---|---|---|---|
| GET | `/api/Recon-Analyzer/health` | Health check | n/a |
| POST | `/api/Recon-Analyzer/scan` | IP/domain threat intel | `{"query": "ip_or_domain"}` |
| POST | `/api/Recon-Analyzer/footprint` | Email/phone/username OSINT | `{"query": "email"}` |

The request body key is `query`, not `target`.

```python
import requests

# IP or domain
requests.post(f"{_RECON_BASE}/scan", json={"query": ip_or_domain}, timeout=60)

# Email / phone / username OSINT (when AI chains from macro IOCs)
requests.post(f"{_RECON_BASE}/footprint", json={"query": email_or_username}, timeout=60)
```

| Module | What it checks |
|---|---|
| `ipapi.py` | Country, ISP, ASN, city, timezone via ip-api.com |
| `talos.py` | Cisco Talos IP blocklist (local `talos.txt`, auto-downloaded) |
| `tor.py` | Tor exit node list (local `tor.txt`, auto-downloaded) |
| `tranco.py` | Tranco domain ranking (domains only) |
| `threatfox.py` | ThreatFox IOC lookup: malware family, confidence (domains only) |
| `xposedornot.py` | Email breach check (email footprint) |
| `phone.py` | NumVerify phone validation (phone footprint) |
| `username.py` | Sagemode multi-site username OSINT (username footprint) |
- Valid IPv4 regex → runs ipapi + talos + tor
- Valid domain regex → resolves IP, runs ipapi + talos + tor + tranco + threatfox
- Email regex → footprint: xposedornot breach check
- Phone regex → footprint: NumVerify
- Else → footprint: Sagemode username OSINT
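
The routing order listed above can be sketched as a small classifier. The regular expressions here are illustrative assumptions (the analyzer's real patterns are not shown in this document), and `classify_query` is a hypothetical name; only the order of checks and the module lists come from the list above.

```python
import ipaddress
import re

# Illustrative patterns only; the analyzer's actual regexes may differ.
_DOMAIN_RE = re.compile(r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\.)+[A-Za-z]{2,63}")
_EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
_PHONE_RE = re.compile(r"\+?[0-9][0-9\s().-]{6,18}[0-9]")


def _is_ipv4(value: str) -> bool:
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def classify_query(query: str) -> tuple[str, list[str]]:
    """Return (endpoint, modules), checking input types in the documented order."""
    q = query.strip()
    if _is_ipv4(q):
        return "scan", ["ipapi", "talos", "tor"]
    if _DOMAIN_RE.fullmatch(q):
        return "scan", ["ipapi", "talos", "tor", "tranco", "threatfox"]
    if _EMAIL_RE.fullmatch(q):
        return "footprint", ["xposedornot"]
    if _PHONE_RE.fullmatch(q):
        return "footprint", ["phone"]
    return "footprint", ["username"]


for sample in ["8.8.8.8", "evil.example.com", "a@b.co", "+14155552671", "john_doe"]:
    print(sample, classify_query(sample))
```

Because the username branch is a catch-all, the order of the checks matters: any input that fails the first four tests is treated as a username.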
- `NUMVERIFY_API_KEY`: phone validation
- `THREATFOX_API_KEY`: higher rate limit
- `ipAPI_KEY`: ip-api.com Pro
Source: backend/Web-Analyzer/
Docker service: web-analyzer
Host port: 5005 → container port 5000
Adapter: orchestrator/app/adapters/web_adapter.py
| Method | Route | Input |
|---|---|---|
| POST | `/api/web-analyzer/` | JSON `{"url": "https://..."}` |
- HTTP response analysis (status code, headers, redirect chain)
- Security header audit (CSP, HSTS, X-Frame-Options, X-Content-Type-Options, etc.)
- Technology fingerprinting
- Basic vulnerability scanning
- `GEMINI_API_KEY`: used internally for enhanced analysis (optional)
Source: backend/macro-analyzer/
Docker service: macro-analyzer
Host port: 5006 → container port 5000
Adapter: orchestrator/app/adapters/macro_adapter.py
| Method | Route | Purpose |
|---|---|---|
| GET | `/api/macro-analyzer/health` | Health check |
| POST | `/api/macro-analyzer/analyze` | Full VBA + VirusTotal analysis |
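
```python
import requests

# Send the file under its original name so the service sees the real extension
with open(path, "rb") as f:
    requests.post(f"{_MACRO_BASE}/analyze",
                  files={"file": (original_name, f)},
                  timeout=60)
```

Supported file extensions: `.doc`, `.docx`, `.xls`, `.xlsx`, `.xlsm`, `.xlsb`, `.ppt`, `.pptx`, `.pptm`, `.rtf`, `.docm`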
| Tool | Purpose |
|---|---|
| `oletools` / `olevba` | VBA macro extraction, indicator analysis, IOC extraction |
| VirusTotal API v3 | SHA-256 hash lookup → upload → poll for analysis results |

| Category | Severity | Meaning |
|---|---|---|
| `AutoExec` | critical | Macro runs automatically on open/close |
| `Suspicious` | high | Suspicious API calls (Shell, CreateObject, etc.) |
| `IOC` | high | Embedded URLs, IPs, file paths |
| `Hex String` | medium | Hex-encoded obfuscated content |
| `Base64 String` | medium | Base64-obfuscated content |
| `Dridex String` | critical | Dridex banking trojan string encoding |

| olevba `risk_level` | Risk score | Condition |
|---|---|---|
| `malicious` | 9.5 (base) | AutoExec + Suspicious flags both present |
| `suspicious` | 6.5 (base) | Suspicious or IOC or obfuscated |
| `macro_present` | 3.0 | Macros found, no suspicious flags |
| `clean` | 0.5 | No macros |

If VirusTotal confirms malicious hits, risk score is raised: 1+ detections → max(base, 7.0); 5+ → max(base, 9.5).
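
The base scores and the VirusTotal adjustment above can be combined as in the sketch below. `macro_risk_score` and `BASE_SCORES` are hypothetical names, and applying the VirusTotal floor to every `risk_level` (including `macro_present` and `clean`) is an assumption; the document only states the thresholds.

```python
# Base scores per olevba risk_level, taken from the table above.
BASE_SCORES = {"malicious": 9.5, "suspicious": 6.5, "macro_present": 3.0, "clean": 0.5}


def macro_risk_score(risk_level: str, vt_detections: int = 0) -> float:
    """Raise the base score to a floor of 7.0 (1+ detections) or 9.5 (5+ detections)."""
    score = BASE_SCORES[risk_level]
    if vt_detections >= 5:
        score = max(score, 9.5)
    elif vt_detections >= 1:
        score = max(score, 7.0)
    return score


print(macro_risk_score("suspicious", vt_detections=0))
print(macro_risk_score("suspicious", vt_detections=1))
print(macro_risk_score("suspicious", vt_detections=5))
```

Using `max` means a VirusTotal hit can only raise the score, so a `malicious` verdict from olevba stays at 9.5 even with a single detection.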
| Finding type | Description |
|---|---|
| `macro_malicious` / `macro_suspicious` / `macro_present` | Overall VBA verdict |
| `macro_indicator_autoexec` | AutoExec indicators |
| `macro_indicator_suspicious` | Suspicious API calls |
| `macro_indicator_ioc` | Extracted IOCs |
| `macro_ioc` | IOC chip list (enables AI to chain to recon/web) |
| `macro_source` | Full VBA source (collapsible in report) |
| `macro_xlm` | Excel 4 (XLM) deobfuscated macros |
| `malware_detection` | VirusTotal stats table |
| `av_detection` | Per-engine AV detection (up to 10) |
| `payload_downloaded` | Always shown when file was fetched from a URL |

Each service returns its own native JSON. An adapter inside the Orchestrator (orchestrator/app/adapters/<name>_adapter.py) translates that into the SecFlow contract:

```python
{
    "analyzer": str,         # "malware" | "steg" | "recon" | "url" | "web"
    "pass": int,             # 1-indexed loop pass number
    "input": str,            # the exact value passed in
    "findings": list[dict],  # see per-analyzer finding format below
    "risk_score": float,     # aggregate risk for this pass, 0.0–10.0
    "raw_output": str        # concatenated raw tool output (for AI consumption)
}
```

Analyzer services must never crash the Orchestrator. The adapter must wrap the HTTP call in try/except and return an error-shaped finding dict if the service is unreachable or returns a non-200 response.
Service: backend/malware-analyzer/ — POST http://malware-analyzer:5001/api/malware-analyzer/
Adapter: orchestrator/app/adapters/malware_adapter.py
Detect malicious characteristics in executables, PE binaries, and extracted binary payloads.
- File path to: `.exe`, `.dll`, `.bin`, `.elf`, or an extracted payload from another analyzer pass
| Technique | Description |
|---|---|
| File hashing | Compute MD5, SHA1, SHA256 |
| YARA scanning | Match bundled YARA rule set |
| PE header analysis | Parse PE sections, imports, exports, timestamps |
| String extraction | Extract printable strings; flag suspicious patterns (URLs, IPs, registry keys, API names) |
| Entropy analysis | High entropy sections → possible packing/encryption |
| (Optional) VirusTotal | Hash lookup via VT API if key is configured |
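
The entropy analysis row above relies on Shannon entropy measured in bits per byte. The sketch below shows the calculation on raw bytes. It is an illustration of the technique, not the analyzer's code, and the 7.0 threshold is a commonly used heuristic assumed here for the example.

```python
import math
from collections import Counter

# Assumed heuristic: sections above this value are treated as possibly packed or encrypted.
HIGH_ENTROPY_THRESHOLD = 7.0


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


uniform = bytes(range(256))  # every byte value exactly once: maximum entropy
constant = b"\x00" * 100     # a single repeated value: minimum entropy
print(shannon_entropy(uniform) > HIGH_ENTROPY_THRESHOLD)
print(shannon_entropy(constant) > HIGH_ENTROPY_THRESHOLD)
```

In practice the function would be applied per PE section rather than to the whole file, since a packed section can be hidden by low-entropy padding elsewhere.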

```text
{
  "type": "signature_match" | "suspicious_string" | "pe_metadata" | "hash" | "entropy" | "error",
  "detail": str,     # human-readable description
  "severity": "info" | "low" | "medium" | "high" | "critical",
  "evidence": str    # raw evidence snippet
}
```

```json
[
  { "type": "hash", "detail": "SHA256: abc123...", "severity": "info", "evidence": "" },
  { "type": "signature_match", "detail": "YARA rule: Trojan.GenericKDZ matched", "severity": "critical", "evidence": "offset 0x200" },
  { "type": "suspicious_string", "detail": "HTTP callout found", "severity": "high", "evidence": "http://192.168.1.100/beacon" }
]
```

- `yara-python`: YARA rule matching
- `pefile`: PE binary parsing
- `hashlib`: File hashing (stdlib)
- `strings` (system) or regex: String extraction
Service: backend/steg-analyzer/ — POST http://steg-analyzer:5002/api/steg-analyzer/
Adapter: orchestrator/app/adapters/steg_adapter.py
Detect and extract hidden data embedded within image files using steganographic or watermarking techniques.
- File path to: `.png`, `.jpg`, `.jpeg`, `.bmp`, `.gif`, `.tiff`
| Technique | Description |
|---|---|
| LSB analysis | Detect least-significant-bit encoding in pixel data |
| Metadata inspection | ExifTool — check for hidden data in EXIF/IPTC/XMP |
| Embedded file extraction | binwalk — detect and extract appended/embedded files |
| Tool-based detection | zsteg (PNG), stegdetect (JPEG), steghide (JPEG/BMP) |
| Strings scan | Run strings on the image binary, flag suspicious patterns |
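
To make the LSB analysis row above concrete, the sketch below shows what least-significant-bit encoding does to a sequence of pixel channel values, and how the hidden bytes are read back. It operates on plain bytes as a stand-in for pixel data and is only an illustration; the function names are hypothetical, and the analyzer's own detection logic is not described in this document.

```python
def embed_lsb(carrier: bytes, message: bytes) -> bytes:
    """Hide message bits (most significant first) in the lowest bit of each carrier byte."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for message")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the lowest bit, then set it to the message bit
    return bytes(out)


def extract_lsb(carrier: bytes, length: int) -> bytes:
    """Rebuild `length` bytes from the lowest bit of the first length * 8 carrier bytes."""
    result = bytearray()
    for i in range(length):
        value = 0
        for byte in carrier[i * 8:(i + 1) * 8]:
            value = (value << 1) | (byte & 1)
        result.append(value)
    return bytes(result)


pixels = bytes(range(64))  # stand-in for raw pixel channel values
stego = embed_lsb(pixels, b"hi")
print(extract_lsb(stego, 2))
```

Each carrier value changes by at most 1, which is why LSB payloads are invisible to the eye and have to be found by inspecting the bit plane directly.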

```text
{
  "type": "embedded_file" | "lsb_data" | "metadata_anomaly" | "suspicious_string" | "error",
  "detail": str,
  "severity": "low" | "medium" | "high" | "critical",
  "evidence": str,
  "extracted_path": str | None   # path to extracted file if applicable
}
```

```json
[
  { "type": "embedded_file", "detail": "binwalk found embedded PE binary", "severity": "critical", "evidence": "offset 0x8200", "extracted_path": "/tmp/secflow/extracted/steg_payload.exe" },
  { "type": "metadata_anomaly", "detail": "EXIF GPS data present", "severity": "low", "evidence": "GPS: 37.7749,-122.4194", "extracted_path": null }
]
```

- `binwalk` (system): File carving, embedded file extraction
- `zsteg` (system/gem): PNG steg detection
- `stegdetect` (system): JPEG steg detection
- `steghide` (system): Steghide extraction
- `pyexiftool` or `exiftool` (system): Metadata inspection
- `Pillow`: Image loading and pixel-level analysis
Service: backend/recon-analyzer/ — POST http://recon-analyzer:5003/api/recon-analyzer/
Adapter: orchestrator/app/adapters/recon_adapter.py
Gather OSINT and infrastructure intelligence on IPs, domains, and hostnames.
- IP address string (e.g., `"192.168.1.100"`)
- Domain or hostname string (e.g., `"evil.example.com"`)
| Technique | Description |
|---|---|
| WHOIS lookup | Registrant, registrar, creation/expiry dates |
| DNS records | A, AAAA, MX, NS, TXT, CNAME records |
| Reverse DNS | PTR record lookup |
| Port scanning | Top ports scan via nmap |
| Geolocation | Country, ASN, ISP |
| Threat intel | Shodan lookup (optional), AbuseIPDB (optional) |
| Certificate info | TLS cert subjects and SANs (for domains) |

```text
{
  "type": "whois" | "dns" | "port" | "geolocation" | "threat_intel" | "cert" | "error",
  "detail": str,
  "severity": "info" | "low" | "medium" | "high" | "critical",
  "evidence": str
}
```

```json
[
  { "type": "port", "detail": "Open ports detected", "severity": "medium", "evidence": "22/tcp open ssh, 80/tcp open http, 443/tcp open https" },
  { "type": "threat_intel", "detail": "IP found in Shodan with malware tag", "severity": "critical", "evidence": "tags: malware, c2" },
  { "type": "whois", "detail": "Domain registered 2 days ago", "severity": "high", "evidence": "created: 2026-03-04" }
]
```

- `python-whois`: WHOIS lookups
- `dnspython`: DNS queries
- `nmap` (system) + `python-nmap`: Port scanning
- `shodan`: Shodan API (optional; requires `SHODAN_API_KEY`)
- `requests`: AbuseIPDB / threat intel APIs
- `socket`: Reverse DNS
Service: backend/web-analyzer/ — POST http://web-analyzer:5005/api/web-analyzer/
Adapter: orchestrator/app/adapters/web_adapter.py
Analyze URLs and web endpoints for vulnerabilities, misconfigurations, and security weaknesses.
- Full URL string (e.g., `"http://192.168.1.100/beacon"`, `"https://example.com/login"`)
| Technique | Description |
|---|---|
| HTTP response analysis | Status code, response headers, redirect chain |
| Security header audit | Check for missing CSP, HSTS, X-Frame-Options, etc. |
| Technology fingerprinting | Identify server, framework, CMS versions |
| Cookie security | Inspect Secure, HttpOnly, SameSite flags |
| Basic vuln scanning | nuclei (optional), common path probing |
| TLS/SSL inspection | Certificate validity, weak ciphers |
| URL reputation | VirusTotal URL scan (optional) |
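
The security header audit row above amounts to comparing the response headers against an expected set. A minimal sketch follows, assuming the header list and severities in `EXPECTED_HEADERS`; only the medium severity for a missing Content-Security-Policy is taken from this document's example output, and the rest are illustrative.

```python
# Expected headers and the severity assigned when one is missing (assumed values).
EXPECTED_HEADERS = {
    "Content-Security-Policy": "medium",
    "Strict-Transport-Security": "medium",
    "X-Frame-Options": "low",
    "X-Content-Type-Options": "low",
}


def audit_security_headers(headers: dict[str, str]) -> list[dict[str, str]]:
    """Return one `missing_header` finding per expected header that is absent."""
    present = {name.lower() for name in headers}  # header names are case-insensitive
    findings = []
    for name, severity in EXPECTED_HEADERS.items():
        if name.lower() not in present:
            findings.append({
                "type": "missing_header",
                "detail": f"{name} header absent",
                "severity": severity,
                "evidence": "",
            })
    return findings


response_headers = {"Server": "Apache/2.4.49", "x-frame-options": "DENY"}
for finding in audit_security_headers(response_headers):
    print(finding["detail"])
```

With the `requests` library, `response.headers` can be passed in directly, since the function only iterates over the header names.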

```text
{
  "type": "missing_header" | "vuln" | "tech_fingerprint" | "tls_issue" | "redirect" | "cookie" | "error",
  "detail": str,
  "severity": "info" | "low" | "medium" | "high" | "critical",
  "evidence": str
}
```

```json
[
  { "type": "missing_header", "detail": "Content-Security-Policy header absent", "severity": "medium", "evidence": "" },
  { "type": "tech_fingerprint", "detail": "Apache 2.4.49 detected (known CVE)", "severity": "critical", "evidence": "Server: Apache/2.4.49" },
  { "type": "tls_issue", "detail": "TLS 1.0 supported (deprecated)", "severity": "high", "evidence": "TLSv1.0 cipher accepted" }
]
```

- `requests`: HTTP requests and response analysis
- `Wappalyzer` (or `builtwith`): Technology fingerprinting
- `nuclei` (system, optional): Template-based vuln scanning
- `sslyze` or `ssl` (stdlib): TLS/SSL analysis
- `urllib` (stdlib): URL parsing
Each analyzer computes a risk_score (0.0–10.0) for the pass based on the severity distribution of its findings:
| Severity | Weight |
|---|---|
| `critical` | 4.0 |
| `high` | 2.5 |
| `medium` | 1.0 |
| `low` | 0.3 |
| `info` | 0.0 |

Score = min(10.0, sum of severity weights)
The Report Generator computes an overall risk score as the maximum risk score observed across all passes.
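
The scoring rule above can be written out as follows. The function names are hypothetical, and giving unknown or missing severities a weight of 0.0 is an assumption; the weights, the cap of 10.0, and the maximum across passes come from this section.

```python
# Severity weights from the table above.
SEVERITY_WEIGHTS = {"critical": 4.0, "high": 2.5, "medium": 1.0, "low": 0.3, "info": 0.0}


def pass_risk_score(findings: list[dict]) -> float:
    """Sum the severity weights of all findings in one pass, capped at 10.0."""
    total = sum(SEVERITY_WEIGHTS.get(f.get("severity", "info"), 0.0) for f in findings)
    return min(10.0, total)


def overall_risk_score(pass_scores: list[float]) -> float:
    """Overall risk is the highest score observed across all passes."""
    return max(pass_scores, default=0.0)


findings = [{"severity": "critical"}, {"severity": "high"}, {"severity": "medium"}]
print(pass_risk_score(findings))               # 4.0 + 2.5 + 1.0 = 7.5
print(overall_risk_score([3.0, 7.5, 0.5]))     # highest pass score: 7.5
```

Because the sum is capped, three critical findings already saturate the scale (12.0 is clamped to 10.0), so additional findings in the same pass no longer change the score.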
- Create a new Docker service directory under `backend/<name>-analyzer/` with its own `Dockerfile` and `requirements.txt`.
- Add the service to `backend/compose.yml` on the `secflow-net` network.
- Create `orchestrator/app/adapters/<name>_adapter.py` to translate the service's native response into the SecFlow contract.
- Add the analyzer name to the routing rules in `orchestrator/app/classifier/rules.py`.
- Add the analyzer name to the available tools list in `orchestrator/app/ai/engine.py`.
- Document the service and its endpoint in this file.