aka Legacy Neural Interface, Legacy Archaeologist
Black Box Whisperer is an AI-powered platform designed to understand, secure, and operationalize opaque legacy systems without modifying them.
It targets mission-critical legacy software that still runs core business operations but has become partially or totally incomprehensible over time due to missing documentation, loss of domain experts, or accumulated implicit behavior.
The goal is not to rewrite or replace these systems, but to reconstruct a usable mental model of how they actually work, based on observable signals. :contentReference[oaicite:0]{index=0}
Many organizations rely on legacy systems that combine:
- old and heterogeneous codebases (COBOL, C, C++, sometimes assembly)
- missing or outdated documentation
- implicit and undocumented dependencies
- poorly understood runtime behavior
- extremely high operational and business risk
These systems often work reliably, but human understanding has been lost.
The real risk is not instability, but invisibility. :contentReference[oaicite:1]{index=1}
Black Box Whisperer ingests non-intrusive, observable artifacts from a legacy system and reconstructs a coherent and exploitable understanding of its behavior.
- legacy source code (partial or complete)
- application and system logs
- existing execution traces
- memory dumps
- observed inputs, outputs, and side effects
No refactoring.
No instrumentation.
No code changes. :contentReference[oaicite:2]{index=2}
Black Box Whisperer is built as a language-agnostic analysis engine surrounded by pluggable adapters.
- each legacy language has unique semantics (COBOL, C, C++, etc.)
- log and trace formats vary widely
- the core engine must remain stable, testable, and reusable
- the core engine is independent of COBOL
- COBOL is the first input adapter (V1)
- additional languages can be added later without rewriting the engine :contentReference[oaicite:3]{index=3}
The first implementation focuses on COBOL systems as the initial vertical slice.
- extraction of symbols (programs, sections, paragraphs)
- detection of calls (
CALL,PERFORM, etc.) - mapping of COPYBOOKs and file dependencies
- identification of input/output operations and database access
Even if COBOL analysis is partial at first, the engine remains valuable because:
- it correlates code structure with runtime behavior
- it produces actionable maps and risk indicators :contentReference[oaicite:4]{index=4}
The core engine produces a verifiable and reproducible foundation:
- normalization of logs and events
- correlation using IDs, time windows, and heuristics
- construction of call graphs and dependency graphs
- hotspot detection and metrics
- evidence-based risk scoring
All of this works without AI. :contentReference[oaicite:5]{index=5}
Black Box Whisperer is implemented as a local, modular analysis platform with a clear separation of concerns between UI, engine, and AI reasoning.
┌──────────────────┐
│ Flutter UI │
│ (Desktop) │
└────────┬─────────┘
│
│ HTTP / JSON (REST API)
│
▼
┌──────────────────────────────────────┐
│ Analysis Engine │
│ Node.js + TypeScript + Fastify │
├──────────────────────────────────────┤
│ • Parsing & Normalization │
│ • Deterministic Analysis Pipelines │
│ • Graph & Correlation Engine │
│ • Risk Scoring & Metrics │
└────────┬─────────────────────────────┘
│
│ Structured Artifacts
│ (JSON, Graphs, Metrics)
│
▼
┌──────────────────────────────────────┐
│ AI Reasoning Layer │
│ Gemini 3 Flash / Pro │
├──────────────────────────────────────┤
│ • Semantic Reconstruction │
│ • Hypothesis Generation │
│ • Risk & Scenario Analysis │
│ • Runbook & Spec Enrichment │
└──────────────────────────────────────┘
- Technology: Flutter (desktop-first: macOS, Linux, Windows)
- Responsibilities:
- project selection and configuration
- triggering analyses
- visual exploration of graphs, risks, and reports
- comparison between analysis runs
- Rendering:
- diagrams via embedded web views (Mermaid, D3, Cytoscape)
- structured reports and evidence navigation
- Communication:
- consumes a local REST API exposed by the engine
- no direct access to AI or parsing logic
Flutter is intentionally used only for presentation and interaction, not for analysis or parsing.
- Runtime: Node.js
- Language: TypeScript (strict mode)
- Server: Fastify (or Express for simpler setups)
- API: REST / JSON over localhost
- Core Responsibilities:
- parsing and normalization of inputs
- deterministic analysis pipelines
- correlation and graph construction
- risk scoring and metrics
- generation of structured artifacts (JSON, Mermaid, Markdown)
The engine can run:
- as a local background service
- headless (CLI or CI mode)
- independently of the UI
This guarantees reproducibility, testability, and automation.
The AI layer operates strictly on top of deterministic artifacts produced by the engine.
- Gemini 3 Flash
- default model
- fast, cost-efficient
- used for low to medium complexity reasoning
- Gemini 3 Pro
- automatically selected when:
- confidence score is below threshold
- task complexity is classified as high
- cross-domain or deep causal reasoning is required
- automatically selected when:
- semantic reconstruction of business logic
- hypothesis generation with explicit evidence references
- failure and cascade scenario reasoning
- runbook enrichment
- specification and test generation
The AI never analyzes raw code or logs directly.
It reasons only on proven, structured, deterministic outputs.
Black Box Whisperer produces structured, actionable deliverables.
- reconstructed business logic (hypotheses + evidence)
- functional and technical diagrams
- detailed data flows
- internal and external dependencies
- classified operational and technical risks
- critical technical debt zones
- probable failure scenarios
- identified breaking points
- human-readable operational runbooks
- OpenAPI specifications derived from observed behavior
- behavior-based non-regression test suites
- living, maintainable documentation :contentReference[oaicite:6]{index=6}
We do not rewrite.
We understand.
We secure.
We encapsulate.
Legacy systems remain untouched, but become understandable, controllable, and integrable. :contentReference[oaicite:7]{index=7}
- auditing critical systems before mergers or acquisitions
- securing legacy applications without operational risk
- accelerating onboarding of new engineering teams
- preparing safe, progressive modernization
- API encapsulation of COBOL systems
- reducing key-person dependency :contentReference[oaicite:8]{index=8}
- Stage IDs (e.g.,
cobol.index,graph.call) are public contracts. - They are recorded in the
artifact_manifest.json. - Any change to a Stage ID is considered a breaking change and invalidates previous runs.
- Cross-run comparisons and AI reasoning rely on these IDs being immutable.
Important
The AI layer NEVER sees raw source code.
- The AI consumes only the structured, deterministic artifacts listed in
artifact_manifest.json. - Direct access to source files or raw logs by the AI model is forbidden by architecture.
- This ensures all insights are traceable to a proved, static evidence artifact produced by the deterministic engine.
Concept / prototype design with a clear execution path:
- V1: deterministic core + COBOL adapter + Flutter UI
- V2: additional adapters (C, C++, Java) and trace connectors
- V3: industrialization (security, reporting, multi-tenant)
To be defined.