JIDRA = Java Integrated Graph Reduction & Analysis
JIDRA is a structured context backend that reduces LLM input tokens by 73-95% for code-native queries by giving Claude a pre-analyzed call graph instead of raw source files.
Real Claude Code sessions, same question, same model (claude-sonnet-4-6 1M):
Without JIDRA: 833,782 input tokens ($0.2298) — Claude read files manually
With JIDRA: 227,095 input tokens ($0.2275) — Claude used graph tools
Reduction: 72.8% fewer input tokens, same answer quality
At Opus pricing ($15/M input):
Without JIDRA: $12.51/query
With JIDRA: $3.41/query
Savings: $9.10/query → $4,550/year at 500 queries
This project is intentionally focused and graph-driven.
- Index once → Get a deterministic, validated call graph (AST + Spring Actuator)
- Reduce noise → Remove 71-78% phantom edges via runtime bean validation
- Generate context → 73-95% smaller prompt-ready context for Claude/Codex/Gemini
- Trace execution → See likely business flow with uncertainty markers
- Reduce LLM cost → Proven token reduction on code-native workflows (measured on real projects)
Real Proof — Claude Code sessions, same question, same model (claude-sonnet-4-6):
| Session | Input tokens | Output tokens | Cost |
|---|---|---|---|
| Without JIDRA (filesystem tools only) | 833,782 | 5,161 | $0.2298 |
| With JIDRA (MCP graph tools) | 227,095 | 1,784 | $0.2275 |
| Reduction | 72.8% | 65.4% | ~same |
Same cost today at Sonnet pricing because output tokens dominate — but 606k fewer input tokens per query. At Opus pricing ($15/M input vs $3/M) that gap is $9.09 saved per query.
- Indexes Java source into a deterministic call graph
- Validates with Spring Actuator to remove phantom edges
- Generates noise-free context (87-95% smaller)
- Traces method/route execution with uncertainty markers
- Exports as JSON, MCP tools, or interactive HTML
- Integrates with Claude/Codex via MCP
- Reduces LLM token costs by 87-95% (proven)
- ❌ Autonomous agent loops - Claude already does this; we provide context
- ❌ Multi-service distributed reasoning - Requires service mesh, not code analysis
- ❌ Interactive debugging sessions - Single-shot context generation (not loops)
- ❌ Config-driven behavior analysis - YAML/JSON parsing planned for v2.0
- ❌ Full semantic Java correctness - AST + Actuator validation is best-effort
Bottom line: JIDRA is infrastructure FOR agents, not a replacement agent.
jidra/
├── pyproject.toml
├── requirements.txt
├── README.md
└── jidra/
├── __init__.py
├── cli.py
├── config.yaml
├── llm_client.py
├── models.py
├── graph_io.py
├── selector.py
├── trace_engine.py
├── context_builder.py
├── extractor.py
├── exporter.py
├── filters.py
└── cache.py
This project is released under the MIT License (see LICENSE).
From project root:
cd scripts/jidra
pip install -e .If you use the local venv:
./venv/bin/pip install -e .Some features (like error-doc choosing the first "project" stack frame as an anchor) can use
package prefixes to distinguish your code from third-party libraries.
Set a comma-separated list:
git clone https://github.com/akhilsinghcodes/jidra.git cd jidra
If unset, JIDRA treats any package as project code for anchoring.
python -m jidra.cli index \
--codebase /path/to/java/repo \
--output /tmp/graph.jsonlWhen output is a directory, JIDRA writes:
graph.jsonl(main)graph_test.jsonl(test)
python -m jidra.cli trace \
--graph /tmp/graph.jsonl \
--method com.example.Controller.searchpython -m jidra.cli context \
--graph /tmp/graph.jsonl \
--method com.example.Controller.searchpython -m jidra.cli prompt \
--graph /tmp/graph.jsonl \
--method com.example.Controller.search \
--target codexpython -m jidra.cli diagnose \
--graph /tmp/graph.jsonl \
--method com.example.Controller.search \
--target codex \
--llm-profile localFor trace, context, trace-route, prompt, diagnose:
--graphprovided: used directly--graphomitted: selected by--graph-type(maindefault)main->jidra/output/graph.jsonltest->jidra/output/graph_test.jsonl
Supported method selectors:
- method id
- full signature
- full class + method (
com.example.Class.method) - short class + method (
Class.method) - bare method name (if unique)
Ambiguous selector output includes candidate ids you can use directly.
Purpose: validate static call graph against a running Spring Boot app's actuator beans, filtering out phantom edges to uninstantiated classes.
jidra validate \
[--graph <path>] \
[--graph-type main|test] \
[--actuator-url <url>] \
[--codebase <path>] \
[--port 8080] \
[--timeout 120] \
[--output <path>] \
[--report <path>] \
[--no-filter]Behavior:
--actuator-urlprovided: connect directly to running app (e.g., http://localhost:8080)--codebaseprovided: auto-build Docker image, run container, query actuator, cleanup- Must provide one of
--actuator-urlor--codebase - Fetches
/actuator/beansto extract confirmed bean class names - Filters graph: removes edges to non-bean classes, removes CallSites pointing to non-beans
- Upgrades unresolved CallSites where receiver type matches a confirmed bean
- Outputs:
graph_validated.jsonl(filtered graph) + JSON report
Filtering logic:
- Extract confirmed bean class names from actuator response
- Remove
ResolvedCallEdgewhere callee method's class is not a confirmed bean - Remove
CallSiterecords where all resolved_candidates point to non-beans - Upgrade
CallSitewith statusunresolved_receiverif receiver type matches a bean - Keep all class/method nodes for context (not all classes are beans)
Example: Direct URL
jidra validate \
--actuator-url http://localhost:8080 \
--graph /path/to/graph.jsonl \
--output /path/to/output \
--report /path/to/report.jsonExample: Auto Docker build+run (always does clean Java build)
jidra validate \
--codebase /path/to/java/repo \
--graph /path/to/graph.jsonl \
--port 8080 \
--timeout 120jidra automatically:
- Detects
./gradleworpom.xml(gradle or maven) - Runs
./gradlew clean build -x testor./mvnw clean package -DskipTests - Builds Docker image
- Runs container and queries actuator
- Cleans up Docker resources
To skip the Java build (if you've already done gradle build manually):
jidra validate --codebase /path/to/java/repo --graph /path/to/graph.jsonl --skip-buildExample: Debug mode (report removals, don't filter)
jidra validate \
--actuator-url http://localhost:8080 \
--graph /path/to/graph.jsonl \
--no-filter \
--report /tmp/validation_debug.jsonReport output shape:
{
"total_classes": 412,
"confirmed_beans": 87,
"unconfirmed_classes_sample": ["com.example.Dto", ...],
"edges_before": 1843,
"edges_after": 1201,
"edges_removed": 642,
"callsites_upgraded": 14,
"removed_edges_sample": [
{"caller": "...", "callee": "..."}
]
}Purpose: generate deterministic flow investigation markdown from indexed graph data (no LLM calls).
jidra flow-doc \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
--output <markdown-path> \
[--depth 4] \
[--top-n 8] \
[--max-subflows 8] \
[--mind-map] \
[--max-nodes 200] \
[--include-details] \
[--include-utility]Behavior:
- Normal mode (no
--mind-map): prioritized flow slices usingtop_nandmax_subflows. --mind-mapmode: recursive resolved-edge traversal usingdepth + max_nodes; it does not usetop_n/max_subflowsfor traversal.--include-details: in--mind-mapmode, appends legacy detailed expanded sections that still use prioritized slicing (top_n/max_subflows).- Output is deterministic for the same graph + method + flags.
Examples:
python -m jidra.cli flow-doc \
--method SearchController.search \
--output flow_docs/verify_SearchController_search.md \
--depth 10 \
--top-n 10 \
--max-subflows 10 \
--show-agentspython -m jidra.cli flow-doc \
--method SearchController.search \
--output flow_docs/mindmap_SearchController_search.md \
--mind-map \
--depth 6 \
--max-nodes 120Purpose: generate deterministic error investigation markdown from a Java stack trace text file and indexed graph.
jidra error-doc \
--stack-trace <stack-trace.txt> \
--output <markdown-path> \
[--graph <path>] \
[--graph-type main|test] \
[--depth 6] \
[--max-nodes 200] \
[--mind-map]Stack frame parsing:
- Parses lines in format:
at package.Class.method(File.java:123).
Frame-to-method matching:
- class full name
- method name
- file name
- line in method
[start_line, end_line]
Match semantics:
matched: exactly one graph method candidate.ambiguous: multiple candidates (reported as ambiguity).unmatched: no candidate.
Anchor + focused map:
- primary failure anchor: first matched/ambiguous project frame.
- focused flow map: generated via deterministic
flow-docmind-map traversal around anchor. - upstream/downstream behavior:
- downstream-focused when anchor has meaningful downstream callees.
- upstream-focused fallback when downstream is weak.
Examples:
python -m jidra.cli error-doc \
--stack-trace examples/error_1.txt \
--output flow_docs/error_doc_verify_clean.md \
--mind-map \
--depth 6 \
--max-nodes 80- Static analysis only; runtime dispatch is not guaranteed.
- Unresolved calls may remain in outputs.
- External library frames/methods may be unmatched.
- Graph quality directly affects output quality.
- No runtime correctness claims; output is investigation guidance.
## Suggested Debug Locations
| priority | location | reason |
|---:|---|---|
| 1 | `com.example.app.health.HealthIndicator#doHealthCheck(Health.Builder)` | failing project frame |
| 2 | `org.opensearch.client.opensearch.cluster.OpenSearchClusterClient#health:360` | caller frame above failure |
| 3 | `this.client.cluster().health` | unresolved external call near failure |jidra index --codebase <path> --output <path-or-dir>Builds graph JSONL from Java source using tree-sitter parser pipeline.
jidra trace \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
[--max-depth 5] \
[--business-only] \
[--output <file-or-dir>]--business-onlyfilters support/metrics/logging from flow output- root node is always preserved
jidra context \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
[--max-chars 12000] \
[--max-tokens <int>] \
[--business-only] \
[--output <file-or-dir>]Includes:
- method signature/source
- endpoint metadata
- resolved callee summary
- unresolved call summary
Context output is deduped/grouped for prompt readiness.
jidra trace-route \
[--graph <path>] \
[--graph-type main|test] \
--route <path> \
[--max-depth 5] \
[--output <file-or-dir>]jidra prompt \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
[--max-chars 12000] \
[--max-tokens <int>] \
[--business-only|--no-business-only] \
[--target claude|codex|generic] \
[--output <file-or-dir>]Default: --business-only is enabled.
jidra diagnose \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
[--target claude|codex|generic] \
[--model <model>] \
[--max-chars 12000] \
[--max-tokens <int>] \
[--business-only|--no-business-only] \
[--llm-profile local|enterprise] \
[--config <path-to-config.yaml>] \
[--show-prompt] \
[--quiet] \
[--output <file-or-dir>]Behavior:
- No
--output+ interactive TTY + not--quiet: ANSI-readable report - No
--output+ non-TTY or--quiet: JSON printed - With
--output: JSON written to file --show-prompt: includes prompt text in result JSON--max-chars: controls method context/source size sent into prompt construction--max-tokens: overrides model output token limit for this run (when omitted, config profile default is used)
When --output is a directory:
- trace:
trace_<graph_type>_<method>.json - trace + business-only:
trace_business_<graph_type>_<method>.json - context:
context_<graph_type>_<method>.json - context + business-only:
context_business_<graph_type>_<method>.json - trace-route:
trace_route_<graph_type>_<route_or_entry>.json - prompt:
prompt_<target>_<graph_type>_<method>.txt - diagnose:
diagnose_<target>_<graph_type>_<method>.json
Names are normalized to lowercase snake-style safe parts.
JIDRA uses jidra/config.yaml.
Example:
llm:
provider: litellm
profile: local
profiles:
local:
api_base: "http://localhost:4000"
api_key_env: "LITELLM_PROXY_API_KEY"
default_model: "ollama/gemma4:e4b"
timeout_seconds: 120
temperature: 0.2
max_tokens: 1200
enterprise:
api_base: "https://your-enterprise-litellm.example.com"
api_key_env: "ENTERPRISE_LITELLM_API_KEY"
default_model: "gpt-4o-mini"
timeout_seconds: 120
temperature: 0.2
max_tokens: 2000Rules:
- Default profile comes from
llm.profile - CLI override:
--llm-profile - If
api_key_envis set, env var is read - Missing config falls back to safe local defaults
diagnose returns JSON with:
{
"method": "...",
"analysis": "...",
"llm": {
"provider": "litellm",
"profile": "local",
"model": "...",
"usage": {
"input_tokens": 0,
"output_tokens": 0,
"total_tokens": 0,
"reasoning_tokens": 0
},
"latency_seconds": 0.0,
"limits": {
"max_chars": 12000,
"max_tokens": null
}
},
"context_summary": {
"business_flow_count": 0,
"unresolved_count": 0
}
}If provider usage is unavailable, token counts are estimated and:
"estimated": trueis added under llm.usage.
--max-chars(context, prompt, diagnose):- default
12000 - passed directly to context building to constrain context payload size
- default
--max-tokens(context, prompt, diagnose):- optional CLI override
- primarily used by
diagnoseto cap LLM output tokens - if omitted, profile default from
jidra/config.yamlis used
Likely LLM connectivity issue:
- verify LiteLLM endpoint in config
- verify API key/env key
- verify network access to endpoint
Use a stronger selector:
- class+method or exact method id from ambiguity output
No endpoint matched that route in graph. Validate route annotations and graph source set.
Check Python/venv and package index/network availability.
JIDRA includes a cost calculator that measures actual token savings from your real codebase — not estimates.
# Graph-wide averages
jidra cost-roi --model claude-opus-4-7 --queries 1000
# Specific method — reads real source files, no API calls
jidra cost-roi --method SearchController.search --model claude-opus-4-7 --queries 1000
# Specific method — real Claude API calls, exact token counts (requires ANTHROPIC_API_KEY)
jidra cost-roi \
--method SearchController.search \
--codebase /path/to/java-repo \
--model claude-opus-4-7 \
--queries 1000 \
--offline falseSee COST_ROI_CALCULATOR.md for full usage and how the measurement works.
Two scripts in validations/ let you prove JIDRA's value on your own codebase.
Measures real token savings via Claude API calls — traditional raw source vs JIDRA context.
ANTHROPIC_API_KEY=... python validations/run_validation.py \
--graph /path/to/.jidra/graph_validated.jsonl \
--codebase /path/to/your-repo \
--methods "OrderController.createOrder,PaymentService.charge" \
--model claude-opus-4-7 \
--output validations/results.jsonTests whether JIDRA reduces hallucinations and model drift across 5 dimensions:
- Call graph accuracy — does the model correctly name what a method calls?
- Caller tracing — does the model correctly name what calls a method?
- Change impact — does the model correctly identify what breaks if a method changes?
- Unit test generation — do generated tests reference real symbols?
- Consistency/drift — does the model give the same answer twice in separate sessions?
# Pass your own methods inline
ANTHROPIC_API_KEY=... python validations/hallucination_test.py \
--graph /path/to/.jidra/graph_validated.jsonl \
--codebase /path/to/your-repo \
--methods "OrderController.createOrder,PaymentService.charge"
# Pass methods via file (one per line)
ANTHROPIC_API_KEY=... python validations/hallucination_test.py \
--graph /path/to/.jidra/graph_validated.jsonl \
--codebase /path/to/your-repo \
--methods-file my_methods.txt
# Auto-discover endpoints from the graph
ANTHROPIC_API_KEY=... python validations/hallucination_test.py \
--graph /path/to/.jidra/graph_validated.jsonl \
--codebase /path/to/your-repo \
--auto-discover --discover-limit 5
# Run specific tests only (e.g. unit test gen + drift)
--tests 4,5Methods file format (my_methods.txt):
# One selector per line — ClassName.methodName or fully qualified
OrderController.createOrder
PaymentService.charge
com.example.search.SearchController.search
Aggregate output:
hallucination_rate traditional=0.42 jidra=0.08 improvement=+81.0%
fabrication_rate traditional=0.35 jidra=0.06 improvement=+82.9%
drift_score traditional=0.28 jidra=0.04 improvement=+85.7%
# All tests
python -m pytest tests/ -v
# Cost calculator only
python -m pytest tests/test_cost_calculator.py -v
# Unit tests only (no graph file needed)
python -m pytest tests/test_cost_calculator.py -v -k "not real and not missing"See tests/README.md for the full test structure.
cli.pyhandles command orchestration only.llm_client.pyowns provider/config/use-metrics behavior.- graph extraction and graph format are intentionally unchanged.