NIO-Weaver is a graph-augmented analytics pipeline that converts natural-language CAN analysis requests into executable SQL workflows.
| Item | Description |
|---|---|
| Domain | Vehicle CAN signal analytics |
| Input | Natural-language analysis request |
| Output | Parsed signals, intermediate artifacts, and executable SQL |
| Core stack | LLM + embedding retrieval + graph reasoning + SQL generation |
- Architecture
- Project Structure
- Requirements
- Quick Start
- Configuration Guide
- Common Workflows
- Outputs
- Troubleshooting
- Build a domain graph from signal metadata.
- Parse user intent and relevant entities from a query.
- Retrieve candidate signals and context.
- Generate SQL with decomposition and iterative refinement.
- Optionally run candidate voting to select the best final SQL.
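The optional voting step can be illustrated with a small stand-in. The sketch below is not the project's voting logic (which is driven by the prompts in `finalvote.yaml`); it assumes a plain majority vote over whitespace- and case-normalized SQL strings. The helper names `normalize_sql` and `vote_on_candidates` are hypothetical.

```python
import re
from collections import Counter


def normalize_sql(sql: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase the text.

    Used only for comparing candidates; the original text is what gets returned.
    """
    collapsed = re.sub(r"\s+", " ", sql.strip().rstrip(";").strip())
    return collapsed.lower()


def vote_on_candidates(candidates: list[str]) -> str:
    """Return the first candidate whose normalized form is the most common."""
    if not candidates:
        raise ValueError("no candidate SQL to vote on")
    counts = Counter(normalize_sql(c) for c in candidates)
    winner, _ = counts.most_common(1)[0]
    return next(c for c in candidates if normalize_sql(c) == winner)
```

A textual majority vote only groups candidates that differ in formatting. Semantically equivalent queries written differently would still count as separate candidates, which is one reason an LLM-based evaluation may be preferable.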
NIO-Weaver/
├─ LLMAsAnalyst.py            # Main orchestration class
├─ main.py                    # Example pipeline script
├─ prompts/
│  └─ prompt/
│     ├─ prompt.yaml          # Core prompts
│     ├─ decompose.yaml       # SQL decomposition prompts
│     └─ finalvote.yaml       # Candidate SQL evaluation prompts
├─ src/
│  ├─ config.yaml             # Runtime configuration
│  ├─ graphbuild.py           # Graph building pipeline
│  ├─ signalparser.py         # Query/signal parsing pipeline
│  ├─ sqlgenerate.py          # SQL generation pipeline
│  ├─ Models/                 # LLM + embedding implementations
│  ├─ GraphBuilderModules/    # Graph-building submodules
│  ├─ SignalSelectModules/    # Signal-retrieval submodules
│  └─ utils/                  # Shared utilities
└─ README.md
- Python 3.10+ (recommended)
- Neo4j
- LLM provider credentials
Python packages: pyyaml, python-dotenv, pandas, tqdm, openai, anthropic, tiktoken, sentence-transformers, aiohttp, neo4j
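A quick preflight check can report which of these packages are not importable before the pipeline is started. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical, and the mapping reflects the usual import names of the listed distributions (for example, `pyyaml` is imported as `yaml`).

```python
from importlib.util import find_spec

# Distribution name -> top-level import name.
IMPORT_NAMES = {
    "pyyaml": "yaml",
    "python-dotenv": "dotenv",
    "pandas": "pandas",
    "tqdm": "tqdm",
    "openai": "openai",
    "anthropic": "anthropic",
    "tiktoken": "tiktoken",
    "sentence-transformers": "sentence_transformers",
    "aiohttp": "aiohttp",
    "neo4j": "neo4j",
}


def missing_packages(import_names: dict[str, str] = IMPORT_NAMES) -> list[str]:
    """Return the distribution names whose import module cannot be found."""
    return [
        package
        for package, module in import_names.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that each module can be located; it does not verify versions or that the import succeeds at runtime.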
from LLMAsAnalyst import LLMAsAnalyst
analyst = LLMAsAnalyst()
analyst.buildgraph()

query = "Analyze MAI activation events under specific driving conditions."
analyst.signalparse(query)
analyst.batch_generate_sql()
analyst.batch_iterative_sql_generation()
analyst.batch_final_sql_vote()

Configuration file: src/config.yaml
| Section | Purpose |
|---|---|
| `llm.*` | LLM provider/model/API settings |
| `embedding_model.*` | Embedding provider/model/API settings |
| `neo4j.*` | Graph database connection |
| `entity_extraction` | Entity extraction behavior |
| `entity_linking` | Entity linking behavior |
| `signal_parse` | Signal parsing and retrieval settings |
| `sql_generate` | SQL generation settings |
| `log_files.*` | Log output paths |
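To catch an incomplete configuration early, the top-level sections in the table can be checked after loading. This is a minimal sketch that assumes the file has already been parsed into a dictionary (for example with `yaml.safe_load` from PyYAML) and that each section in the table appears as a top-level key; the helper name `missing_sections` is hypothetical and not part of the project.

```python
# Top-level sections taken from the table above.
REQUIRED_SECTIONS = (
    "llm",
    "embedding_model",
    "neo4j",
    "entity_extraction",
    "entity_linking",
    "signal_parse",
    "sql_generate",
    "log_files",
)


def missing_sections(config: dict) -> list[str]:
    """Return the required sections that are absent or set to None."""
    return [
        name
        for name in REQUIRED_SECTIONS
        if name not in config or config[name] is None
    ]


# Example with a deliberately incomplete, already-parsed config.
example_config = {"llm": {"provider": "openai"}, "neo4j": {}}
print(missing_sections(example_config))
```

The check covers only the presence of sections. Which keys each section needs depends on the project's loader and is not assumed here.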
Before running:
- Fill in `api_key` and `model` in the config.
- Confirm the Neo4j connection settings.
- Keep prompt files under `prompts/prompt/` unless you also update the loader logic.
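The prompt-file item in this checklist can be verified with a short script. The sketch below assumes the three prompt files listed in the project structure and that it is given the repository root; the helper name `missing_prompt_files` is hypothetical.

```python
from pathlib import Path

# File names taken from the project structure listing.
EXPECTED_PROMPTS = ("prompt.yaml", "decompose.yaml", "finalvote.yaml")


def missing_prompt_files(repo_root: Path) -> list[str]:
    """Return the expected prompt files not found under prompts/prompt/."""
    prompt_dir = repo_root / "prompts" / "prompt"
    return [
        name for name in EXPECTED_PROMPTS if not (prompt_dir / name).is_file()
    ]


if __name__ == "__main__":
    missing = missing_prompt_files(Path.cwd())
    if missing:
        print("Missing prompt files:", ", ".join(missing))
    else:
        print("All prompt files found.")
```

This confirms only that the files exist. It does not validate their YAML syntax.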
analyst.batch_process_queries(
    input_file="data/input/queries.xlsx",
    checkpoint_dir="data/output/batch_results/checkpoints"
)

analyst.batch_debug_queries(
    input_file="data/input/queries.xlsx",
    checkpoint_dir="data/process/debug_checkpoints"
)

result = analyst.single_iterative_sql_generation(
    query_index=0,
    retry_times=5,
    max_iterations=3,
    top_k_additional=6
)

Typical outputs are generated under data/output/:
- parsed signal artifacts
- SQL generation results
- iterative checkpoints
- evaluation/voting outputs
Logs are written under logs/ according to src/config.yaml.
- Prompt YAML files are runtime dependencies. Keep syntax valid.
- Batch APIs with checkpointing are recommended for long jobs.
- Validate generated SQL in your target engine before production usage.
- In production, avoid logging raw user-sensitive content.
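The checkpointing recommendation can be illustrated with a generic resume pattern. This is a sketch of the general idea, not the project's checkpoint format: it assumes results are stored in a single JSON file keyed by query index, and the function name `run_with_checkpoints` and its `process` callback are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable


def run_with_checkpoints(
    queries: list[str],
    process: Callable[[str], str],
    checkpoint_path: Path,
) -> dict[str, str]:
    """Process queries in order, saving after each so a rerun can resume."""
    results: dict[str, str] = {}
    if checkpoint_path.exists():
        results = json.loads(checkpoint_path.read_text(encoding="utf-8"))

    checkpoint_path.parent.mkdir(parents=True, exist_ok=True)
    for index, query in enumerate(queries):
        key = str(index)
        if key in results:
            continue  # Finished in an earlier run; skip the expensive call.
        results[key] = process(query)
        # Write to a temporary file, then swap it in, so an interrupted
        # write does not leave a truncated checkpoint behind.
        tmp_path = checkpoint_path.with_suffix(".tmp")
        tmp_path.write_text(json.dumps(results, indent=2), encoding="utf-8")
        tmp_path.replace(checkpoint_path)
    return results
```

Saving after every query trades some I/O for the guarantee that at most one query's work is lost on interruption, which matters when each query involves several LLM calls.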
| Symptom | Likely Cause | Action |
|---|---|---|
| `ModuleNotFoundError` | Missing dependency | Install required Python packages |
| Neo4j connection failed | Invalid `neo4j.*` settings | Verify URI/user/password |
| LLM request failed | Invalid provider or API config | Check llm.provider/model/api_key/api_base |
| Prompt loading failed | Missing prompt files | Confirm prompts/prompt/*.yaml exists |