Multi-stage, AI-driven XDR (Extended Detection and Response) pipeline that ingests security logs, maps them into a Graph+Vector hybrid database, calculates deterministic threat paths, and executes agentic mitigation.
- Database: Supabase with Postgres + pgvector for hybrid graph+vector storage
- RAG Pipeline: Hybrid search combining topological graph queries and semantic search
- Knowledge Base: Curated threat intelligence with 12+ attack patterns and mitigation strategies
- API: FastAPI backend with endpoints for log ingestion, threat analysis, and graph queries
cd Argus-XDR
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Copy example environment file
cp .env.example .env
# Edit .env with your Supabase credentials
# SUPABASE_URL=https://your-project.supabase.co
# SUPABASE_KEY=your-anon-key
# Apply migrations to Supabase
# 1. Go to Supabase console -> SQL Editor
# 2. Create a new query
# 3. Copy contents of db_migrations/01_init_schema.sql
# 4. Execute the query
# Start the server
python -m uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
# Server will be available at http://localhost:8000
# API docs at http://localhost:8000/docs
# OpenAPI spec at http://localhost:8000/openapi.json

- GET /health - Service health status
- GET /ready - Readiness check with component verification
- POST /api/ingest/logs - Ingest batch security logs
- POST /api/ingest/cloud-audit - Ingest cloud provider audit logs
- GET /api/ingest/status - Ingestion pipeline status
- POST /api/threats/analyze - Analyze security alert using hybrid retrieval
- GET /api/threats/patterns/{threat_id} - Get similar past threats
- GET /api/threats/graph/{entity_id} - Get entity's connected graph
- GET /api/threats/entities - List all threat entities
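
The POST endpoints listed above take JSON bodies, so they can also be called from Python. The following is a minimal sketch using only the standard library. It assumes the server runs at http://localhost:8000 and that responses are JSON; `build_post` and `send` are hypothetical helper names, not part of the project, and the response schema is not documented here.

```python
import json
import urllib.request

# Assumption: local development server, as started with uvicorn on port 8000.
BASE_URL = "http://localhost:8000"


def build_post(path, payload):
    """Build (but do not send) a JSON POST request for the given API path."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and decode the body, assuming the API replies with JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Request body fields mirror the documented /api/threats/analyze example.
analyze_request = build_post(
    "/api/threats/analyze",
    {
        "query": "What is suspicious activity from IP 192.168.1.100?",
        "top_k": 5,
        "include_topological": True,
    },
)

# Uncomment once the server is running:
# print(send(analyze_request))
```

Building the request separately from sending it keeps the payload easy to inspect or log before anything goes over the network.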
curl -X POST http://localhost:8000/api/ingest/logs \
-H "Content-Type: application/json" \
-d '{
"logs": [
{
"timestamp": "2024-04-13T10:00:00Z",
"event_type": "login",
"severity": "info",
"source_ip": "192.168.1.100",
"source_user": "admin",
"target_resource": "server1",
"action": "login",
"metadata": {}
}
],
"source_type": "custom_json"
}'
curl -X POST http://localhost:8000/api/threats/analyze \
-H "Content-Type: application/json" \
-d '{
"query": "What is suspicious activity from IP 192.168.1.100?",
"top_k": 5,
"include_topological": true
}'
argus-xdr/
├── db_migrations/
│ └── 01_init_schema.sql # Database schema with pgvector
├── backend/
│ ├── core/
│ │ ├── config.py # Configuration management
│ │ └── database.py # Supabase client & operations
│ ├── rag/
│ │ ├── embedder.py # Vector embeddings (MiniLM)
│ │ ├── retriever.py # Hybrid search with RRF
│ │ └── knowledge.py # Threat intelligence KB
│ ├── pipeline/
│ │ ├── parser.py # Log parsing & normalization
│ │ └── graph_builder.py # Entity extraction & graph construction
│ ├── api/
│ │ ├── routes_ingest.py # Log ingestion endpoints
│ │ └── routes_threats.py # Threat analysis endpoints
│ └── main.py # FastAPI application
├── tests/ # Unit & integration tests
├── requirements.txt # Python dependencies
├── .env.example # Configuration template
└── README.md # This file
- Logs: Raw security events with structured JSON
- Nodes: Unique entities (IP, User, Host, File, Process, Domain, URL) with embeddings
- Edges: Relationships between entities with confidence scores
- Embeddings: Vector representations of log payloads and threat summaries
- Knowledge: Curated threat intelligence entries with embeddings
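
To make the node and edge model concrete, here is a rough sketch of how a single log record might be turned into entities and relationships. It is illustrative only: the `Node` and `Edge` classes, the `extract_graph` function, the `authenticated_as` relation name, and the fixed confidence of 1.0 are all assumptions, and the actual logic in `graph_builder.py` may differ. Embeddings are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    entity_type: str  # e.g. "IP", "User", "Host"
    value: str


@dataclass(frozen=True)
class Edge:
    source: Node
    target: Node
    relation: str
    confidence: float  # hypothetical 0.0-1.0 score


def extract_graph(log):
    """Derive nodes and edges from one log record (hypothetical mapping)."""
    ip = Node("IP", log["source_ip"])
    user = Node("User", log["source_user"])
    host = Node("Host", log["target_resource"])
    edges = [
        # Assumed relation name linking the source IP to the acting user.
        Edge(ip, user, "authenticated_as", 1.0),
        # The log's own action field labels the user-to-host relationship.
        Edge(user, host, log["action"], 1.0),
    ]
    return [ip, user, host], edges


sample_log = {
    "timestamp": "2024-04-13T10:00:00Z",
    "event_type": "login",
    "severity": "info",
    "source_ip": "192.168.1.100",
    "source_user": "admin",
    "target_resource": "server1",
    "action": "login",
    "metadata": {},
}

nodes, edges = extract_graph(sample_log)
for edge in edges:
    print(f"{edge.source.value} -[{edge.relation}]-> {edge.target.value}")
```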
- Topological Search: SQL queries on graph structure
- Semantic Search: pgvector similarity queries
- Keyword Search: BM25-based full-text search
- RRF Merging: Reciprocal Rank Fusion to combine results
- Cross-Encoder Reranking: Final ranking with cross-encoder model
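
The RRF merging step can be sketched as follows. Reciprocal Rank Fusion scores each item by summing 1 / (k + rank) over every ranked list in which it appears. This is a generic version of the technique, not necessarily what `retriever.py` does: the function name, the constant k = 60 (a commonly used default), and the sample node IDs are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list of (id, score) pairs.

    Each list contributes 1 / (k + rank) per item, with ranks starting at 1.
    Ties are broken by ID so the output order is deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical result lists from the three retrieval modes.
topological = ["n1", "n2", "n3"]
semantic = ["n2", "n1", "n4"]
keyword = ["n2", "n5"]

fused = reciprocal_rank_fusion([topological, semantic, keyword])
fused_ids = [item_id for item_id, _ in fused]
print(fused_ids)
```

Because RRF uses only ranks, the raw scores of the SQL, pgvector, and BM25 searches never need to be normalized against each other. The fused list would then be handed to the cross-encoder for final reranking.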
- 12+ attack patterns (lateral movement, exfiltration, privilege escalation, persistence, defense evasion)
- 2+ detection rules for identifying suspicious activities
- 2+ mitigation strategies for containment and hardening
- Agent orchestrator with ReAct loop
- Graph traversal algorithms (BFS, Dijkstra) for attack path detection
- LLM-based threat summarization and reporting
- Automated response actions
- Dashboard and visualization
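
As an illustration of the planned graph traversal, a breadth-first search over directed entity edges finds a path with the fewest hops between two entities. This is a sketch under assumptions, not the project's implementation: the function name, the `type:value` node labels, and the sample edges are hypothetical, and edge confidence scores are ignored (weighting by them is where Dijkstra would come in).

```python
from collections import deque


def bfs_attack_path(edges, source, target):
    """Return a fewest-hop path from source to target, or None if unreachable."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    parents = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical directed edges between entities.
sample_edges = [
    ("ip:192.168.1.100", "user:admin"),
    ("user:admin", "host:server1"),
    ("host:server1", "file:secrets"),
    ("ip:192.168.1.100", "host:server2"),
]

path = bfs_attack_path(sample_edges, "ip:192.168.1.100", "file:secrets")
print(" -> ".join(path))
```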
pytest tests/ -v
# Format code
black backend/
# Sort imports
isort backend/
# Lint
flake8 backend/

- Keep .env file out of version control
- Use service role key only in secure backend environments
- Implement proper authentication before production deployment
- Enable network segmentation for database access
- Rotate API keys regularly
Proprietary - Argus XDR Security Pipeline
For issues and questions, contact the security team.