A distributed log processing and observability backend that ingests logs asynchronously, processes them through worker services, stores them reliably, and uses an LLM-powered AI agent to analyze logs and generate debugging insights.
Modern distributed applications generate massive volumes of logs across multiple services. Traditional logging systems often struggle with:
- High-throughput ingestion
- Fault-tolerant processing
- Real-time debugging assistance
- Centralized observability
- Intelligent error analysis
This project solves these problems by building a scalable event-driven logging system with asynchronous processing and AI-powered debugging assistance.
The system:
- Collects logs from services through REST APIs
- Queues logs using RabbitMQ
- Processes logs asynchronously through workers
- Stores logs in PostgreSQL
- Uses Redis for caching and session memory
- Integrates Llama-3 via Groq for AI-based log analysis
+------------------+
| Client Services |
| / Applications |
+--------+---------+
|
v
+--------------------+
| FastAPI Ingestion |
| Service |
+---------+----------+
|
v
+-------------+
| RabbitMQ |
| Message Bus |
+------+------+
|
+-----------------+------------------+
| |
v v
+----------------------+ +----------------------+
| Log Processing Worker| | Retry / Failure Queue|
| (.NET / C#) | | |
+----------+-----------+ +----------------------+
|
v
+----------------------+
| PostgreSQL Database |
| Log Persistence |
+----------+-----------+
|
v
+----------------------+
| AI Debugging Agent |
| (Llama-3 via Groq) |
+----------+-----------+
|
+------+------+
| Redis Cache |
| Session Mem |
+-------------+
Tech stack:
- Python
- FastAPI
- C#
- .NET
- RabbitMQ
- PostgreSQL
- Redis
- Llama-3
- Groq API
- REST APIs
Applications send logs to the FastAPI ingestion service through REST APIs.
The ingestion service publishes logs to RabbitMQ queues for asynchronous processing.
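The ingestion hop can be sketched as follows. This is illustrative only: the queue name, field names, and defaults are assumptions, not taken from the repository.

```python
# Illustrative sketch of the ingestion hop: normalize a log record,
# then publish it to a durable RabbitMQ queue.
import json
from datetime import datetime, timezone

QUEUE_NAME = "logs"  # assumed queue name

def normalize_log(raw: dict) -> dict:
    """Fill defaults so workers receive a uniform record."""
    return {
        "service": raw.get("service", "unknown"),
        "level": raw.get("level", "INFO").upper(),
        "message": raw.get("message", ""),
        "timestamp": raw.get("timestamp")
        or datetime.now(timezone.utc).isoformat(),
    }

def publish_log(raw: dict, url: str = "amqp://guest:guest@localhost:5672/") -> None:
    # pika imported lazily so the pure helper above runs without a broker client
    import pika
    with pika.BlockingConnection(pika.URLParameters(url)) as conn:
        channel = conn.channel()
        channel.queue_declare(queue=QUEUE_NAME, durable=True)
        channel.basic_publish(
            exchange="",
            routing_key=QUEUE_NAME,
            body=json.dumps(normalize_log(raw)),
            properties=pika.BasicProperties(delivery_mode=2),  # persist message
        )
```

In the FastAPI service, a POST handler would call `publish_log` with the request body and return immediately, which is what makes ingestion asynchronous.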
Worker services consume logs from queues and process them independently.
This ensures:
- Scalability
- Decoupled architecture
- Fault tolerance
- Better throughput
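The project's worker is written in C#/.NET; the consume/ack/nack contract it follows can be illustrated compactly in Python with pika. The queue name and the `store_log` stub are placeholders, not the repository's actual code.

```python
# Illustration of the worker's consume loop: ack on success,
# reject (no requeue) on failure so retry handling can take over.
import json

def store_log(record: dict) -> None:
    # Placeholder for the PostgreSQL insert done by the real worker.
    print(f"stored: {record['service']} {record['level']}")

def handle_message(channel, method, properties, body) -> None:
    try:
        store_log(json.loads(body))
        channel.basic_ack(delivery_tag=method.delivery_tag)
    except Exception:
        # Dead-letter the message instead of requeueing it in place.
        channel.basic_nack(delivery_tag=method.delivery_tag, requeue=False)

def run_worker(url: str = "amqp://guest:guest@localhost:5672/") -> None:
    import pika  # lazy import: handle_message is testable without a broker
    conn = pika.BlockingConnection(pika.URLParameters(url))
    ch = conn.channel()
    ch.queue_declare(queue="logs", durable=True)
    ch.basic_qos(prefetch_count=10)  # bound unacked messages per worker
    ch.basic_consume(queue="logs", on_message_callback=handle_message)
    ch.start_consuming()
```

Scaling out is then a matter of running more worker processes against the same queue; `prefetch_count` keeps any one worker from hoarding messages.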
Processed logs are stored in PostgreSQL for querying and analysis.
Failed log processing attempts are retried through retry queues.
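One common way to wire such a retry path is RabbitMQ dead-lettering: rejected messages land in a retry queue whose TTL sends them back to the main queue. A sketch, where the queue names and the 30-second delay are assumptions:

```python
# Retry topology via dead-letter arguments: "logs" rejects into
# "logs.retry"; after the TTL expires, "logs.retry" dead-letters
# the message back into "logs" for another attempt.
def declare_retry_topology(channel) -> None:
    channel.queue_declare(
        queue="logs",
        durable=True,
        arguments={
            "x-dead-letter-exchange": "",
            "x-dead-letter-routing-key": "logs.retry",
        },
    )
    channel.queue_declare(
        queue="logs.retry",
        durable=True,
        arguments={
            "x-message-ttl": 30_000,  # hold for 30 s before retrying
            "x-dead-letter-exchange": "",
            "x-dead-letter-routing-key": "logs",
        },
    )
```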
The AI agent:
- Fetches logs through APIs
- Understands system errors
- Generates debugging insights
- Explains possible root causes
- Suggests fixes
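Groq exposes an OpenAI-compatible chat-completions endpoint, so the agent's analysis call can be sketched with the standard library alone. The model id, prompt wording, and log fields below are assumptions, not the repository's actual configuration.

```python
# Sketch of the agent's analysis call against Groq's
# OpenAI-compatible chat-completions API.
import json
import os
import urllib.request

GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_prompt(question: str, logs: list) -> list:
    """Pack recent log lines into the chat context."""
    excerpt = "\n".join(
        f"{l['timestamp']} {l['level']} {l['service']}: {l['message']}"
        for l in logs
    )
    return [
        {"role": "system",
         "content": "You are a debugging assistant. Analyze the logs and "
                    "suggest likely root causes and fixes."},
        {"role": "user", "content": f"{question}\n\nLogs:\n{excerpt}"},
    ]

def analyze(question: str, logs: list) -> str:
    payload = {
        "model": "llama3-70b-8192",  # assumed model id
        "messages": build_prompt(question, logs),
    }
    req = urllib.request.Request(
        GROQ_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['GROQ_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```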
Redis is used for:
- Response caching
- Session memory
- Faster repeated queries
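The caching path can be sketched as a wrapper around the analysis call: identical questions hit Redis instead of the LLM. The key prefix and one-hour TTL are assumptions.

```python
# Response caching sketch: hash the question into a Redis key,
# return the cached answer on a hit, store with a TTL on a miss.
import hashlib

def cache_key(question: str) -> str:
    return "ai:answer:" + hashlib.sha256(question.encode()).hexdigest()

def cached_analyze(question, analyze_fn,
                   redis_url="redis://localhost:6379/0", ttl=3600):
    import redis  # lazy import: cache_key above needs no Redis client
    r = redis.Redis.from_url(redis_url)
    key = cache_key(question)
    hit = r.get(key)
    if hit is not None:
        return hit.decode()          # repeated query served from cache
    answer = analyze_fn(question)    # cache miss: call the LLM
    r.setex(key, ttl, answer)
    return answer
```

Session memory works the same way, with conversation turns stored under a per-session key instead of a per-question hash.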
Key features:
- Distributed log ingestion architecture
- Event-driven asynchronous processing
- RabbitMQ-based decoupled communication
- Retry and failure queue handling
- PostgreSQL log persistence
- AI-powered debugging assistant
- Redis caching and conversational memory
- REST API integration
- Scalable worker-based design
- Fault-tolerant processing pipeline
Clone the repository:
git clone https://github.com/yourusername/repository-name.git
cd repository-name

Ingestion service:
cd ingestion-service
python -m venv venv
# Windows
venv\Scripts\activate
# Linux/Mac
source venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload

Worker service:
cd worker-service
dotnet restore
dotnet run

RabbitMQ using Docker:
docker run -d \
--hostname rabbitmq \
--name rabbitmq \
-p 5672:5672 \
-p 15672:15672 \
rabbitmq:3-management

RabbitMQ Dashboard: http://localhost:15672
Default credentials:
username: guest
password: guest
Create database:
CREATE DATABASE logsdb;

Update the connection string in the configuration files.
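An illustrative schema for the logs table, with column names assumed rather than taken from the repository:

```python
# Hypothetical logs table DDL plus a helper that runs it on an open
# psycopg2/psycopg connection. Column names are assumptions.
LOGS_TABLE_DDL = """
CREATE TABLE IF NOT EXISTS logs (
    id         BIGSERIAL PRIMARY KEY,
    service    TEXT        NOT NULL,
    level      TEXT        NOT NULL,
    message    TEXT        NOT NULL,
    timestamp  TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS idx_logs_service_ts ON logs (service, timestamp);
"""

def ensure_schema(conn) -> None:
    """Create the table and index if they do not exist yet."""
    with conn.cursor() as cur:
        cur.execute(LOGS_TABLE_DDL)
    conn.commit()
```

The composite index supports the most common query shape here: recent logs for one service ordered by time.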
Redis using Docker:
docker run -d -p 6379:6379 redis

Create .env:
DATABASE_URL=your_postgres_url
RABBITMQ_URL=your_rabbitmq_url
REDIS_URL=your_redis_url
GROQ_API_KEY=your_api_key

Add screenshot here:
screenshots/log-ingestion.png

Add screenshot here:
screenshots/rabbitmq-dashboard.png

Add screenshot here:
screenshots/postgres-logs.png

Add screenshot here:
screenshots/ai-analysis.png

Future enhancements:
- Real-time log streaming dashboard
- Kubernetes deployment support
- OpenTelemetry integration
- Elasticsearch support
- Grafana visualization
- Role-based authentication
- Multi-tenant architecture
- AI anomaly detection
- Alerting system
- Vector database integration for semantic log search
- Distributed tracing support
- Docker Compose production setup
Example AI agent interaction:
"Why are payment requests failing with HTTP 500 errors?"
Possible root cause:
- Database connection pool exhaustion
Detected patterns:
- Increased timeout exceptions
- Spike in failed queries
Suggested fixes:
- Increase DB pool size
- Add retry logic
- Optimize slow queries
Your Name
GitHub: https://github.com/yourusername