Nitish-Naik/LogIQ

LogIQ — Distributed Logging & Data Processing System

A hands-on exploration of how production logging pipelines behave under real-world constraints.

LogIQ is a distributed, event-driven logging system designed to understand high-throughput ingestion, asynchronous processing, and failure handling in modern data pipelines.


Why I Built This

Most logging tools abstract away complexity. I wanted to deeply understand:

  • How logs flow through a distributed system
  • How to design reliable ingestion pipelines
  • How systems behave under failure and retries
  • How to handle backpressure in async pipelines
  • How to structure data for efficient querying and streaming

This project focuses on system behavior and tradeoffs, not just feature implementation.


Architecture Overview

Producers → Collector → Redis Streams → Processor → PostgreSQL → Query Service → Dashboard

Flow

  1. Applications send logs to the Collector
  2. Collector validates API keys and pushes events to Redis Streams
  3. Processor workers consume streams asynchronously
  4. Logs are persisted in PostgreSQL
  5. Query service provides REST + WebSocket APIs
  6. Dashboard streams and visualizes logs in real time
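The first steps of this flow can be sketched in memory, without Redis or PostgreSQL. This is a minimal illustration, not the project's code: `validKeys`, `stream`, `collect`, `processNext`, the `message` field, and the status codes are all hypothetical stand-ins.

```javascript
// In-memory stand-ins (assumptions): a key set instead of the auth lookup,
// and an array instead of a Redis stream.
const validKeys = new Set(["demo-key"]);
const stream = [];

// Steps 1-2: validate the API key, then append the event to the stream.
function collect(apiKey, event) {
  if (!validKeys.has(apiKey)) {
    return { status: 401, error: "invalid API key" };
  }
  if (typeof event?.message !== "string") {
    return { status: 400, error: "message is required" };
  }
  const entry = { id: `${Date.now()}-${stream.length}`, ...event };
  stream.push(entry);
  return { status: 202, id: entry.id };
}

// Steps 3-4: a worker takes the oldest entry and persists it.
// `store` is any array standing in for the database table.
function processNext(store) {
  const entry = stream.shift();
  if (entry) store.push(entry);
  return entry;
}
```

Because the collector only appends to the stream, it can answer the client before the write to storage has happened; that separation is what the later design notes build on.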

Key Design Decisions

Redis Streams (vs Kafka)

  • Simpler setup for a local-first system
  • Built-in support for consumer groups
  • Tradeoff: lower scalability ceiling compared to Kafka

Asynchronous Workers

  • Decouples ingestion from persistence
  • Enables retry logic and failure isolation
  • Improves system resilience under load
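To make "retry logic and failure isolation" concrete, here is a small sketch of a worker loop that retries each entry a bounded number of times and sets aside entries that keep failing (the README's operational notes mention a DLQ pattern in the processor). The function name, the `maxAttempts` default, and the entry shape are assumptions for illustration; the real processor reads from Redis consumer groups rather than an array.

```javascript
// Process entries one by one. An entry that still fails after maxAttempts
// is moved to a dead-letter list so it cannot block the rest of the stream.
function processWithRetry(entries, handler, maxAttempts = 3) {
  const acked = [];
  const deadLetters = [];
  for (const entry of entries) {
    let lastError = null;
    let done = false;
    for (let attempt = 1; attempt <= maxAttempts && !done; attempt++) {
      try {
        handler(entry, attempt);
        done = true;
      } catch (err) {
        lastError = err; // remember the failure and try again
      }
    }
    if (done) {
      acked.push(entry.id);
    } else {
      deadLetters.push({ entry, error: lastError.message });
    }
  }
  return { acked, deadLetters };
}
```

With Redis Streams, the acknowledgement step would correspond to `XACK` on the consumer group, and the dead-letter list could be a second stream; both mappings are suggestions, not a description of the repository.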

PostgreSQL Storage

  • Strong querying capabilities with indexing
  • Reliable and familiar storage model
  • Tradeoff: not optimized for extremely high write throughput

System Behavior

  • Supports concurrent ingestion and async processing
  • Designed for horizontal scalability via worker processes
  • Uses stream buffering to handle traffic spikes
  • Real-time log streaming via WebSockets
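Buffering absorbs a spike only up to some limit; past that limit the producer has to be slowed down or refused. A minimal in-memory sketch of that idea follows. `BoundedBuffer` is a hypothetical class, not something taken from the repository, which buffers in a Redis stream instead.

```javascript
// A fixed-capacity FIFO buffer. offer() refuses new items when full,
// which is the signal a producer can use to back off or return an error.
class BoundedBuffer {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
  }

  // Returns true if the item was accepted, false if the buffer is full.
  offer(item) {
    if (this.items.length >= this.capacity) return false;
    this.items.push(item);
    return true;
  }

  // Removes and returns the oldest item, or undefined when empty.
  poll() {
    return this.items.shift();
  }

  get size() {
    return this.items.length;
  }
}
```

Rejecting at the edge is one policy among several. Redis Streams can instead cap a stream with the `MAXLEN` option of `XADD`, which trims the oldest entries rather than refusing new ones.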

Challenges

  • Designing idempotent processing to handle retries
  • Preventing duplicate event writes
  • Managing backpressure in async systems
  • Balancing real-time streaming vs database consistency
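The first two challenges are closely related: if a write is keyed by a stable event id, a retried delivery becomes a no-op instead of a duplicate row. The sketch below shows the idea with an in-memory map; `LogStore` and `insertOnce` are hypothetical names, and the assumption that each event carries a unique `id` is mine, not the README's.

```javascript
// Idempotent writer: inserting the same event id twice stores it once.
class LogStore {
  constructor() {
    this.rows = new Map(); // event id -> event
  }

  // Returns true if the event was stored, false if it was a duplicate.
  insertOnce(event) {
    if (this.rows.has(event.id)) return false;
    this.rows.set(event.id, event);
    return true;
  }
}
```

In PostgreSQL, a unique constraint on the event id combined with `INSERT ... ON CONFLICT DO NOTHING` gives comparable behavior; whether the project does it this way is not stated here.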

Key Takeaways

  • Distributed systems must be designed for failure, not just success
  • Idempotency is essential once retries are introduced
  • Backpressure naturally emerges in async pipelines
  • Decoupling ingestion and processing improves reliability but adds complexity

This project was built to explore system behavior under load, not just to implement features.


Services

| Service | Purpose |
| --- | --- |
| Auth Service | API keys, authentication, user management |
| Collector | Log ingestion endpoint |
| Processor | Stream processing + persistence |
| Query Service | Query APIs + real-time streaming |
| Dashboard | UI for logs and monitoring |

Tech Stack

  • Backend: Node.js, Express
  • Queue: Redis Streams
  • Database: PostgreSQL
  • Frontend: React
  • Infrastructure: Docker

Quick Start

# Start infrastructure (Postgres + Redis)
docker compose up -d

# Start services (run each command from the repository root, in its own terminal;
# npm run dev stays in the foreground)
cd collector && npm install && npm run dev
cd processor && npm install && npm run dev
cd query-service && npm install && npm run dev
cd dashboard && npm install && npm run dev

What This Project Demonstrates

  • Event-driven architecture
  • Distributed system design fundamentals
  • Asynchronous data pipelines
  • Failure handling and retries
  • Real-time data streaming
  • Backend architecture tradeoffs

Future Improvements

  • Kafka-based ingestion for higher scalability
  • Stream partitioning and load balancing
  • Advanced indexing strategies
  • Rate limiting and ingestion throttling
  • Metrics + observability (Prometheus/Grafana)

Repository

GitHub: https://github.com/Nitish-Naik

The goal of the project is to provide a working end-to-end observability pipeline for learning and demos:

  • applications send logs to a collector
  • the collector validates the request and writes to Redis/PostgreSQL
  • the query service reads and streams logs for the dashboard
  • the auth service issues API keys and user/session tokens
  • the dashboard provides a UI for browsing logs and managing accounts

Architecture

Client apps -> Collector -> Redis / PostgreSQL -> Query service -> Dashboard
                    \
                     -> Auth service -> API key / auth flows

Services

| Service | Path | Default port | Purpose |
| --- | --- | --- | --- |
| Auth service | auth-service/ | 3003 | Signup, signin, token verification, password reset, API key-related auth flows |
| Collector | collector/ | 4000 | Receives log events and validates API keys |
| Query service | query-service/ | 5000 | REST + WebSocket log querying and live streaming |
| Dashboard | dashboard/ | 5173 | React UI for viewing logs and account pages |
| Sample apps | sample-app-requests/ | varies | Demo senders and integration examples |

Prerequisites

  • Node.js 18 or newer
  • npm
  • Docker and Docker Compose

Quick Start

  1. Start the infrastructure dependencies:
docker compose up -d

This starts PostgreSQL, Redis, Redis Commander, and pgAdmin.

  2. Install dependencies for each service you want to run:
cd auth-service && npm install
cd ../collector && npm install
cd ../query-service && npm install
cd ../dashboard && npm install
cd ../sample-app-requests && npm install
  3. Start the backend services in separate terminals:
cd auth-service && npm run dev
cd collector && npm run dev
cd query-service && npm run dev
  4. Start the dashboard:
cd dashboard && npm run dev
  5. Optionally run one of the sample apps to generate logs:
cd sample-app-requests && npm run app1

Environment Variables

Most services load a local .env file. The common variables are:

| Variable | Used by | Default / example |
| --- | --- | --- |
| PORT | auth service, collector, query service | 3003, 4000, or 5000 |
| DB_HOST | auth service, collector | localhost |
| DB_PORT | auth service, collector | 5432 |
| DB_NAME | auth service, collector | logsdb |
| DB_USER | auth service, collector | devlogs |
| DB_PASSWORD | auth service, collector | devlogs |
| JWT_SECRET | auth service | required for signed tokens in production |
| JWT_EXPIRES_IN | auth service | 24h |
| REDIS_URL | collector, query service | Redis connection string |
| REDIS_STREAM_KEY | collector | Redis stream key for incoming logs |
| PG_URL | query service | PostgreSQL connection string |
| COLLECTOR_URL | sample apps | usually http://localhost:4000/logs |
| LOG_API_KEY | sample apps | API key generated by the auth flow |
| LOG_API_KEY2 | sample app 2 | secondary demo key |
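As an illustration of how a service might read the `DB_*` variables, the sketch below builds a connection config with fallbacks. `loadDbConfig` is a hypothetical helper, and treating the listed values as fallbacks is an assumption; the services' actual handling may differ.

```javascript
// Build a database config from environment variables.
// The fallback values mirror the defaults/examples listed above (assumed).
function loadDbConfig(env = process.env) {
  return {
    host: env.DB_HOST ?? "localhost",
    port: Number(env.DB_PORT ?? 5432),
    database: env.DB_NAME ?? "logsdb",
    user: env.DB_USER ?? "devlogs",
    password: env.DB_PASSWORD ?? "devlogs",
  };
}
```

Passing the environment as a parameter keeps the function easy to test: `loadDbConfig({ DB_PORT: "6543" })` overrides only the port.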

Main Endpoints

Auth service

Base URL: http://localhost:3003/api/auth

  • POST /signup
  • POST /signin
  • POST /verify
  • GET /me
  • POST /refresh-token
  • POST /forgot-password
  • POST /reset-password
  • POST /verify-reset-token
  • GET /health (served at http://localhost:3003/health, outside the /api/auth base path)

Collector

Base URL: http://localhost:4000

  • POST /logs - API-key protected ingestion endpoint
  • GET / - simple service health response
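A client call to the ingestion endpoint could be assembled as follows. The `X-API-Key` header and the default URL come from this README; the helper name `buildLogRequest` and the event fields in the usage comment are assumptions, since the request body format is not documented here.

```javascript
// Build the URL and fetch options for POST /logs without sending anything.
function buildLogRequest(apiKey, event, collectorUrl = "http://localhost:4000/logs") {
  return {
    url: collectorUrl,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "X-API-Key": apiKey,
      },
      body: JSON.stringify(event),
    },
  };
}

// Usage (needs a running collector and a valid key; field names are illustrative):
// const { url, options } = buildLogRequest(process.env.LOG_API_KEY, {
//   level: "info",
//   message: "user signed in",
// });
// const res = await fetch(url, options);
```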

Query service

Base URL: http://localhost:5000

  • GET /logs - filtered log search
  • GET /logs/all - paginated log listing
  • GET /logs/count - total log count
  • GET /getOrgDetails - organization metadata endpoint used by the UI
  • WS /ws - live log stream for connected clients
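For the filtered search endpoint, a small helper can assemble the query string. The filter names used in the example (`level`, `limit`) are assumptions made for illustration; check the query service for the parameters it actually accepts.

```javascript
// Build a GET /logs URL from a plain object of filters.
// Undefined or null filters are skipped.
function buildLogsUrl(filters = {}, baseUrl = "http://localhost:5000") {
  const url = new URL("/logs", baseUrl);
  for (const [key, value] of Object.entries(filters)) {
    if (value !== undefined && value !== null) {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

// buildLogsUrl({ level: "error", limit: 50 })
// returns "http://localhost:5000/logs?level=error&limit=50"
```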

Sample Apps

The sample-app-requests/ folder contains demo senders and test harnesses:

  • npm run app1 - e-commerce style demo app
  • npm run app2 - alternate demo app using a second API key
  • npm run app3 - additional sample sender
  • npm run test-api-key - API key authentication test script

Before running a sample app, create a local .env file in that folder with your collector URL and API key.

Database and Admin Tools

The Docker Compose stack provides:

  • PostgreSQL on localhost:5432
  • Redis on localhost:6379
  • Redis Commander on http://localhost:8081
  • pgAdmin on http://localhost:8082

Default database credentials in the compose file:

  • user: devlogs
  • password: devlogs
  • database: logsdb

Project Layout

  • auth-service/ - user auth, password reset, token issuance
  • collector/ - API-key protected ingestion service
  • query-service/ - query API and live log broadcasting
  • dashboard/ - React frontend
  • sample-app-requests/ - demo senders and integration examples
  • db/ - schema files and database helpers
  • processor/ - stream processor entry point and services
  • docs/ - additional documentation

Troubleshooting

  • If the auth or collector service cannot connect to PostgreSQL, confirm docker compose up -d completed successfully.
  • If log ingestion fails, check REDIS_URL, REDIS_STREAM_KEY, and the X-API-Key header used by the client.
  • If the dashboard cannot load live logs, verify the query service is running and Redis is reachable.
  • If your API key is missing from a sample app, regenerate it from the auth flow and place it in the sample app .env file.

Next Steps

If you want to extend the system, the usual follow-ups are:

  1. Add a root-level .env.example for all services.
  2. Add unified start scripts so the stack can be launched from one command.
  3. Tighten the query-service and collector documentation around their request/response formats.

LogIQ — Distributed Local Logging Platform

LogIQ (Instant Dev Logs) is a self-hosted, modular observability stack designed for learning, demos, and local development. It demonstrates an end-to-end logging pipeline with secure ingestion, streaming, durable persistence, querying, and a UI.

This README is a high-level guide. For implementation details, see the service folders.

Key goals

  • Provide a simple, extensible logging pipeline for experimenting with ingestion, streaming, and querying.
  • Offer clear integration points for AI features (summaries, semantic search) while protecting privacy.
  • Demonstrate production-minded patterns: API keys, consumer groups, a dead-letter queue (DLQ), and migrations.

Architecture overview

Client apps → Collector (HTTP) → Redis stream → Processor → PostgreSQL → Query service → Dashboard

Auth service manages users and API keys used by clients to authenticate ingestion requests.

Services

| Service | Path | Default port | Purpose |
| --- | --- | --- | --- |
| Auth service | auth-service | 3003 | User/account management and API-key lifecycle |
| Collector | collector | 4000 | Accepts logs (API-key protected) and writes to Redis streams |
| Processor | processor | n/a | Consumes Redis streams, persists logs to Postgres, publishes live events |
| Query service | query-service | 5000 | Query API, WebSocket live stream, AI endpoints |
| Dashboard | dashboard | 5173 | React-based UI for browsing logs and management |
| Sample apps | sample-app-requests | n/a | Example apps and test scripts |

Quickstart (local)

  1. Start infrastructure:
docker compose up -d
  2. Apply DB migrations (creates tables and extensions):
# ensure PG_URL or DB_* env vars are set (see .env.example)
./db/migrate.sh
  3. (Optional) Seed a demo admin account and API key:
cd auth-service
npm install
npm run seed
  4. Launch services (convenience script):
chmod +x start-all.sh
./start-all.sh

Or run each service individually in the foreground for development (one terminal per service, starting from the repository root):

cd collector && npm install && npm run dev
cd auth-service && npm install && npm run dev
cd query-service && npm install && npm run dev
cd dashboard && npm install && npm run dev

Environment variables

Templates are provided per service.

Important variables

  • DB_* / PG_URL — Postgres connection
  • REDIS_URL — Redis connection string
  • REDIS_STREAM_KEY — Redis stream key used for logs
  • JWT_SECRET — auth service JWT signing secret
  • OPENAI_API_KEY — only required for AI features (summaries/embeddings)

Notable endpoints

  • Collector: POST /logs (requires X-API-Key or Authorization: Bearer <apiKey>)
  • Auth: POST /api/auth/signup, POST /api/auth/signin, POST /api/auth/keys, GET /api/auth/keys, DELETE /api/auth/keys/:id
  • Query: GET /logs, GET /logs/all, GET /logs/count, WS /ws
  • AI (Query service): POST /ai/summarize, POST /ai/semantic-search

AI features and privacy

The repository includes experimental AI features in query-service/ai:

  • Log summarization (/ai/summarize) — redacts common PII patterns before sending text to an external provider.
  • Semantic search (/ai/semantic-search) — uses embeddings and pgvector to rank similar logs.
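Ranking by embedding similarity can be illustrated in a few lines. This sketch computes cosine similarity in JavaScript over tiny hand-made vectors; in the project the comparison is delegated to pgvector (which exposes a cosine distance operator, `<=>`), so the function names and data below are illustrative assumptions only.

```javascript
// Cosine similarity of two equal-length, non-zero numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the topK documents most similar to the query vector,
// highest score first. Each doc is { id, embedding }.
function rankBySimilarity(queryVec, docs, topK = 3) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(queryVec, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Real embeddings have hundreds or thousands of dimensions, and scanning every row this way does not scale, which is the reason to push the ranking into the database.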

PII protection is implemented as best-effort, regex-based redaction. For production, review and harden the redaction rules and add an opt-in policy for external model usage.
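A rough idea of what regex-based redaction looks like is shown below. The two patterns (email addresses and IPv4-style addresses) and the placeholder labels are assumptions chosen for the example; they are not the rules shipped in query-service/ai, and they will miss many real-world PII formats.

```javascript
// Each rule replaces every match of its regex with a fixed placeholder.
const PII_PATTERNS = [
  { label: "[EMAIL]", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "[IP]", regex: /\b(?:\d{1,3}\.){3}\d{1,3}\b/g },
];

// Apply all rules in order and return the redacted text.
function redact(text) {
  return PII_PATTERNS.reduce(
    (out, { label, regex }) => out.replace(regex, label),
    text
  );
}

// redact("login by a@b.com from 10.0.0.1")
// returns "login by [EMAIL] from [IP]"
```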

Operational notes

  • The processor uses Redis consumer groups and a DLQ pattern for reliable ingestion; inspect processor/services/streamProcessor.js.
  • Rate limiting is enforced in the collector middleware (token-bucket style backed by Redis).
  • Migrations live under db/ and a simple runner db/migrate.sh is provided.
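The token-bucket idea behind the collector's rate limiting can be sketched in memory. The class below is an illustration under stated assumptions: it keeps state in a local object rather than Redis, and the names `TokenBucket` and `tryConsume` are hypothetical. Time is passed in explicitly so the behavior is easy to follow.

```javascript
// Token bucket: holds up to `capacity` tokens, refilled continuously at
// `refillPerSecond`. Each request consumes one token or is rejected.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full
    this.lastRefill = now;  // timestamp in milliseconds
  }

  // Returns true if a token was available at time `now`, false otherwise.
  tryConsume(now = Date.now()) {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

A bucket of capacity 2 refilling at 1 token per second allows a burst of two requests, rejects a third at the same instant, and allows one more a second later. Sharing the bucket state across collector instances is the part that needs Redis.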

Troubleshooting

  • If db/migrate.sh fails creating the vector extension, use a Postgres image with pgvector installed or install the extension in your DB.
  • If logs are not landing in Postgres, confirm the processor is running and Redis is reachable.

Contributing

Contributions are welcome. For changes:

  1. Fork the repository and create a feature branch.
  2. Add tests for non-trivial logic.
  3. Open a PR describing the change, rationale, and testing steps.

License

This repository is provided for educational and demo purposes. Add or check a license file if you intend to publish or distribute.
