Crymzix/Sieve

Sieve

Real-time data monitoring platform that turns natural language into intelligent alerts.

Overview

Sieve connects to live data streams and uses AI to generate production-grade Flink SQL from plain English descriptions. When your conditions are met, you receive real-time alerts via WebSocket or SSE.

Example: "Alert me when Bitcoin drops more than 5% in 10 minutes" → Sieve generates Flink SQL, deploys it to Confluent, and streams alerts to you.

Features

  • 17+ Preconfigured Data Sources — Crypto exchanges (Coinbase, Binance), flight trackers (OpenSky), transit systems (MTA, Citi Bike), financial data (SEC EDGAR, Yahoo Finance), and more
  • Natural Language to SQL — Describe what you want to monitor; AI generates the Flink SQL
  • Real-time Alerts — WebSocket and SSE delivery with sub-second latency
  • AI Summaries — Time-windowed summaries provide context and trends
  • Public Streaming API — Embed alerts in any application with simple SSE endpoints

Architecture

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Frontend     │────▶│    API Server   │────▶│  Confluent      │
│    (Next.js)    │     │    (NestJS)     │     │  Kafka + Flink  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                               │                        │
                               ▼                        ▼
                        ┌─────────────────┐     ┌─────────────────┐
                        │     Worker      │────▶│  Data Sources   │
                        │    (NestJS)     │     │  (17+ feeds)    │
                        └─────────────────┘     └─────────────────┘
  • API Server (port 1888) — HTTP endpoints, WebSocket/SSE, AI SQL generation
  • Worker (port 1889) — Connects to data sources, produces to Kafka
  • Single Codebase: the APP_MODE environment variable switches between API and Worker modes
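
The mode switch described above could be resolved at startup roughly as follows. This is a minimal sketch, not the project's actual bootstrap code: `resolveRuntimeConfig`, `isAppMode`, and `RuntimeConfig` are hypothetical names, and falling back to ports 1888/1889 when PORT is unset is an assumption based on the ports listed above. Handling for running both modes in one process is not covered.

```typescript
type AppMode = "api" | "worker";

interface RuntimeConfig {
  mode: AppMode;
  port: number;
}

// Default ports taken from the architecture notes (assumed fallbacks).
const DEFAULT_PORTS: Record<AppMode, number> = { api: 1888, worker: 1889 };

// Type guard so the rest of the code works with a narrowed AppMode.
function isAppMode(value: string | undefined): value is AppMode {
  return value === "api" || value === "worker";
}

// Hypothetical helper: reads APP_MODE and PORT from an env-like object.
function resolveRuntimeConfig(
  env: Record<string, string | undefined>,
): RuntimeConfig {
  const mode = env.APP_MODE;
  if (!isAppMode(mode)) {
    throw new Error(`APP_MODE must be "api" or "worker", got: ${String(mode)}`);
  }
  const port = env.PORT ? Number(env.PORT) : DEFAULT_PORTS[mode];
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`PORT must be a positive integer, got: ${env.PORT}`);
  }
  return { mode, port };
}
```

Passing the environment in as a plain object (for example `resolveRuntimeConfig(process.env)`) keeps the helper easy to test without touching the real process environment.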

Tech Stack

Backend:

  • NestJS 11, TypeScript
  • PostgreSQL (Neon) + Drizzle ORM
  • Redis (Upstash) for distributed coordination
  • Apache Kafka + Flink (Confluent Cloud)
  • Google Gemini AI for SQL generation

Frontend:

  • Next.js 16, React 19
  • Tailwind CSS 4
  • Zustand, TanStack Query
  • Socket.io client

Getting Started

Prerequisites

  • Bun (backend)
  • pnpm (frontend)
  • PostgreSQL database (or Neon account)
  • Redis instance (or Upstash account)
  • Confluent Cloud account (for Kafka + Flink)
  • Google AI API key (for Gemini)

Environment Variables

Create a .env file in each of the backend/ and frontend/ directories:

backend/.env:

DATABASE_URL=postgresql://...
REDIS_URL=redis://...
CONFLUENT_API_KEY=...
CONFLUENT_API_SECRET=...
CONFLUENT_BOOTSTRAP_SERVERS=...
GOOGLE_AI_API_KEY=...
BETTER_AUTH_SECRET=...
APP_MODE=api  # or 'worker'
PORT=1888     # 1889 for worker

frontend/.env:

NEXT_PUBLIC_API_URL=http://localhost:1888
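
A quick startup check can catch a missing backend variable before any connection is attempted. The sketch below assumes every variable listed in backend/.env is required, which the README does not state explicitly; `findMissingEnvVars` is a hypothetical helper, not part of the codebase.

```typescript
// Variable names copied from the backend/.env listing above.
const REQUIRED_BACKEND_VARS = [
  "DATABASE_URL",
  "REDIS_URL",
  "CONFLUENT_API_KEY",
  "CONFLUENT_API_SECRET",
  "CONFLUENT_BOOTSTRAP_SERVERS",
  "GOOGLE_AI_API_KEY",
  "BETTER_AUTH_SECRET",
] as const;

// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_BACKEND_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

At startup this could be called as `findMissingEnvVars(process.env)`, failing fast with the list of missing names if the result is non-empty.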

Installation

# Install backend dependencies
cd backend
bun install

# Install frontend dependencies
cd ../frontend
pnpm install

Development

Run all services locally:

# Terminal 1: API server
cd backend
bun run start:api

# Terminal 2: Worker
cd backend
bun run start:worker

# Terminal 3: Frontend
cd frontend
pnpm dev

Or run both backend modes together:

cd backend
bun run start:both

Database

cd backend

# Generate migrations
bun run db:generate

# Push schema to database
bun run db:push

# Open Drizzle Studio
bun run db:studio

Deployment

The backend is deployed on Render as two separate services:

  • API Service: APP_MODE=api
  • Worker Service: APP_MODE=worker

Both services use the same codebase and Docker image, differentiated only by environment variables.

API Endpoints

Sieves (Authenticated)

  • POST /api/sieves — Create a new sieve
  • GET /api/sieves — List your sieves
  • GET /api/sieves/:id — Get sieve details
  • DELETE /api/sieves/:id — Delete a sieve
  • POST /api/sieves/:id/start — Start monitoring
  • POST /api/sieves/:id/stop — Stop monitoring
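
The routes above map onto a small request builder like the one sketched here. Only the methods and paths come from the list; `buildSieveRequest`, `SieveAction`, and `RequestSpec` are hypothetical names. The request body for creating a sieve and the way authentication credentials are attached are not documented in this README, so both are left out.

```typescript
type SieveAction = "create" | "list" | "get" | "delete" | "start" | "stop";

interface RequestSpec {
  method: "GET" | "POST" | "DELETE";
  url: string;
}

// Hypothetical helper: maps an action to the HTTP method and URL listed above.
function buildSieveRequest(
  baseUrl: string,
  action: SieveAction,
  id?: string,
): RequestSpec {
  // Drop trailing slashes so the joined URL has exactly one separator.
  const root = `${baseUrl.replace(/\/+$/, "")}/api/sieves`;
  if (action === "create") return { method: "POST", url: root };
  if (action === "list") return { method: "GET", url: root };

  // Every remaining action targets a specific sieve.
  if (!id) {
    throw new Error(`Action "${action}" requires a sieve id`);
  }
  const item = `${root}/${encodeURIComponent(id)}`;
  const suffix = action === "start" || action === "stop" ? `/${action}` : "";
  const method =
    action === "get" ? "GET" : action === "delete" ? "DELETE" : "POST";
  return { method, url: `${item}${suffix}` };
}
```

The returned `method` and `url` can be passed to `fetch` or any other HTTP client, with credentials added according to however the deployment's auth is configured.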

Public Streams (No Auth)

  • GET /api/streams/:sieveId/alerts — SSE stream of alerts
  • GET /api/streams/:sieveId/summaries — SSE stream of AI summaries
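
In a browser, the built-in `EventSource` can consume these endpoints directly. For clients that read the response body themselves (for example via a streaming `fetch`), the sketch below shows how SSE text can be split into events following the standard `event:` / `data:` framing. `parseSseBuffer` and `SseEvent` are hypothetical names, and the shape of Sieve's alert payloads is not documented here, so `data` is returned as a raw string. Only `\n` and `\r\n` line endings are handled.

```typescript
interface SseEvent {
  event: string; // event name, "message" when the stream omits it
  data: string;  // data lines joined with "\n"
}

// Splits buffered SSE text into complete events plus any unfinished remainder.
function parseSseBuffer(buffer: string): { events: SseEvent[]; rest: string } {
  const blocks = buffer.replace(/\r\n/g, "\n").split("\n\n");
  // The last block has no terminating blank line yet, so keep it for later.
  const rest = blocks.pop() ?? "";
  const events: SseEvent[] = [];

  for (const block of blocks) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      // Skip empty lines and comment lines (often used as keep-alives).
      if (line === "" || line.startsWith(":")) continue;
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one space
      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }
    // Blocks without data lines do not produce an event.
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}
```

A caller would append each received chunk to `rest` from the previous call and parse again, so events split across network chunks are reassembled before being handled.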

Data Sources

  • GET /api/sources — List preconfigured sources
  • POST /api/sources/schema — Infer schema from a URL
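
The README does not describe how schema inference works. As a rough illustration of the idea only, the sketch below derives a flat field-to-type map from one parsed JSON sample. `inferFlatSchema` and `FieldType` are hypothetical names, and the real service may sample several messages, handle nested fields, or map to Flink SQL types instead.

```typescript
type FieldType = "string" | "number" | "boolean" | "null" | "array" | "object";

// Hypothetical helper: classifies each top-level field of a parsed JSON object.
function inferFlatSchema(
  sample: Record<string, unknown>,
): Record<string, FieldType> {
  const schema: Record<string, FieldType> = {};
  for (const [key, value] of Object.entries(sample)) {
    if (value === null) schema[key] = "null";
    else if (Array.isArray(value)) schema[key] = "array";
    else if (typeof value === "string") schema[key] = "string";
    else if (typeof value === "number") schema[key] = "number";
    else if (typeof value === "boolean") schema[key] = "boolean";
    else schema[key] = "object"; // remaining JSON values are plain objects
  }
  return schema;
}
```

The sample is assumed to come from `JSON.parse`, so values such as `undefined` or functions are not expected.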

Project Structure

sieve/
├── backend/
│   └── src/
│       ├── api/           # API mode modules
│       │   ├── sieves/    # Sieve CRUD
│       │   ├── streams/   # Public SSE endpoints
│       │   ├── alerts/    # WebSocket gateway
│       │   └── auth/      # Better Auth integration
│       ├── worker/        # Worker mode modules
│       │   ├── coordinator/   # Redis-based distribution
│       │   ├── connections/   # Data source connections
│       │   └── kafka/         # Kafka producer
│       ├── shared/        # Shared services
│       │   ├── adapters/      # WebSocket/SSE/Poll adapters
│       │   ├── handlers/      # Source-specific handlers
│       │   └── services/      # Schema inference, registry
│       └── database/      # Drizzle schema & config
├── frontend/
│   └── src/
│       └── app/           # Next.js app router
└── README.md
