Merged
5 changes: 4 additions & 1 deletion CONTRIBUTING.md
@@ -33,9 +33,12 @@ cp .env.example .env
4. Start Docker services (PostgreSQL, Redis, ClickHouse):

```bash
-docker-compose up -d
+docker compose up -d
```

> **Note:** This starts the **development** infrastructure only (`docker-compose.yaml`).
> For self-hosting with all application services, use `docker compose -f docker-compose.selfhost.yml up -d` instead — see the [Self-Hosting section](README.md#-self-hosting) in the README.

5. Set up the database:

```bash
32 changes: 32 additions & 0 deletions README.md
@@ -61,6 +61,38 @@ A comprehensive analytics and data management platform built with Next.js, TypeS
- Bun 1.3.4+
- Node.js 20+

## 🏠 Self-Hosting

Databuddy can be self-hosted using Docker Compose. The repo includes two compose files:

| File | Purpose |
|---|---|
| `docker-compose.yaml` | **Development only** — starts infrastructure (Postgres, ClickHouse, Redis) for local dev |
| `docker-compose.selfhost.yml` | **Production / self-hosting** — full stack with all application services from GHCR images |

### Quick Start

```bash
# 1. Configure environment
cp .env.example .env
# Edit .env — at minimum set BETTER_AUTH_SECRET and BETTER_AUTH_URL

# 2. Start everything
docker compose -f docker-compose.selfhost.yml up -d

# 3. Initialize databases (first run only)
docker compose -f docker-compose.selfhost.yml exec api bun run db:push
docker compose -f docker-compose.selfhost.yml exec api bun run clickhouse:init
```

Services started:
- **API** → `localhost:3001`
- **Basket** (event ingestion) → `localhost:4000`
- **Links** (short links) → `localhost:2500`
- **Uptime** monitoring (optional): uncomment it in the compose file and set the QStash keys.

All ports are configurable via env vars (`API_PORT`, `BASKET_PORT`, etc.). See the compose file comments for the full env var reference.
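Compose resolves these with the shell's `${VAR:-default}` substitution rule, so a value from the environment or `.env` wins over the built-in default. A quick illustration (`8080` is an arbitrary example value):

```bash
# Compose expands ${API_PORT:-3001} with POSIX-style ':-' defaulting,
# so an unset or empty variable falls back to the default:
unset API_PORT
echo "port=${API_PORT:-3001}"    # prints port=3001
API_PORT=8080
echo "port=${API_PORT:-3001}"    # prints port=8080
```

For a one-off override without editing `.env`, the variable can be set inline: `API_PORT=8080 docker compose -f docker-compose.selfhost.yml up -d`.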

## 🤝 Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
205 changes: 205 additions & 0 deletions docker-compose.selfhost.yml
@@ -0,0 +1,205 @@
# ─────────────────────────────────────────────────────────────────────
# Databuddy · Self-Hosting Docker Compose
# ─────────────────────────────────────────────────────────────────────
#
# Usage:
# 1. Copy .env.example → .env and fill in your values
# ⚠ IMPORTANT: Change DB_PASSWORD, REDIS_PASSWORD, and
# CLICKHOUSE_PASSWORD before deploying to production!
# 2. docker compose -f docker-compose.selfhost.yml up -d
# 3. Initialize databases (first run only):
# docker compose -f docker-compose.selfhost.yml exec api bun run db:push
# docker compose -f docker-compose.selfhost.yml exec api bun run clickhouse:init
#
# Images: ghcr.io/databuddy-analytics/databuddy-{api,basket,links,uptime}
# ─────────────────────────────────────────────────────────────────────

services:

# ── Infrastructure ───────────────────────────────────────────────────
# Ports are bound to 127.0.0.1 (localhost only) for security.
# App services reach them via the internal Docker network.
# Remove the 127.0.0.1 prefix if you need external access.

postgres:
image: postgres:17-alpine
container_name: databuddy-postgres
environment:
POSTGRES_DB: databuddy
POSTGRES_USER: databuddy
POSTGRES_PASSWORD: ${DB_PASSWORD:-changeme}
ports:
- "127.0.0.1:${POSTGRES_PORT:-5432}:5432"
volumes:
- postgres_data:/var/lib/postgresql/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -U databuddy -d databuddy"]
interval: 10s
timeout: 5s
retries: 5
restart: unless-stopped
networks:
- databuddy

clickhouse:
image: clickhouse/clickhouse-server:25.5.1-alpine
container_name: databuddy-clickhouse
environment:
CLICKHOUSE_DB: databuddy_analytics
CLICKHOUSE_USER: default
CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD:-changeme}
CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT: 1
ports:
- "127.0.0.1:${CLICKHOUSE_PORT:-8123}:8123"
volumes:
- clickhouse_data:/var/lib/clickhouse
ulimits:
nofile:
soft: 262144
hard: 262144
healthcheck:
P1 Missing scripts/clickhouse-init.sql breaks ClickHouse init mount

The volume ./scripts/clickhouse-init.sql:/docker-entrypoint-initdb.d/clickhouse-init.sql references a file that does not exist in the repository (confirmed: no scripts/ directory is tracked in git). When Docker encounters a bind-mount where the host path is missing, it creates an empty directory at that path instead of a file. ClickHouse's init entrypoint then sees a directory named clickhouse-init.sql rather than a SQL file and silently skips or errors on it.

The same line exists in docker-compose.yaml (dev), so this appears to be a copy from there without the file ever being committed. Since the README correctly tells users to run bun run clickhouse:init via the API for first-run initialization, the cleanest fix is to remove this volume bind-mount from the selfhost compose to avoid the misleading/broken entry:

Suggested change: drop the `- ./scripts/clickhouse-init.sql:/docker-entrypoint-initdb.d/clickhouse-init.sql` entry from the clickhouse `volumes:` block, keeping only `- clickhouse_data:/var/lib/clickhouse`.

If the intent is to seed ClickHouse automatically on first start, the SQL file needs to be created and committed to scripts/clickhouse-init.sql.
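Until that file exists (or the mount is removed), a pre-flight check along these lines can catch the problem before `up`. A sketch with a hypothetical helper, assuming the repo root as working directory:

```bash
# A missing bind-mount source makes Docker create an empty *directory* at
# that host path, so warn before starting the stack.
check_init_sql() {
  # Returns 0 only when the init SQL exists as a regular file.
  [ -f "${1:-scripts/clickhouse-init.sql}" ]
}

check_init_sql || echo "init SQL missing: remove the bind-mount or commit scripts/clickhouse-init.sql" >&2
```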

test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8123/ping"]
interval: 10s
timeout: 5s
retries: 5
restart: unless-stopped
networks:
- databuddy

redis:
Comment on lines +67 to +69
P1 Redis exposed to the host without authentication

The Redis service publishes port 6379 to the host (0.0.0.0:6379) with no --requirepass or ACL configuration. In a production self-hosted deployment where the host has a public IP (e.g., a VPS), this means Redis is accessible to the internet without credentials.

Consider either:

1. Removing the ports mapping for Redis (it only needs to be reachable inside the Docker network by the app services), or
2. Adding a password via --requirepass ${REDIS_PASSWORD} and updating all REDIS_URL env vars to include the password.

If the port is needed for local debug access, document the security risk clearly in the compose file comments.
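Option 2 is what the compose file below ends up doing: the password is embedded in the connection URL using the `redis://:<password>@host:port` form. A sketch (`changeme` is a placeholder, not a real credential):

```bash
# The password from .env is injected into the URL that every app
# service receives as REDIS_URL:
REDIS_PASSWORD="changeme"
REDIS_URL="redis://:${REDIS_PASSWORD}@redis:6379"
echo "$REDIS_URL"    # prints redis://:changeme@redis:6379
```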

image: redis:7-alpine
container_name: databuddy-redis
ports:
- "127.0.0.1:${REDIS_PORT:-6379}:6379"
volumes:
- redis_data:/data
command: >
redis-server
--appendonly yes
--maxmemory 512mb
--maxmemory-policy noeviction
--requirepass ${REDIS_PASSWORD:-changeme}
healthcheck:
test: ["CMD", "redis-cli", "-a", "${REDIS_PASSWORD:-changeme}", "ping"]
interval: 10s
timeout: 5s
retries: 5
restart: unless-stopped
networks:
- databuddy

# ── Application Services ─────────────────────────────────────────────
#
# Note: api uses bun:slim, basket/links/uptime use distroless images with
# no shell, wget, or curl. Container-level healthchecks are omitted for
# app services. Monitor /health endpoints externally (reverse proxy, etc.).
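  #
  # An external probe could look like this (a sketch; the /health paths come
  # from the note above, and the ports assume the defaults below):
  #   curl -fsS http://localhost:3001/health   # api
  #   curl -fsS http://localhost:4000/health   # basket
  #   curl -fsS http://localhost:2500/health   # links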

api:
image: ghcr.io/databuddy-analytics/databuddy-api:latest
container_name: databuddy-api
ports:
- "${API_PORT:-3001}:3001"
environment:
NODE_ENV: production
PORT: "3001"
DATABASE_URL: postgres://databuddy:${DB_PASSWORD:-changeme}@postgres:5432/databuddy
REDIS_URL: redis://:${REDIS_PASSWORD:-changeme}@redis:6379
CLICKHOUSE_URL: http://default:${CLICKHOUSE_PASSWORD:-changeme}@clickhouse:8123/databuddy_analytics
BETTER_AUTH_URL: ${BETTER_AUTH_URL:?Set BETTER_AUTH_URL to your dashboard public URL}
BETTER_AUTH_SECRET: ${BETTER_AUTH_SECRET:?Set BETTER_AUTH_SECRET (openssl rand -base64 32)}
DASHBOARD_URL: ${DASHBOARD_URL:-}
AI_API_KEY: ${AI_API_KEY:-}
RESEND_API_KEY: ${RESEND_API_KEY:-}
depends_on:
postgres:
condition: service_healthy
clickhouse:
condition: service_healthy
redis:
condition: service_healthy
restart: unless-stopped
networks:
- databuddy

basket:
image: ghcr.io/databuddy-analytics/databuddy-basket:latest
container_name: databuddy-basket
ports:
- "${BASKET_PORT:-4000}:4000"
environment:
NODE_ENV: production
PORT: "4000"
DATABASE_URL: postgres://databuddy:${DB_PASSWORD:-changeme}@postgres:5432/databuddy
REDIS_URL: redis://:${REDIS_PASSWORD:-changeme}@redis:6379
CLICKHOUSE_URL: http://default:${CLICKHOUSE_PASSWORD:-changeme}@clickhouse:8123/databuddy_analytics
# SELFHOST=true → basket writes directly to ClickHouse (no Kafka/Redpanda needed)
SELFHOST: "true"
depends_on:
postgres:
condition: service_healthy
clickhouse:
Comment on lines +135 to +140
P1 basket service missing CLICKHOUSE_USER and CLICKHOUSE_PASSWORD env vars

The api service explicitly sets both CLICKHOUSE_USER and CLICKHOUSE_PASSWORD as separate env vars (in addition to embedding them in CLICKHOUSE_URL). The basket service only sets CLICKHOUSE_URL and omits these individual vars. If basket's ClickHouse client reads the credentials from individual env vars (as many Node ClickHouse clients can), it will fall back to unauthenticated access or use incorrect credentials once a CLICKHOUSE_PASSWORD is set.

For consistency with the api service, the basket service should also declare CLICKHOUSE_USER and CLICKHOUSE_PASSWORD as separate environment variables, mirroring the pattern in the api service block.
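Concretely, the review is suggesting something like this in the basket `environment:` block (a sketch mirroring the pattern the review describes; the values reuse the same variables already embedded in `CLICKHOUSE_URL`):

```yaml
    environment:
      # ...existing vars unchanged...
      CLICKHOUSE_URL: http://default:${CLICKHOUSE_PASSWORD:-changeme}@clickhouse:8123/databuddy_analytics
      CLICKHOUSE_USER: default
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD:-changeme}
```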

condition: service_healthy
redis:
condition: service_healthy
restart: unless-stopped
networks:
- databuddy

# Note: links service hardcodes port 2500 internally (not configurable via env var)
links:
image: ghcr.io/databuddy-analytics/databuddy-links:latest
container_name: databuddy-links
ports:
- "${LINKS_PORT:-2500}:2500"
environment:
NODE_ENV: production
DATABASE_URL: postgres://databuddy:${DB_PASSWORD:-changeme}@postgres:5432/databuddy
REDIS_URL: redis://:${REDIS_PASSWORD:-changeme}@redis:6379
# APP_URL: public URL of your dashboard — used for expired/not-found link redirect pages
APP_URL: ${APP_URL:?Set APP_URL to your dashboard public URL (e.g. https://app.example.com)}
LINKS_ROOT_REDIRECT_URL: ${LINKS_ROOT_REDIRECT_URL:-https://databuddy.cc}
# GEOIP_DB_URL: fetches MaxMind GeoLite2-City DB on startup from this URL.
# Defaults to cdn.databuddy.cc — override with your own hosted copy to avoid the external dependency.
GEOIP_DB_URL: ${GEOIP_DB_URL:-https://cdn.databuddy.cc/mmdb/GeoLite2-City.mmdb}
depends_on:
postgres:
condition: service_healthy
redis:
condition: service_healthy
restart: unless-stopped
networks:
- databuddy

# ── Optional: Uptime Monitoring ──────────────────────────────────────
# Requires Upstash QStash. Uncomment and set QSTASH keys to enable.
# Port mapped to 4001 externally to avoid conflict with basket (both use 4000 internally).
#
# uptime:
# image: ghcr.io/databuddy-analytics/databuddy-uptime:latest
# container_name: databuddy-uptime
# ports:
# - "${UPTIME_PORT:-4001}:4000"
# environment:
# NODE_ENV: production
# DATABASE_URL: postgres://databuddy:${DB_PASSWORD:-changeme}@postgres:5432/databuddy
# REDIS_URL: redis://:${REDIS_PASSWORD:-changeme}@redis:6379
# QSTASH_CURRENT_SIGNING_KEY: ${QSTASH_CURRENT_SIGNING_KEY}
# QSTASH_NEXT_SIGNING_KEY: ${QSTASH_NEXT_SIGNING_KEY}
# RESEND_API_KEY: ${RESEND_API_KEY:-}
# depends_on:
# postgres:
# condition: service_healthy
# redis:
# condition: service_healthy
# restart: unless-stopped
# networks:
# - databuddy

volumes:
postgres_data:
clickhouse_data:
redis_data:

networks:
databuddy:
driver: bridge