Databases — Redis HA Sentinel + CloudNativePG PostgreSQL

This document covers the two stateful data stores deployed in the lumen airgap cluster.


Redis HA Sentinel

Architecture

lumen namespace
  ├── redis-master-0      (node-1, PVC 1Gi)   ← writes
  ├── redis-replica-0     (node-2)            ← reads
  └── redis-sentinel-{0,1,2}                  ← leader election (quorum=2)

  • 1 master + 1 replica + 3 sentinels — tolerates 1 node failure
  • Sentinels use quorum=2 to elect a new master if the current one is unreachable
  • lumen-api connects via go-redis/v9 NewFailoverClient — discovers the master automatically via the sentinels

Connection (lumen-api)

REDIS_MODE=sentinel
REDIS_SENTINEL_ADDRS=redis-sentinel.lumen.svc.cluster.local:26379
REDIS_MASTER_NAME=mymaster

Failover behaviour

master pod dies
  → sentinels detect unavailability (within ~5s)
  → quorum reached (2/3 sentinels agree)
  → replica promoted to master
  → lumen-api reconnects automatically via sentinel
  → downtime < 10 seconds

Manifests


CloudNativePG — PostgreSQL Cluster

Why CloudNativePG?

CNPG is a Kubernetes-native operator (CNCF sandbox) that manages PostgreSQL clusters as CRDs. Key advantages over a plain StatefulSet:

  • Automatic master election and failover (operator-driven quorum voting)
  • Two distinct services: -rw (master only) and -ro (replicas) for read/write splitting
  • Auto-generated credentials in a Secret (lumen-db-app)
  • Prometheus metrics via a manually managed PodMonitor (port 9187)
  • WAL archiving support for backups

Architecture

lumen namespace
  ├── lumen-db-N   (master  — lumen-db-rw → port 5432)
  ├── lumen-db-N   (replica — lumen-db-ro → port 5432)
  └── lumen-db-N   (witness — vote only, no data)

cnpg-system namespace
  └── cnpg-controller-manager   (operator)

Pod names use dynamic numbering (e.g. lumen-db-4, lumen-db-5). After a failover or PVC recreation the index increments — use kubectl get pods -n lumen -l cnpg.io/cluster=lumen-db to find the current primary.

Quorum:

3 instances → quorum = 2 → tolerates 1 failure
master fails → replica + witness vote → replica promoted → downtime < 30s

Why 1 witness instead of 3 full replicas?

With only 2 nodes (master + replica), if the master becomes unreachable, the replica can't know if it's truly dead or just partitioned — risk of split-brain (both claim to be master → data corruption). The witness is a lightweight 3rd voter (~50MB RAM, no data) that breaks the tie safely.

Services

Service      Target          Usage
lumen-db-rw  Current master  Writes (INSERT, UPDATE, DELETE)
lumen-db-ro  Replicas        Reads (SELECT)
lumen-db-r   Any instance    Internal CNPG use

Read/Write Splitting (lumen-api)

lumen-api maintains two separate connection pools:

// store/postgres.go
package store

import "github.com/jackc/pgx/v5/pgxpool" // pgx v5 import path assumed

// PostgresStore keeps one pool per CNPG service.
type PostgresStore struct {
    rw *pgxpool.Pool // → lumen-db-rw:5432 (writes)
    ro *pgxpool.Pool // → lumen-db-ro:5432 (reads)
}

Routes:

  • POST /items, DELETE /items/{id} → rw pool
  • GET /items, GET /items/{id} → ro pool
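One way to express that routing rule, assuming only GETs may hit the replicas (illustrative helper; the real handler wiring in lumen-api may differ):

```go
package main

import "fmt"

// poolFor maps an HTTP method to the pool that should serve it:
// reads go to the replicas, everything else to the current master.
// Hypothetical helper for illustration.
func poolFor(method string) string {
	if method == "GET" {
		return "ro" // lumen-db-ro (replicas)
	}
	return "rw" // lumen-db-rw (current master)
}

func main() {
	for _, m := range []string{"GET", "POST", "DELETE"} {
		fmt.Println(m, "→", poolFor(m))
	}
}
```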

Connection (lumen-api)

Credentials are auto-generated by CNPG in Secret lumen-db-app:

# Deployment env vars
- name: PG_RW_DSN
  value: "postgresql://$(PG_USER):$(PG_PASSWORD)@lumen-db-rw.lumen.svc.cluster.local:5432/$(PG_DBNAME)"
- name: PG_RO_DSN
  value: "postgresql://$(PG_USER):$(PG_PASSWORD)@lumen-db-ro.lumen.svc.cluster.local:5432/$(PG_DBNAME)"

Failover behaviour

lumen-db-N (master) pod deleted
  → CNPG operator detects loss
  → replica + witness vote (quorum=2 ✅)
  → replica promoted to master
  → lumen-db-rw service endpoint updated automatically
  → lumen-api reconnects (pgxpool retries)
  → downtime < 30 seconds

Tested: deleting the primary pod triggers automatic promotion in ~30s.

Airgap deployment

Images used (from internal registry 192.168.2.2:5000):

  • 192.168.2.2:5000/cloudnative-pg:1.25.1 — operator
  • 192.168.2.2:5000/postgresql:16.6 — PostgreSQL instances

Install script: install-cnpg.sh

Known constraint with kube-router (k3s built-in NetworkPolicy controller): CNPG instance manager must reach the K8s API server (10.43.0.1:443 ClusterIP). kube-router does not evaluate ipBlock NetworkPolicy rules for ClusterIP destinations (traffic is rewritten by iptables DNAT before NetworkPolicy is evaluated). The workaround is to allow all TCP egress for CNPG pods in the allow-cnpg-intracluster NetworkPolicy.
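The workaround rule might look like the following (field values are assumptions reconstructed from the policy name and pod labels mentioned in this document):

```yaml
# allow-cnpg-intracluster (sketch): allow all TCP egress for CNPG pods,
# since kube-router cannot match the API server ClusterIP via ipBlock.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-cnpg-intracluster
  namespace: lumen
spec:
  podSelector:
    matchLabels:
      cnpg.io/cluster: lumen-db
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: TCP   # all TCP ports, all destinations
```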

namespaceSelector with kube-router: Use matchLabels rather than matchExpressions in namespaceSelector rules. kube-router (embedded in k3s) has known issues where matchExpressions with operator: In produces unexpected iptables ipsets. matchLabels with kubernetes.io/metadata.name is reliable and sufficient for single-namespace selection.
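For example, a rule admitting traffic from a monitoring namespace would be written as (illustrative snippet):

```yaml
# Reliable under kube-router: matchLabels on the automatic
# kubernetes.io/metadata.name label, not matchExpressions.
ingress:
  - from:
      - namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: monitoring
```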

Monitoring

CNPG exposes Prometheus metrics on port 9187 (/metrics) via the built-in pg_exporter. Key metrics:

Metric                           Description
cnpg_pg_database_size_bytes      Database size per instance
cnpg_backends_total              Active connections
cnpg_pg_replication_in_recovery  Whether instance is a replica
cnpg_pg_replication_lag          Replication lag in seconds
cnpg_pg_database_xid_age         XID age (wraparound risk)

PodMonitor: cnpg-lumen-db in the lumen namespace with label release: kube-prometheus-stack. The enablePodMonitor field in the Cluster spec is disabled — it is deprecated in CNPG v1.25+ and the operator-created PodMonitor lacks the label required by the Prometheus CR selector.
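The manually managed PodMonitor therefore looks roughly like this (selector and port name reconstructed from the description above, not copied from the manifest):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: cnpg-lumen-db
  namespace: lumen
  labels:
    release: kube-prometheus-stack   # required by the Prometheus CR selector
spec:
  selector:
    matchLabels:
      cnpg.io/cluster: lumen-db
  podMetricsEndpoints:
    - port: metrics                  # port 9187, scraped at /metrics
```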

Grafana dashboard: "CloudNativePG" (ID 20417) — loaded automatically by the Grafana sidecar from the cnpg-grafana-dashboard ConfigMap.

Applying the dashboard ConfigMap (too large for client-side kubectl apply, whose last-applied-configuration annotation would exceed the 256 KB annotation size limit):

kubectl apply --server-side -f 03-airgap-zone/manifests/cnpg/05-grafana-dashboard.yaml

Manifests

Useful commands

# Cluster status
kubectl get cluster -n lumen

# Pod distribution
kubectl get pods -n lumen -l cnpg.io/cluster=lumen-db -o wide

# Services
kubectl get svc -n lumen | grep lumen-db

# Credentials
kubectl get secret lumen-db-app -n lumen -o jsonpath='{.data.uri}' | base64 -d

# Connect directly (debug) — replace N with current primary index
kubectl exec -it lumen-db-N -n lumen -- psql -U app app

# Find current primary
kubectl get cluster lumen-db -n lumen -o jsonpath='{.status.currentPrimary}'

# Simulate failover
kubectl delete pod $(kubectl get cluster lumen-db -n lumen -o jsonpath='{.status.currentPrimary}') -n lumen
# Watch: kubectl get cluster -n lumen -w

# Check Prometheus targets (port 9187)
kubectl exec -n monitoring prometheus-kube-prometheus-stack-prometheus-0 -c prometheus -- \
  wget -qO- "http://localhost:9090/api/v1/targets" | python3 -c "
import sys,json; d=json.load(sys.stdin)
[print(t['scrapeUrl'], t['health']) for t in d['data']['activeTargets'] if '9187' in t.get('scrapeUrl','')]
"