
atnine-guard

DeFi Portfolio Guardian — real-time monitoring, risk alerting, and decision orchestration for multi-chain DeFi positions.


System Overview

atnine-guard is a Node.js backend system that continuously monitors DeFi portfolio positions across multiple blockchains, evaluates configurable risk rules against live on-chain data, and dispatches alerts to stakeholders via Telegram. It supports multi-tenant isolation through a file-based vault and coordinates work across three independent processes using Redis.

┌──────────────────────────────────────────────────────────────────┐
│                        atnine-guard                              │
│                                                                  │
│   ┌───────────┐     ┌─────────────┐     ┌──────────────┐        │
│   │    API    │     │  Scheduler  │     │    Worker    │        │
│   │ (Fastify) │     │  (5s tick)  │     │  (job loop)  │        │
│   └─────┬─────┘     └──────┬──────┘     └──────┬───────┘        │
│         │                  │                    │                │
│         │          ┌───────▼────────┐           │                │
│         │          │     Redis      │◄──────────┘                │
│         │          │  (job queue +  │                            │
│         │          │   locks +      │                            │
│         │          │   state)       │                            │
│         │          └────────────────┘                            │
│         │                                                        │
│         ▼                                                        │
│   ┌─────────────────────────────────────────┐                    │
│   │              Vault (filesystem)          │                    │
│   │  vault/tenants/TEN-*/   vault/public/    │                    │
│   └─────────────────────────────────────────┘                    │
│                                                                  │
│   ┌──────────┐  ┌──────────┐  ┌───────────┐  ┌──────────┐      │
│   │ Ethereum │  │   BSC    │  │  Osmosis  │  │ Zigchain │ ...   │
│   └──────────┘  └──────────┘  └───────────┘  └──────────┘      │
└──────────────────────────────────────────────────────────────────┘

Architecture

Process Model

The system runs as three independent OS processes that communicate exclusively through Redis. Each can be scaled, restarted, or deployed independently.

graph LR
    subgraph Processes
        API["API Server<br/><i>Fastify · port 3000</i>"]
        SCH["Scheduler<br/><i>5s tick loop</i>"]
        WRK["Worker Pool<br/><i>blocking dequeue</i>"]
    end

    subgraph Redis
        Q["q:jobs<br/>(normal queue)"]
        PQ["q:jobs:priority<br/>(critical queue)"]
        DLQ["q:jobs:dlq<br/>(dead-letter)"]
        LOCK["lock:*<br/>(distributed locks)"]
        STATE["refresh:ts:*<br/>delta:hash:*<br/>heartbeat:*"]
    end

    subgraph Storage
        VAULT["Vault<br/>(filesystem)"]
    end

    SCH -- "enqueue jobs" --> Q
    SCH -- "enqueue critical" --> PQ
    SCH -- "write state" --> STATE
    WRK -- "dequeue (BLPOP)" --> Q
    WRK -- "dequeue (LPOP)" --> PQ
    WRK -- "failed jobs" --> DLQ
    WRK -- "acquire/release" --> LOCK
    WRK -- "read/write positions,<br/>incidents, decisions" --> VAULT
    API -- "read" --> VAULT
    API -- "cache reads" --> STATE
| Process | Entry point | Role |
| --- | --- | --- |
| API Server | src/api/server.js | Fastify v5 HTTP server. Serves portfolio data, incidents, decisions, and playbooks. Exposes /health and /metrics (Prometheus). All data routes require Telegram-based authentication. |
| Scheduler | src/scheduler/main.js | Ticks every 5 seconds. Scans tenant directories, determines which tenants are due for a refresh based on tier cadence, and enqueues jobs. Manages the CEX mid-price refresh (every 60s). |
| Worker Pool | src/workers/main.js | Blocking dequeue loop. Processes six job types under distributed locks with heartbeat-extended TTLs. Retries transient errors up to 3 times; sends fatal failures to the dead-letter queue. |
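
The priority-first dequeue can be sketched as follows; a minimal sketch using ioredis, with handler wiring omitted (the real loop in src/workers/main.js also manages locks, heartbeats, and retries):

```js
const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL || 'redis://127.0.0.1:6379');

async function dequeueLoop(handleJob) {
  for (;;) {
    // Critical jobs first: non-blocking LPOP on the priority queue.
    let raw = await redis.lpop('q:jobs:priority');
    if (!raw) {
      // Otherwise block up to 5s on the normal queue.
      const res = await redis.blpop('q:jobs', 5);
      raw = res && res[1]; // BLPOP returns [key, value], or null on timeout
    }
    if (!raw) continue; // timeout: loop again
    await handleJob(JSON.parse(raw));
  }
}
```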

Data Flow

sequenceDiagram
    participant S as Scheduler
    participant R as Redis Queue
    participant W as Worker
    participant C as Blockchain RPCs
    participant V as Vault (filesystem)
    participant T as Telegram

    loop Every 5 seconds
        S->>S: Scan vault/tenants/ for TEN-* dirs
        S->>S: Compute tenant config hash (SHA-256)
        S->>R: Compare delta hash + check cadence
        alt Tenant is due OR config changed
            S->>R: Enqueue SYNC_BALANCES
            S->>R: Enqueue SYNC_PROTOCOL_POSITIONS
            S->>R: Enqueue EVAL_RULES
        end
    end

    loop Every 60 seconds
        S->>R: Enqueue FETCH_CEX_MIDS
    end

    loop Worker dequeue loop
        W->>R: LPOP q:jobs:priority, then BLPOP q:jobs (5s timeout)
        R-->>W: Job payload
        W->>R: Acquire lock (SET NX PX 60000)

        alt SYNC_BALANCES / SYNC_PROTOCOL_POSITIONS
            W->>C: Fetch on-chain data via adapters
            C-->>W: Balances / positions
            W->>V: Write vault/.../20_positions/latest.json
        end

        alt EVAL_RULES
            W->>V: Read positions + CEX mids
            W->>W: Run rule evaluators (drift, out-of-range, oracle, RPC health)
            W->>V: Append incidents to 30_alerts/incidents/{date}.md
            alt Urgent incident detected
                W->>R: Enqueue DISPATCH_ALERTS (critical priority)
            end
        end

        alt DISPATCH_ALERTS
            W->>V: Read tenant members
            W->>T: Send Telegram notifications (rate-limited, deduped)
        end

        alt FETCH_CEX_MIDS
            W->>C: Fetch reference prices from CEX APIs
            W->>V: Write vault/public/market/cex_mids_latest.json
        end

        W->>R: Release lock (Lua atomic compare-and-delete)
    end
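
Delta detection in the scheduler reduces to hashing each tenant's config and comparing the result against Redis state. A minimal sketch with illustrative helper names (the actual logic lives in src/scheduler/main.js):

```js
const crypto = require('node:crypto');

// SHA-256 over the serialized tenant config, as in the sequence above
function configHash(tenantConfig) {
  return crypto.createHash('sha256').update(JSON.stringify(tenantConfig)).digest('hex');
}

async function isDue(redis, tenantId, tenantConfig, cadenceMs) {
  const [prevHash, lastTs] = await redis.mget(
    `delta:hash:${tenantId}`,
    `refresh:ts:${tenantId}`
  );
  const hash = configHash(tenantConfig);
  const changed = prevHash !== hash;
  const elapsed = !lastTs || Date.now() - Number(lastTs) >= cadenceMs;
  if (!changed && !elapsed) return false;
  await redis.mset(
    `delta:hash:${tenantId}`, hash,
    `refresh:ts:${tenantId}`, String(Date.now())
  );
  return true; // caller enqueues SYNC_BALANCES, SYNC_PROTOCOL_POSITIONS, EVAL_RULES
}
```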

Redis Communication Layer

Redis serves four roles. No other inter-process communication mechanism is used.

| Role | Keys | Mechanism |
| --- | --- | --- |
| Job queue | q:jobs (normal), q:jobs:priority (critical), q:jobs:dlq (dead-letter) | Redis lists. Workers dequeue via BLPOP with a 5s timeout; the priority queue is checked first via non-blocking LPOP. |
| Distributed locks | lock:{scope} | SET key uuid PX ttl NX. Released atomically via a Lua script that checks ownership. Workers extend the lock TTL every 10s via heartbeat (PEXPIRE). Max job runtime is enforced at 360s. |
| Scheduling state | refresh:ts:{tenantId}, delta:hash:{tenantId}, refresh:ts:cex_mids | Tracks last refresh timestamps and config hashes for delta detection. |
| Heartbeats | heartbeat:scheduler, heartbeat:worker:{pid} | Written every tick/loop iteration with a 30s TTL. Absence signals a dead process. |

Connection config: Single ioredis client with automatic retry (exponential backoff, max 5s delay). URL from REDIS_URL env var (default redis://127.0.0.1:6379).
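
The lock lifecycle from the table above, sketched with ioredis (key names and parameters match the table; the withLock helper itself is illustrative):

```js
const { randomUUID } = require('node:crypto');

// Atomic compare-and-delete: only the owner's UUID may release the lock
const RELEASE_LUA = `
if redis.call('GET', KEYS[1]) == ARGV[1] then
  return redis.call('DEL', KEYS[1])
end
return 0`;

async function withLock(redis, scope, fn) {
  const key = `lock:${scope}`;
  const owner = randomUUID();
  // SET key uuid PX 60000 NX: acquire, or bail if another worker holds it
  const ok = await redis.set(key, owner, 'PX', 60000, 'NX');
  if (!ok) return null;

  // Heartbeat: extend the TTL every 10s while the job runs
  const hb = setInterval(() => redis.pexpire(key, 60000), 10000);
  try {
    return await fn();
  } finally {
    clearInterval(hb);
    await redis.eval(RELEASE_LUA, 1, key, owner);
  }
}
```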

Adapter Layer

Blockchain access is abstracted behind a registry of chain and protocol adapters. Adapters are lazy-loaded on first use and cached for the process lifetime.

graph TD
    REG["Adapter Registry<br/><code>src/adapters/registry.js</code>"]

    subgraph Chain Adapters
        EVM["EVM Adapter<br/><i>Ethereum, BSC, Base</i><br/>Multicall batching"]
        COSMOS["Cosmos Adapter<br/><i>Zigchain, Osmosis</i><br/>LCD REST API"]
    end

    subgraph Protocol Adapters
        UV2["UniV2<br/><i>DEX V2 LPs</i>"]
        UV3["UniV3<br/><i>Concentrated liquidity</i>"]
        OSCL["OsmosisCL<br/><i>Osmosis CL pools</i>"]
        MARS["Mars Protocol<br/><i>Cosmos lending</i>"]
        AAVE["Aave V3<br/><i>EVM lending</i>"]
    end

    subgraph Resilience
        FB["RPC Fallback Manager<br/><i>Primary → Secondary</i>"]
        CB["Circuit Breaker<br/><i>CLOSED → OPEN → HALF_OPEN</i>"]
    end

    REG --> EVM
    REG --> COSMOS
    REG --> UV2
    REG --> UV3
    REG --> OSCL
    REG --> MARS
    REG --> AAVE
    EVM --> FB
    COSMOS --> FB
    FB --> CB

RPC Fallback: Each chain has a primary and secondary RPC URL. The circuit breaker monitors failures per chain scope. When the failure count hits the threshold (default 5), the circuit opens and the fallback manager switches to the secondary endpoint. After a cooldown (60s), the circuit enters half-open state and tests the primary with up to 3 probe requests. Three consecutive successes close the circuit and restore the primary.
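
An illustrative version of that state machine (the shipped breaker lives in src/util/; this sketch uses the default parameters above and omits the cap of 3 probe requests in HALF_OPEN):

```js
class CircuitBreaker {
  constructor({ threshold = 5, cooldownMs = 60000, probeSuccesses = 3 } = {}) {
    Object.assign(this, { threshold, cooldownMs, probeSuccesses });
    this.state = 'CLOSED';
    this.failures = 0;
    this.successes = 0;
    this.openedAt = 0;
  }

  canRequest() {
    if (this.state === 'OPEN' && Date.now() - this.openedAt >= this.cooldownMs) {
      this.state = 'HALF_OPEN'; // cooldown elapsed: allow probes of the primary
      this.successes = 0;
    }
    return this.state !== 'OPEN';
  }

  onSuccess() {
    if (this.state === 'HALF_OPEN' && ++this.successes >= this.probeSuccesses) {
      this.state = 'CLOSED'; // primary restored
    }
    if (this.state === 'CLOSED') this.failures = 0;
  }

  onFailure() {
    if (this.state === 'HALF_OPEN' || ++this.failures >= this.threshold) {
      this.state = 'OPEN'; // reject calls; fallback switches to the secondary
      this.openedAt = Date.now();
    }
  }
}
```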

Cost Tracking: Every RPC call is counted per chain with a configurable costPerCall. Daily cost reports are written to vault/public/metrics/.

Rule Engine

Located in src/engine/rules/. Each rule evaluator receives the tenant's current positions and reference data, and returns an array of incidents.

graph LR
    POS["Positions<br/>(from vault)"] --> EVAL["Rule Evaluator"]
    CEX["CEX Mid-Prices<br/>(from vault/public)"] --> EVAL
    THR["Thresholds<br/>(per-tenant + defaults)"] --> EVAL

    EVAL --> R1["dexOutOfRange<br/><i>current_tick outside<br/>lower..upper range</i><br/>Severity: URGENT"]
    EVAL --> R2["dexDrift<br/><i>DEX price vs CEX<br/>reference deviation</i>"]
    EVAL --> R3["oracleDrift<br/><i>Stale or divergent<br/>price feeds</i>"]
    EVAL --> R4["rpcHealth<br/><i>RPC endpoint<br/>degradation</i>"]

    R1 --> INC["Incidents"]
    R2 --> INC
    R3 --> INC
    R4 --> INC

Thresholds (src/engine/thresholds.js): Per-tenant overrides stored in the vault. Falls back to system defaults when tenant-specific config is absent.
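
The fallback amounts to a shallow merge; a sketch with illustrative default values (the shipped defaults live in src/engine/thresholds.js):

```js
// Illustrative system defaults, not the shipped values
const DEFAULTS = { dexDriftPct: 1.5, oracleStaleSec: 300 };

function resolveThresholds(tenantOverrides = {}) {
  // Tenant-specific values win; anything unset falls back to defaults
  return { ...DEFAULTS, ...tenantOverrides };
}
```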

Playbooks (src/engine/playbooks.js): Ordered response steps triggered by incident types. Loaded from config/default-playbooks.json with per-tenant customization support. Playbook types:

  • DEX Out of Range Response
  • DEX Drift Response
  • Lending Health Low Response
  • Oracle Drift Response
  • RPC Degraded Response

Vault (Persistent Storage)

All persistent data is stored as JSON and Markdown files on the local filesystem. No external database is required.

vault/
├── tenants/
│   └── TEN-{tenantId}/
│       ├── 00_meta/
│       │   ├── tenant.json           # tenant_id, refresh_interval_sec, telegram_user_id
│       │   └── members.json          # team members with roles (admin/operator/viewer)
│       ├── 10_wallets/
│       │   └── wallets.json          # tracked wallet addresses per chain
│       ├── 20_positions/
│       │   ├── latest.json           # current positions (written by SYNC jobs)
│       │   └── posture/latest.json   # portfolio summary (NAV, risk score, chain breakdown)
│       ├── 30_alerts/
│       │   ├── incidents/{date}.md   # daily incident logs (append-only)
│       │   └── decisions/            # logged alert decisions
│       ├── 40_policies/              # risk policies
│       ├── 50_decisions/             # pending and executed decisions
│       │   └── latest.json
│       ├── 60_proposals/             # playbook proposals
│       ├── 70_reports/               # generated reports
│       └── 80_agent/                 # agent state
│
└── public/
    ├── market/
    │   └── cex_mids_latest.json      # latest CEX reference prices
    ├── health/                       # chain health metrics
    └── incidents/                    # public incident logs

Write safety: All JSON writes use atomic write (write to .tmp, then rename) to prevent corruption from crashes. Path traversal is blocked at the tenantPath() and publicPath() functions via null-byte detection, .. segment rejection, and resolve() boundary checks.
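
A minimal sketch of both safeguards (function names match the prose; the exact validation details are illustrative):

```js
const fs = require('node:fs/promises');
const path = require('node:path');

const VAULT_ROOT = path.resolve(process.env.VAULT_ROOT || './vault');

function tenantPath(tenantId, ...segments) {
  const joined = [tenantId, ...segments].join('/');
  if (joined.includes('\0')) throw new Error('null byte in path');        // null-byte detection
  if (joined.split('/').includes('..')) throw new Error('".." rejected'); // .. segment rejection
  const full = path.resolve(VAULT_ROOT, 'tenants', `TEN-${tenantId}`, ...segments);
  if (!full.startsWith(VAULT_ROOT + path.sep)) throw new Error('escapes vault'); // boundary check
  return full;
}

async function writeJsonAtomic(filePath, data) {
  const tmp = `${filePath}.tmp`;
  await fs.writeFile(tmp, JSON.stringify(data, null, 2));
  await fs.rename(tmp, filePath); // rename is atomic on POSIX, so readers never see a partial file
}
```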

Resilience Patterns

graph TD
    subgraph "Failure Handling Pipeline"
        JOB["Job Execution"] --> |success| DONE["Complete"]
        JOB --> |TransientError| RETRY{"attempts < 3?"}
        RETRY --> |yes| REQUEUE["Re-enqueue<br/>attempts + 1"]
        RETRY --> |no| DLQ["Dead-Letter Queue"]
        JOB --> |FatalError| DLQ
        JOB --> |timeout > 360s| DLQ
        JOB --> |unknown type| DLQ
    end

    subgraph "Circuit Breaker States"
        CLOSED["CLOSED<br/><i>all calls pass through</i>"]
        CLOSED --> |"failures ≥ 5"| OPEN["OPEN<br/><i>calls rejected immediately</i>"]
        OPEN --> |"60s cooldown"| HALF["HALF_OPEN<br/><i>probe requests (max 3)</i>"]
        HALF --> |"3 consecutive successes"| CLOSED
        HALF --> |"any failure"| OPEN
    end

    subgraph "Lock Safety"
        ACQ["Acquire Lock<br/><code>SET NX PX 60000</code>"]
        HB["Heartbeat<br/><code>PEXPIRE</code> every 10s"]
        REL["Release Lock<br/><i>Lua: compare owner UUID<br/>then DEL</i>"]
        ACQ --> HB --> REL
    end
| Pattern | Implementation | Parameters |
| --- | --- | --- |
| Retry with backoff | TransientError triggers re-enqueue | Max 3 attempts |
| Dead-letter queue | FatalError or max retries exceeded | Inspect via scripts/dlq.js |
| Circuit breaker | Per-chain scope, 3-state FSM | 5 failures to open, 60s reset, 3 successes to close |
| RPC failover | Primary/secondary endpoint switching | Triggered by circuit breaker OPEN state |
| Distributed locks | Redis SET NX with Lua release | 60s TTL, 10s heartbeat, 360s max runtime |
| Rate limiting | Token bucket (API + Telegram) | Configurable tokens/interval |
| Cold-start stagger | Scheduler delays initial enqueues | Max 50 RPC calls/sec on first tick |
| Jitter | ±20% randomization on cadence | Prevents thundering herd |

API Surface

Base URL: http://localhost:3000 (configurable via API_PORT)

| Method | Path | Auth | Description |
| --- | --- | --- | --- |
| GET | /health | None | Liveness check. Returns { status, checks: { redis, vault } }; 503 if degraded. |
| GET | /metrics | None | Prometheus metrics in text exposition format. |
| GET | /posture | Telegram | Portfolio summary: NAV, risk score, chain allocation. |
| GET | /positions | Telegram | Detailed position list across all chains/protocols. |
| GET | /decisions | Telegram | Pending and executed decisions. |
| GET | /incidents | Telegram | Incident log for the tenant. |
| GET | /thresholds | Telegram | Risk thresholds (tenant-specific or defaults). |
| GET | /devices | Telegram | Registered device fingerprints. |
| GET | /exports | Telegram | Data export endpoints. |
| GET | /playbooks | Telegram | Configured automation playbooks. |
| GET | /benchmarks | Telegram | Portfolio benchmark comparisons. |

Middleware pipeline (applied in order for authenticated routes):

  1. telegramAuthMiddleware — validates Telegram bot API credentials, extracts user ID
  2. deviceBindingMiddleware — binds and verifies device fingerprints (requires Redis)
  3. rateLimitMiddleware — token-bucket rate limiting per user
  4. tenantScopeMiddleware — resolves tenant from user ID, enforces isolation

Response caching: Read-through cache backed by Redis with configurable TTLs per route. Falls back gracefully if Redis is unavailable.
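
A sketch of the read-through pattern with that graceful fallback (handler shape is illustrative):

```js
async function cachedRead(redis, key, ttlSec, loader) {
  try {
    const hit = await redis.get(key);
    if (hit) return JSON.parse(hit);
  } catch {
    // Redis unavailable: fall through and serve directly from the vault
  }
  const value = await loader(); // e.g. read the tenant's posture/latest.json
  try {
    await redis.set(key, JSON.stringify(value), 'EX', ttlSec);
  } catch {
    // cache write is best-effort
  }
  return value;
}
```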

Telegram Integration

Located in src/telegram/. Handles all user-facing notifications.

| Module | Purpose |
| --- | --- |
| bot.js | Bot initialization, polling setup |
| send.js | Single-message sending with error handling |
| broadcast.js | Multi-recipient message dispatch |
| commands.js | Bot command handlers |
| rateLimiter.js | Per-user send rate limiting |
| cooldown.js | Cooldown tracking between alerts |
| dedup.js | Prevents duplicate alert delivery |
| batcher.js | Batches multiple messages into consolidated sends |
| acknowledge.js | Tracks message acknowledgment status |

Security Model

  • Authentication: Telegram bot API-based. User identity derived from Telegram user ID.
  • Authorization: Role-based — admin, operator, viewer per tenant.
  • Tenant isolation: Each tenant has an independent vault directory (TEN-{id}). Path traversal blocked at the vault path layer.
  • Device binding: Optional device fingerprint binding stored in Redis. Prevents session hijacking.
  • Advisor sandbox (src/advisor/sandbox.js): LLM-assisted analysis runs in a restricted context with glob-pattern file whitelisting, shell injection detection (;, |, &, `, $(), ${}), rate limits, and prompt length caps. A sketch of the injection check follows this list.
  • Secrets: .env file should be chmod 600. Never committed (in .gitignore).
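
The injection patterns listed for the advisor sandbox, as an illustrative check (the real detector in src/advisor/sandbox.js may differ; the length cap shown is a placeholder):

```js
// Matches ; | & ` $( ${ anywhere in the input
const SHELL_INJECTION = /[;|&`]|\$\(|\$\{/;

function assertSafePrompt(prompt, maxLen = 4000) {
  if (prompt.length > maxLen) throw new Error('prompt exceeds length cap');
  if (SHELL_INJECTION.test(prompt)) throw new Error('shell metacharacters rejected');
  return prompt;
}
```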

Supported Chains & Protocols

| Chain | Type | RPC | Multicall | Protocols |
| --- | --- | --- | --- | --- |
| Ethereum | EVM | eth.llamarpc.com | 0xcA11bde...CA11 | UniV2, UniV3, Aave V3 |
| BSC | EVM | bsc-dataseed.binance.org | 0xcA11bde...CA11 | |
| Base | EVM | mainnet.base.org | 0xcA11bde...CA11 | |
| Zigchain | Cosmos (LCD) | rpc.zigchain.com | N/A | Mars (lending) |
| Osmosis | Cosmos (LCD) | rpc.osmosis.zone | N/A | OsmosisCL (concentrated liquidity) |

All chains have secondary/fallback RPC endpoints configured in config/chains.json.


Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| REDIS_URL | No | redis://127.0.0.1:6379 | Redis connection string |
| TELEGRAM_BOT_TOKEN | Yes | (none) | Telegram Bot API token (from @BotFather) |
| VAULT_ROOT | No | ./vault | Filesystem path for vault storage |
| NODE_ENV | No | development | development / production |
| API_PORT | No | 3000 | HTTP port for the API server |
| LOG_LEVEL | No | info | Pino log level (trace, debug, info, warn, error, fatal) |

Copy .env.example to .env and fill in the values:

cp .env.example .env

Local Development Setup

Prerequisites

  • Node.js ≥ 20 (see .nvmrc)
  • Docker (for Redis)
  • A Telegram Bot Token (optional for local testing without alerts)

Steps

# 1. Clone and install
git clone <repo-url> && cd atnine-guard
nvm use            # switches to Node 20 via .nvmrc
npm install

# 2. Start Redis
docker compose up -d

# 3. Configure environment
cp .env.example .env
# Edit .env — at minimum set TELEGRAM_BOT_TOKEN for alert testing

# 4. Seed the vault with mock data
npm run seed-vault
# Creates:
#   - Tenant "seed-001" with 3 members (admin/operator/viewer)
#   - 2 mock wallets (Ethereum EOA + Cosmos)
#   - Mock positions (UniV3 LP + Mars lending)
#   - Mock posture (NAV, risk score)
#   - Sample incident log
#   - Pending rebalance decision
#   - CEX mid-prices for BTC, ETH, USDC, USDT, DAI

# 5. Start all three processes in watch mode
npm run dev
# Runs concurrently with --watch:
#   - API server     → http://localhost:3000
#   - Scheduler      → enqueuing jobs every 5s
#   - Worker         → processing jobs

# 6. Verify
curl http://localhost:3000/health
# → { "status": "ok", "checks": { "redis": "ok", "vault": "ok" } }

curl http://localhost:3000/metrics
# → Prometheus text format metrics

Running individual processes

npm run start:api        # API only
npm run start:scheduler  # Scheduler only
npm run start:worker     # Worker only

Code quality

npm run lint             # ESLint (import ordering, strict equality, prefer-const)
npm run format           # Prettier (single quotes, semicolons, 100 char width)

Pre-commit hooks (via Husky + lint-staged) run eslint --fix and prettier --write automatically on staged .js files.


Production Deployment

Target architecture

graph TB
    subgraph "VPS (Ubuntu/Debian)"
        subgraph "systemd services"
            API["defi-api.service<br/><code>node src/api/server.js</code>"]
            SCH["defi-scheduler.service<br/><code>node src/scheduler/main.js</code>"]
            WRK["defi-worker.service<br/><code>node src/workers/main.js</code>"]
        end

        REDIS["redis.service<br/><i>bound to 127.0.0.1</i>"]
        VAULT["Vault directory<br/><code>/opt/atnine-guard/vault/</code>"]
        ENV[".env<br/><code>chmod 600</code>"]

        API --> REDIS
        SCH --> REDIS
        WRK --> REDIS
        API --> VAULT
        WRK --> VAULT
    end

    subgraph "External"
        TG["Telegram API"]
        RPC["Blockchain RPCs<br/>(Ethereum, BSC, Base,<br/>Osmosis, Zigchain)"]
    end

    WRK --> TG
    WRK --> RPC
    API --> TG
Loading

Server Preparation

# Create service user (no login shell)
sudo useradd --system --create-home --shell /usr/sbin/nologin defi

# Install Node.js 20
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

# Install Redis
sudo apt-get install -y redis-server

# Firewall
sudo ufw allow 22/tcp    # SSH
sudo ufw allow 3000/tcp  # API (or put behind reverse proxy)
sudo ufw enable

# SSH hardening (edit /etc/ssh/sshd_config)
# PasswordAuthentication no
# PermitRootLogin no
sudo systemctl restart sshd

# Optional: fail2ban + unattended-upgrades
sudo apt-get install -y fail2ban unattended-upgrades
sudo systemctl enable fail2ban

Redis Hardening

sudo vim /etc/redis/redis.conf

# Bind to localhost only
bind 127.0.0.1

# Set a password (update REDIS_URL in .env accordingly)
requirepass <strong-password>

# Disable dangerous commands
rename-command FLUSHALL ""
rename-command FLUSHDB ""
rename-command DEBUG ""

sudo systemctl restart redis

If using a Redis password, update REDIS_URL:

REDIS_URL=redis://:your-password@127.0.0.1:6379

Application Deployment

# Deploy code
sudo mkdir -p /opt/atnine-guard
sudo chown defi:defi /opt/atnine-guard
sudo -u defi git clone <repo-url> /opt/atnine-guard
cd /opt/atnine-guard
sudo -u defi npm ci --production

# Configure secrets
sudo -u defi cp .env.example .env
sudo -u defi vim .env          # Set TELEGRAM_BOT_TOKEN, REDIS_URL, NODE_ENV=production
sudo chmod 600 /opt/atnine-guard/.env

# Initialize vault
sudo -u defi npm run seed-vault

# Set vault permissions
sudo chown -R defi:defi /opt/atnine-guard/vault
sudo chmod -R 700 /opt/atnine-guard/vault

systemd Services

Three unit files are provided in systemd/. All run as the defi user, restart on failure (5s delay), and depend on redis.service.

# Install service files
sudo cp systemd/defi-api.service /etc/systemd/system/
sudo cp systemd/defi-scheduler.service /etc/systemd/system/
sudo cp systemd/defi-worker.service /etc/systemd/system/

# Reload and enable
sudo systemctl daemon-reload
sudo systemctl enable defi-api defi-scheduler defi-worker

# Start all services
sudo systemctl start defi-api defi-scheduler defi-worker

# Check status
sudo systemctl status defi-api defi-scheduler defi-worker

Service details:

| Service | Unit file | ExecStart | Depends on |
| --- | --- | --- | --- |
| defi-api | systemd/defi-api.service | node src/api/server.js | network.target, redis.service |
| defi-scheduler | systemd/defi-scheduler.service | node src/scheduler/main.js | network.target, redis.service |
| defi-worker | systemd/defi-worker.service | node src/workers/main.js | network.target, redis.service |

All services use EnvironmentFile=/opt/atnine-guard/.env and log to the systemd journal:

# View logs
sudo journalctl -u defi-api -f
sudo journalctl -u defi-scheduler -f
sudo journalctl -u defi-worker -f

# View all services combined
sudo journalctl -u defi-api -u defi-scheduler -u defi-worker --since "1 hour ago"

Hardening Verification

Run the included verification script to check security posture:

sudo bash scripts/verify-hardening.sh

This checks:

  • SSH: password auth disabled, root login disabled
  • Firewall: UFW active
  • Redis: bound to localhost, password set
  • Process security: Node.js not running as root
  • Secrets: .env permissions are 600
  • Services: fail2ban running, unattended-upgrades enabled

Backup & Restore

# Run a vault backup
node scripts/backup.js run [--backup-dir /path/to/backups]

# List existing backups
node scripts/backup.js list [--backup-dir /path/to/backups]

# Prune old backups (default: 30 day retention)
node scripts/backup.js prune [--retention-days 30]

# Restore from backup
node scripts/backup.js restore <backup-path> <target-path>

Monitoring & Observability

Health endpoint:

# Returns 200 if healthy, 503 if degraded
curl http://localhost:3000/health

Prometheus metrics at /metrics:

| Metric | Type | Labels | Description |
| --- | --- | --- | --- |
| jobs_processed_total | Counter | type, status | Total jobs processed |
| job_duration_seconds | Histogram | type | Job execution duration |
| queue_depth | Gauge | queue | Current queue depth |
| rpc_requests_total | Counter | chain, method | RPC calls made |
| rpc_latency_seconds | Histogram | chain | RPC call latency |
| alerts_sent_total | Counter | channel | Alerts dispatched |
| circuit_breaker_state | Gauge | scope | Circuit breaker state (0=closed, 1=open, 2=half-open) |
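
As an example, the first of these metrics defined with prom-client (an assumption; the README does not name the metrics client used):

```js
const client = require('prom-client');

const jobsProcessed = new client.Counter({
  name: 'jobs_processed_total',
  help: 'Total jobs processed',
  labelNames: ['type', 'status'],
});

// e.g. after a job completes:
jobsProcessed.inc({ type: 'EVAL_RULES', status: 'success' });
```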

Structured logging: All processes use Pino with JSON output in production. Development mode uses pino-pretty with colorized, human-readable output.

Heartbeat monitoring: External monitors can check:

  • heartbeat:scheduler (Redis key, 30s TTL) — scheduler is alive
  • heartbeat:worker:{pid} (Redis key, 30s TTL) — worker is alive
  • /health endpoint — API is alive with Redis and vault connectivity

Operational Runbook

Dead-Letter Queue (DLQ)

# Inspect failed jobs
node scripts/dlq.js inspect [--limit 50]

# Replay a single job
node scripts/dlq.js replay <job-id>

# Replay all DLQ jobs (resets attempt counts)
node scripts/dlq.js replay-all

# Purge the DLQ
node scripts/dlq.js purge

Scheduler tiers

| Tier | Cadence | Trigger |
| --- | --- | --- |
| Hot | 30s | refresh_interval_sec ≤ 30 in tenant.json |
| Warm | 120s | 30 < refresh_interval_sec ≤ 120 |
| Cold | 600s | refresh_interval_sec > 120 |

Cadences have ±20% jitter applied. Tenants are force-refreshed immediately when their config hash changes (delta detection), regardless of cadence.
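
A sketch of tier selection plus jitter (helper names are illustrative):

```js
function tierCadenceSec(refreshIntervalSec) {
  if (refreshIntervalSec <= 30) return 30;   // Hot
  if (refreshIntervalSec <= 120) return 120; // Warm
  return 600;                                // Cold
}

function withJitter(cadenceSec) {
  // Uniform ±20% randomization: prevents a thundering herd of refreshes
  return cadenceSec * (0.8 + Math.random() * 0.4);
}
```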

Common operations

# Restart a single service
sudo systemctl restart defi-worker

# Scale workers: safe because each worker uses its own PID for lock scoping.
# Note that systemctl start on an already-running unit is a no-op, so extra
# instances need their own unit files (e.g. a templated defi-worker@.service;
# the three shipped units are not templated)

# Check Redis queue depth
redis-cli LLEN q:jobs
redis-cli LLEN q:jobs:priority
redis-cli LLEN q:jobs:dlq

# Check heartbeats
redis-cli GET heartbeat:scheduler
redis-cli KEYS "heartbeat:worker:*"

# Force-refresh a tenant (delete its refresh timestamp)
redis-cli DEL refresh:ts:<tenantId>

# Clear a stale lock
redis-cli DEL lock:job:SYNC_BALANCES:<tenantId>

Admin Scripts

All scripts are in scripts/. The .js scripts run via node scripts/<name>.js; verify-hardening.sh is a shell script (run via sudo bash, as shown above).

| Script | Purpose |
| --- | --- |
| seed-vault.js | Initialize the vault with a mock tenant, wallets, positions, incidents |
| backup.js | Vault backup with run/list/prune/restore subcommands |
| dlq.js | Manage the dead-letter queue |
| audit-chain.js | Audit blockchain state |
| audit-export.js | Export audit logs |
| migrate.js | Data migration utilities |
| rules.js | Manage risk rules |
| rpc-costs.js | RPC cost tracking and reporting |
| log-rotation.js | Log rotation management |
| load-test.js | Generate load for testing |
| tmp-cleanup.js | Clean up temporary files |
| restore-test.js | Test the backup restore process |
| verify-hardening.sh | Check VPS security posture |

Testing

Uses Node.js native test runner (node --test). No external test framework.

# Run all tests (unit + integration + smoke + load)
npm test

# Run a single test file
node --test tests/unit/scheduler.test.js

# Smoke tests only (60s timeout)
npm run test:smoke

# Load tests only (120s timeout)
npm run test:load

Test structure

tests/
├── unit/           # Fast, isolated tests (mocked dependencies)
├── integration/    # Tests with real adapters / Redis
├── smoke/          # End-to-end system verification
├── load/           # Performance and concurrency tests
├── fixtures/       # Mock data (positions, wallets, configs)
└── helpers/        # Test utilities and mock setup

Project Structure

atnine-guard/
├── config/
│   ├── chains.json              # Chain definitions (RPC URLs, type, costs)
│   ├── protocols.json           # Protocol definitions (addresses, types)
│   ├── majors.json              # Major token symbols for CEX tracking
│   └── default-playbooks.json   # Default incident response playbooks
│
├── src/
│   ├── api/
│   │   ├── server.js            # Fastify app setup and startup
│   │   ├── cache.js             # Redis-backed read-through cache
│   │   ├── middleware/          # Auth, rate limit, device binding, tenant scope
│   │   └── routes/              # posture, positions, decisions, incidents, etc.
│   │
│   ├── scheduler/
│   │   └── main.js              # Tick loop, tier scheduling, delta detection
│   │
│   ├── workers/
│   │   ├── main.js              # Job dequeue loop with lock management
│   │   └── handlers/            # Per-job-type handler implementations
│   │
│   ├── adapters/
│   │   ├── registry.js          # Lazy-loading adapter factory
│   │   ├── rpcFallback.js       # Primary/secondary endpoint failover
│   │   ├── interfaces.js        # Adapter interface contracts
│   │   ├── chains/              # EVM (Multicall), Cosmos (LCD)
│   │   └── protocols/           # UniV2, UniV3, OsmosisCL, Mars, AaveV3
│   │
│   ├── engine/
│   │   ├── rules/               # Risk rule evaluators
│   │   ├── thresholds.js        # Per-tenant threshold management
│   │   ├── playbooks.js         # Incident response orchestration
│   │   └── benchmarks.js        # Portfolio benchmarking
│   │
│   ├── telegram/                # Bot, send, broadcast, dedup, rate limit, etc.
│   ├── vault/                   # Filesystem storage (paths, init, fs, append, checksum)
│   ├── advisor/                 # LLM sandbox for analysis
│   ├── config/                  # Config loader
│   └── util/                    # Redis, queue, lock, errors, logger, metrics,
│                                # circuit breaker, rate limiter, HTTP, env
│
├── scripts/                     # Admin and operational scripts
├── systemd/                     # Service unit files for production
├── tests/                       # unit, integration, smoke, load
├── docs/                        # Architecture docs, ADRs, runbooks
├── docker-compose.yml           # Redis for local dev
├── package.json                 # Scripts, dependencies, engine constraints
├── .env.example                 # Environment variable template
└── .nvmrc                       # Node 20
