This repository was archived by the owner on Feb 18, 2026. It is now read-only.

24601/gastown-patrol


Gas Town Patrol Scripts - UPDATE - You Probably Don't Need This Anymore!

Gas Town is doing great on its own now

Prescription synthetic GUPP for your Gas Town workforce so they keep flowing if at all possible when you step away.

Gas Town Patrol - Stuck, External GUPP, Back to Pumping

Why This Exists

First, I really respect & love Steve Yegge. How can you not? Gastown is awesome. The effort is amazing. The ideas are great. All great things start this way, and what he's built with the contributors that have joined up is amazing: no ifs, ands, or buts. I'm not just saying that, I obviously use it and am putting my time and energy into trying to be a part of the community in whatever way I can with what little time I have to dedicate to it and not just be a complain-ey user.

The inevitable "but": maybe it's just me, but the cognitive load of the mixed Mad Max-meets-Landman-meets-Steampunk metaphor has me kinda turned around from the get-go. From the (admittedly little) I have dug into the internals, it's a very impressive Rube Goldberg machine. (I do have a soft spot in my heart for Rube Goldberg machines, which is maybe why here I am layering on more layers of gold to it? This is one for my therapist, I am sure.) But we have Gas Town, and I think it's, at the very least, a great research tool (and even a tool to use for real stuff) for very practical exploration of real, applied systems for SWE (and beyond) and how this plays out in the very near future.

So, all that said, I'd normally just open a PR on the repo and fix things. But grokking the code is not very easy. I've used all of the most amazing AIs and they help a bit, but I just don't have the level of comfort with the codebase myself, and the whole mixed-metaphor thing (analogies that kinda work until they don't) has me standing back from opening a PR that may just make things worse (and waste good people's valuable time).

So, a lot of cope here, in hopes Gas Town advances quickly enough to not need cope, or something else comes along.

Disclaimer: I very well may be using Gas Town "creatively" (or outright incorrectly, as the case may be; there does not seem to be one prescribed way, but there are patterns in the codebase/discussion/design that def should be followed). These scripts exist because of a specific problem I kept hitting, and I can't say for sure whether the issue is in Gas Town, environmental, or between me and the keyboard (def possible).

Based on: Gas Town commit be96bb0. Notes and observations may not reflect newer versions, although I do keep it up to date every day or so.

The Problem(s)

flowchart TD
    subgraph BEFORE["😴 WHILE YOU'RE AWAY"]
        A[/"👤 Human steps out"/]
        B[("🖥️ Daemon Running<br/>Heartbeats: ✓<br/>Sessions: ✓")]
        C[["🏭 Refinery<br/>Status: RUNNING<br/>Actually: IDLE"]]
        D[("📋 MR Queue<br/>Growing...<br/>18 pending")]
    end

    subgraph AFTER["🎉 WHEN YOU RETURN"]
        E[/"👤 Human asks:<br/>'How's town?'"/]
        F{{"🔔 Mayor gets<br/>the nudge"}}
        G[["🏭 Refinery<br/>ACTUALLY<br/>processing"]]
        H[("📋 Queue<br/>draining<br/>0 pending")]
    end

    A --> B
    B -->|"looks fine"| C
    C -->|"not moving"| D
    D -.->|"2-4 hours later"| E
    E -->|"simple question"| F
    F -->|"checks + nudges"| G
    G --> H

    style BEFORE fill:#2d2d2d,stroke:#ffa500,stroke-width:3px,color:#fff
    style AFTER fill:#1a3d1a,stroke:#00ff00,stroke-width:3px,color:#fff
    style D fill:#8b0000,stroke:#ff0000,color:#fff
    style H fill:#006400,stroke:#00ff00,color:#fff

The question: If a single nudge fixes everything, did I really need to be here? Or did the system just need a reminder to check on itself?

Gas Town has internal orchestration (Boot, Deacon, Daemon heartbeats, etc.) that should handle this. But in my environment, I kept finding:

  • Refineries showing "running" but not processing MRs
  • Queues backing up for hours
  • Daemon heartbeats passing while nothing moved
  • Coming back, asking "is stuff flowing?", and watching it unstick

What This Is NOT

flowchart LR
    subgraph NOPE["🚫 NOT TRYING TO BUILD THIS"]
        direction TB
        H["👤 Human"]
        I["🤖 Fully Autonomous<br/>AI System"]
        J["🌴 Beach<br/>Retirement"]

        H -.->|"❌ disappears"| J
        I -->|"❌ runs forever<br/>without oversight"| K[("💰 Profit???")]
    end

    style NOPE fill:#4a1010,stroke:#ff3333,stroke-width:3px,color:#fff
    style H fill:#333,stroke:#666,color:#fff
    style I fill:#333,stroke:#666,color:#fff
    style J fill:#333,stroke:#666,color:#fff
    style K fill:#333,stroke:#666,color:#fff

Gas Town is NOT a lights-out system. It requires active human involvement. I'm not trying to walk away forever; I'm trying to reduce the times I come back and find work stalled for no good reason.

What This IS

flowchart LR
    subgraph GOAL["✅ WHAT I ACTUALLY WANT"]
        direction TB

        subgraph HUMAN_LIFE["👤 Human Reality"]
            A["☕ Lunch break"]
            B["🍽️ Dinner"]
            C["😴 Sleep"]
            D["🏃 Exercise"]
        end

        subgraph AUTO_NUDGE["⏰ Automated Nudges"]
            E["Every 15 min:<br/>PATROL CHECK"]
            F["Every 30 min:<br/>HEALTH CHECK"]
        end

        subgraph OUTCOME["📈 Result"]
            G["Work keeps<br/>flowing"]
            H["Only escalate<br/>REAL problems"]
        end
    end

    A & B & C & D -->|"away 1-4 hrs"| E & F
    E & F -->|"nudge mayor"| G
    G --> H
    H -->|"genuine issues only"| A

    style GOAL fill:#0d2818,stroke:#00cc44,stroke-width:3px,color:#fff
    style HUMAN_LIFE fill:#1a1a2e,stroke:#4a90d9,color:#fff
    style AUTO_NUDGE fill:#2d2a1a,stroke:#ffd700,color:#fff
    style OUTCOME fill:#1a2d1a,stroke:#00ff00,color:#fff

The goal: Keep work flowing during normal human activities. Escalate only when genuinely blocked. Don't bother me because a refinery forgot to check its queue.

The Scripts

flowchart TD
    subgraph CRON["⏰ CRON SCHEDULER"]
        TICK15["*/15 * * * *"]
        TICK30["*/30 * * * *"]
    end

    subgraph LAYER1["🔵 LAYER 1: PRODUCTIVITY PATROL"]
        direction TB
        MP["mayor-patrol.sh"]
        MP_CHECK["Check MR queues<br/>for ALL rigs"]
        MP_NUDGE["Nudge mayor:<br/>'Are refineries processing?'"]
        MP_ACTION["Mayor checks +<br/>nudges stuck refineries"]

        MP --> MP_CHECK --> MP_NUDGE --> MP_ACTION
    end

    subgraph LAYER2["🔴 LAYER 2: DEAD MAN'S SWITCH"]
        direction TB
        DM["deadman-switch.sh"]
        DM_DAEMON["Daemon alive?<br/>PID check"]
        DM_HEART["Heartbeat fresh?<br/><10 min old"]
        DM_MAYOR["Mayor session<br/>exists?"]
        DM_QUEUE["Queue backup<br/><20 MRs"]
        DM_ALERT["🚨 ALERT<br/>to logs + mail"]

        DM --> DM_DAEMON & DM_HEART & DM_MAYOR & DM_QUEUE
        DM_DAEMON & DM_HEART & DM_MAYOR & DM_QUEUE -->|"any fail"| DM_ALERT
    end

    subgraph LAYER3["🟡 LAYER 3: GUPP HEALTH"]
        direction TB
        GH["gupp-health-check.sh"]
        GH_ORPHAN["Find orphaned<br/>tasks"]
        GH_DB["Check beads<br/>db sync"]
        GH_KICK["Kick stuck<br/>refineries"]

        GH --> GH_ORPHAN & GH_DB & GH_KICK
    end

    TICK15 --> MP & GH
    TICK30 --> DM

    style CRON fill:#1a1a2e,stroke:#9966ff,stroke-width:2px,color:#fff
    style LAYER1 fill:#0a1929,stroke:#2196f3,stroke-width:3px,color:#fff
    style LAYER2 fill:#290a0a,stroke:#f44336,stroke-width:3px,color:#fff
    style LAYER3 fill:#29290a,stroke:#ffeb3b,stroke-width:3px,color:#fff
    style DM_ALERT fill:#8b0000,stroke:#ff0000,color:#fff

Why three layers?

  • Layer 1 (Blue): "Is work actually moving?" - Nudges mayor to check productivity
  • Layer 2 (Red): "Is the system even alive?" - External watchdog for catastrophic failures
  • Layer 3 (Yellow): "Is the infrastructure healthy?" - DB sync, orphans, stuck processes

| Script | Schedule | Purpose |
|---|---|---|
| mayor-patrol.sh | */15 * * * * | Nudge mayor to check queues and refineries |
| deadman-switch.sh | */30 * * * * | Alert if daemon/mayor/system is down |
| gupp-health-check.sh | */15 * * * * | Orphan detection, refinery kicks, db sync |
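
The Layer 2 checks from the diagram above are simple enough to sketch. This is not the actual deadman-switch.sh: the function names, and the idea of reading the daemon PID and heartbeat from files passed in as arguments, are assumptions for illustration. The 10-minute and 20-MR thresholds come from the diagram.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the Layer 2 checks; not the real deadman-switch.sh.
# All function names and file arguments are assumptions.

# Succeeds if the PID stored in the given file belongs to a live process.
pid_alive() {
    local pid_file="$1" pid
    [ -r "$pid_file" ] || return 1
    pid="$(cat "$pid_file")"
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Succeeds if the file was modified less than N minutes ago (default 10).
heartbeat_fresh() {
    local file="$1" max_min="${2:-10}"
    [ -e "$file" ] || return 1
    # find prints the path only when its mtime is newer than max_min minutes
    [ -n "$(find "$file" -mmin "-$max_min" 2>/dev/null)" ]
}

# Succeeds while the pending MR count is under the limit (default 20).
queue_ok() {
    [ "$1" -lt "${2:-20}" ]
}

# Succeeds if a tmux session with the given name exists.
session_exists() {
    tmux has-session -t "$1" 2>/dev/null
}

# Runs every check and prints one ALERT line per failure.
# Returns non-zero if anything failed.
deadman_check() {
    local pid_file="$1" heartbeat_file="$2" session="$3" pending="$4" failed=0
    pid_alive "$pid_file"             || { echo "ALERT: daemon not running"; failed=1; }
    heartbeat_fresh "$heartbeat_file" || { echo "ALERT: heartbeat stale"; failed=1; }
    session_exists "$session"         || { echo "ALERT: no session '$session'"; failed=1; }
    queue_ok "$pending"               || { echo "ALERT: queue backup ($pending MRs)"; failed=1; }
    return "$failed"
}
```

Each check is its own function so a failing one can be run by hand when an alert fires. Where the PID and heartbeat actually live, and what the mayor's tmux session is called, depends on your install.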

How It Works

sequenceDiagram
    participant Cron
    participant Mayor
    participant Refinery
    participant MRQueue

    loop Every 15 minutes
        Cron->>Mayor: 📢 "PATROL: Check queues"
        Mayor->>MRQueue: gt mq list
        MRQueue-->>Mayor: 18 MRs pending (4h old)

        alt Queue backing up
            Mayor->>Refinery: 📢 "Process queue!"
            Refinery->>MRQueue: Processes MRs
            MRQueue-->>Refinery: ✅ Merged
        else Queue empty
            Mayor-->>Mayor: ✅ All good
        end

        Mayor->>Cron: 📝 Log result
    end
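
The loop in the sequence diagram boils down to "count pending MRs, nudge if there are any, log the result". A rough sketch of one tick follows. It is not the real mayor-patrol.sh: patrol_action and patrol_tick are hypothetical names, and the list command is assumed to print one pending MR per line, which may not match what gt mq list actually prints.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of one patrol tick; not the real mayor-patrol.sh.

# Prints "nudge" when MRs are pending, "ok" otherwise.
patrol_action() {
    if [ "$1" -gt 0 ]; then echo "nudge"; else echo "ok"; fi
}

# Runs one tick. $1 is a command that prints one pending MR per line
# (an assumption about the listing format); $2 is the command that nudges.
patrol_tick() {
    local list_cmd="$1" nudge_cmd="$2" pending action
    # grep -c . counts non-empty lines; it exits 1 on zero matches, hence || true
    pending="$("$list_cmd" | grep -c . || true)"
    action="$(patrol_action "$pending")"
    if [ "$action" = "nudge" ]; then
        "$nudge_cmd" >/dev/null
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') pending=$pending action=$action"
}
```

Passing the list and nudge commands in as arguments keeps the tick testable with stubs. In a real cron job they would be small wrapper functions around gt mq list and whatever nudge invocation works in your setup.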

Failure Modes & Solutions

flowchart LR
    subgraph FAIL1["❌ STUCK REFINERY"]
        direction TB
        F1_SYM["🏭"]
        F1_DESC["gt refinery status: RUNNING<br/>gt mq list: 18 MRs, 4h old<br/>Reality: Not processing"]
        F1_FIX["✅ FIX: gt nudge refinery<br/>or gt refinery restart"]
    end

    subgraph FAIL2["❌ ZOMBIE SESSION"]
        direction TB
        F2_SYM["👻"]
        F2_DESC["tmux pane exists<br/>Claude process dead or idle<br/>Daemon says 'already running'"]
        F2_FIX["✅ FIX: gt deacon zombie-scan<br/>then respawn session"]
    end

    subgraph FAIL3["❌ ORPHANED TASK"]
        direction TB
        F3_SYM["🔗"]
        F3_DESC["Task shows 'hooked'<br/>Assigned polecat is dead<br/>Work stuck in limbo"]
        F3_FIX["✅ FIX: bd update --status open<br/>--assignee '' to reset"]
    end

    subgraph FAIL4["❌ BEADS DESYNC"]
        direction TB
        F4_SYM["💾"]
        F4_DESC["bd doctor shows errors<br/>SQLite locked<br/>JSONL != database"]
        F4_FIX["✅ FIX: bd doctor --fix<br/>or bd sync --import-only"]
    end

    subgraph FAIL5["❌ MULTI-TASK POLECAT"]
        direction TB
        F5_SYM["💥"]
        F5_DESC["Multiple tasks hooked<br/>to same polecat<br/>API 400 concurrency error"]
        F5_FIX["✅ FIX: Unhook all tasks<br/>Re-sling with --create<br/>One task per polecat"]
    end

    subgraph FAIL6["❌ DB METADATA LOSS"]
        direction TB
        F6_SYM["🔥"]
        F6_DESC["'Legacy database'<br/>'Repo fingerprint mismatch'<br/>Data not exporting"]
        F6_FIX["✅ FIX: sqlite3 beads.db<br/>INSERT INTO metadata<br/>repo_id + bd_version"]
    end

    F1_SYM --> F1_DESC --> F1_FIX
    F2_SYM --> F2_DESC --> F2_FIX
    F3_SYM --> F3_DESC --> F3_FIX
    F4_SYM --> F4_DESC --> F4_FIX
    F5_SYM --> F5_DESC --> F5_FIX
    F6_SYM --> F6_DESC --> F6_FIX

    style FAIL1 fill:#1a0a0a,stroke:#ff6666,stroke-width:2px,color:#fff
    style FAIL2 fill:#1a0a0a,stroke:#ff6666,stroke-width:2px,color:#fff
    style FAIL3 fill:#1a0a0a,stroke:#ff6666,stroke-width:2px,color:#fff
    style FAIL4 fill:#1a0a0a,stroke:#ff6666,stroke-width:2px,color:#fff
    style FAIL5 fill:#1a0a0a,stroke:#ff6666,stroke-width:2px,color:#fff
    style FAIL6 fill:#1a0a0a,stroke:#ff6666,stroke-width:2px,color:#fff
    style F1_FIX fill:#0a1a0a,stroke:#66ff66,color:#fff
    style F2_FIX fill:#0a1a0a,stroke:#66ff66,color:#fff
    style F3_FIX fill:#0a1a0a,stroke:#66ff66,color:#fff
    style F4_FIX fill:#0a1a0a,stroke:#66ff66,color:#fff
    style F5_FIX fill:#0a1a0a,stroke:#66ff66,color:#fff
    style F6_FIX fill:#0a1a0a,stroke:#66ff66,color:#fff

Safe Slinging - One Task Per Polecat

Critical rule: Always use --create when slinging to a rig to ensure a fresh polecat.

flowchart LR
    subgraph UNSAFE["❌ UNSAFE: Reuses existing polecat"]
        U1["gt sling ho-task1 horizon"]
        U2["gt sling ho-task2 horizon"]
        U3["Both → same polecat"]
        U4["πŸ’₯ Crash"]
        U1 --> U3
        U2 --> U3
        U3 --> U4
    end

    subgraph SAFE["✅ SAFE: Forces fresh polecat"]
        S1["gt sling ho-task1 horizon --create"]
        S2["gt sling ho-task2 horizon --create"]
        S3["Each → new polecat"]
        S4["✓ Works"]
        S1 --> S3
        S2 --> S3
        S3 --> S4
    end

    style UNSAFE fill:#2d0d0d,stroke:#ff4444,stroke-width:3px,color:#fff
    style SAFE fill:#0d2d0d,stroke:#44ff44,stroke-width:3px,color:#fff
    style U4 fill:#4a0000,stroke:#ff0000,color:#fff
    style S4 fill:#004a00,stroke:#00ff00,color:#fff

Why this happens:

  • gt sling <task> <rig> checks for existing polecats first
  • If a polecat (say, rust) already exists with an active session, sling reuses it
  • It does not check whether rust already has hooked work
  • Multiple tasks pile onto the same polecat → API concurrency error
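
One way to make the rule hard to forget is a tiny wrapper that always appends --create. This is only a sketch: safe_sling and the GT_BIN override are hypothetical names, not part of Gas Town. The only command it runs is the gt sling <task> <rig> --create form shown above.

```shell
#!/usr/bin/env bash
# Hypothetical guard around `gt sling`; safe_sling and GT_BIN are made-up names.
GT_BIN="${GT_BIN:-gt}"

# Slings one task to one rig, always forcing a fresh polecat.
safe_sling() {
    if [ "$#" -ne 2 ]; then
        echo "usage: safe_sling <task-id> <rig>" >&2
        return 2
    fi
    "$GT_BIN" sling "$1" "$2" --create
}
```

Sourcing this from your shell profile and calling safe_sling ho-task1 horizon instead of the bare command keeps one task per polecat without relying on memory.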

Recovery if it happens:

# 1. Check for multi-hooked polecats
bd list --status=hooked | grep "<polecat-name>"

# 2. Unhook all tasks from the crashed polecat
bd update <task-id> --status open --assignee ""

# 3. Re-sling with --create
gt sling <task-id> <rig> --create
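
Step 1 can be automated once the hooked list is reduced to "task-id assignee" pairs. That two-column input format is an assumption; the real bd list --status=hooked output may need reshaping first. multi_hooked is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "task-id assignee" pairs on stdin and prints
# each assignee that holds more than one task, with its task count.
multi_hooked() {
    awk 'NF >= 2 { count[$2]++ }
         END { for (a in count) if (count[a] > 1) print a, count[a] }' | sort
}
```

Any line it prints is a polecat to unhook and re-sling as in steps 2 and 3. Empty output means no polecat is double-booked.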

Warnings Are Smoke - Investigate!

Critical principle: When Gas Town operations produce warnings, they often indicate deeper issues. Don't dismiss them.

flowchart LR
    subgraph WARNINGS["⚠️ WARNING INVESTIGATION PROTOCOL"]
        direction TB

        W1["Warning appears<br/>during gt sling/operation"]
        W2["STOP: Don't assume<br/>'it probably worked'"]
        W3["Debug with:<br/>BD_DEBUG_ROUTING=1"]
        W4["Check witness mail<br/>for escalations"]
        W5["Fix root cause<br/>before continuing"]

        W1 --> W2 --> W3 --> W4 --> W5
    end

    subgraph EXAMPLE["📋 REAL EXAMPLE"]
        direction TB
        E1["Witness ESCALATION mail"]
        E2["'Convoy misrouting' or<br/>'Multi-task polecat crash'"]
        E3["Investigate immediately"]
        E4["Root cause before<br/>continuing"]

        E1 --> E2 --> E3 --> E4
    end

    WARNINGS --> EXAMPLE

    style WARNINGS fill:#2d2d0d,stroke:#ffcc00,stroke-width:3px,color:#fff
    style EXAMPLE fill:#0d2d2d,stroke:#66ccff,stroke-width:2px,color:#fff
    style W2 fill:#4a0000,stroke:#ff0000,color:#fff
    style E4 fill:#004a4a,stroke:#00ffff,color:#fff

Warning types:

| Warning | What It Means | Action |
|---|---|---|
| couldn't set agent hook: issue not found | Expected for cross-database (convoy in HQ, work in rig) | Usually safe to ignore - work still assigned via assignee field |
| bead not found | Route target doesn't exist | Check routes.jsonl paths |
| redirect chain | A→B→C routing loop | Check .beads/redirect files |
| Witness ESCALATION mail | Real problem - investigate immediately | Read mail, fix root cause |

Design note: Convoys (hq-cv-*) intentionally live in HQ beads and are "cross-prefix capable" - they can track work in any rig. The agent hook warning for cross-database scenarios is expected and documented in gastown source.
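
The warning table translates directly into a triage function for log lines. The sketch below matches on the warning text exactly as listed in the table; the real log format around those strings is an assumption, and classify_warning is a hypothetical name. Anything it does not recognize is treated as something to investigate, in keeping with the "warnings are smoke" rule.

```shell
#!/usr/bin/env bash
# Hypothetical triage helper based on the warning table above.
classify_warning() {
    case "$1" in
        *"couldn't set agent hook: issue not found"*) echo "usually-safe" ;;
        *"bead not found"*)                           echo "check-routes" ;;
        *"redirect chain"*)                           echo "check-redirects" ;;
        *ESCALATION*)                                 echo "investigate-now" ;;
        *)                                            echo "investigate" ;;
    esac
}
```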

DANGER ZONE - Database Operations That Cause Data Loss

NEVER run these commands - they will cause data loss:

# ❌ DANGEROUS - Causes export to think nothing needs exporting = DATA LOSS
sqlite3 .beads/beads.db "DELETE FROM export_hashes;"
sqlite3 .beads/beads.db "DELETE FROM dirty_issues;"

# ❌ DANGEROUS - Removes tracking of what's been synced
rm .beads/beads.db-wal
rm .beads/beads.db-shm

# ❌ DANGEROUS - Can corrupt database
rm .beads/beads.db && bd init  # Loses all data not in JSONL

Why this matters:

  • export_hashes tracks which issues have been exported to JSONL
  • dirty_issues tracks which issues need to be exported
  • If you clear these, bd sync thinks "nothing dirty = nothing to export"
  • Result: JSONL gets overwritten with minimal data = DATA LOSS

Safe alternatives:

# ✅ SAFE - Let bd doctor handle sync issues
bd doctor --fix

# ✅ SAFE - Force import from JSONL without losing data
bd sync --import-only

# ✅ SAFE - Fix repo fingerprint without data loss
sqlite3 .beads/beads.db "INSERT OR REPLACE INTO metadata (key, value) VALUES ('repo_id', '<your-repo-id>');"

# ✅ SAFE - Restore from git if data was lost
git checkout HEAD -- .beads/issues.jsonl
bd sync --import-only

If data loss occurs:

  1. Check git: git show HEAD:.beads/issues.jsonl | wc -l
  2. Restore: git checkout HEAD -- .beads/issues.jsonl
  3. Re-import: bd sync --import-only
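
The line-count check in step 1 can be turned into a guard that runs before any sync. A minimal sketch, assuming a shrinking issues.jsonl is a good enough signal of data loss; jsonl_shrunk is a hypothetical helper name and the /tmp path is just an example.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeeds when the current file has fewer lines than
# the reference copy. Returns 2 if either file is unreadable.
jsonl_shrunk() {
    local reference="$1" current="$2" ref_n cur_n
    [ -r "$reference" ] && [ -r "$current" ] || return 2
    # $(( )) strips the padding some wc implementations add
    ref_n=$(( $(wc -l < "$reference") ))
    cur_n=$(( $(wc -l < "$current") ))
    [ "$cur_n" -lt "$ref_n" ]
}

# Example use (commented out so sourcing this file has no side effects):
#   git show HEAD:.beads/issues.jsonl > /tmp/issues.ref.jsonl
#   if jsonl_shrunk /tmp/issues.ref.jsonl .beads/issues.jsonl; then
#       echo "issues.jsonl shrank since last commit; restore before syncing"
#   fi
```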

Root Cause Analysis

flowchart TD
    subgraph ROOT["🔍 WHY DOES THIS HAPPEN?"]
        direction TB

        subgraph CLAUDE["🤖 CLAUDE CODE QUIRKS"]
            C1["stdin goes quiet<br/>→ agent stops responding"]
            C2["Context fills up<br/>→ session ends abruptly"]
            C3["Idle too long<br/>→ process hibernates"]
        end

        subgraph TMUX["📟 TMUX BEHAVIOR"]
            T1["Pane stays alive<br/>even if process dies"]
            T2["Respawn tries old pane ID<br/>→ 'can't find pane' error"]
            T3["Session exists ≠<br/>session is working"]
        end

        subgraph DAEMON["👹 DAEMON LOGIC GAP"]
            D1["Checks: Does session exist?"]
            D2["Does NOT check:<br/>Is session productive?"]
            D3["'Already running'<br/>= skip spawn<br/>= nothing happens"]
        end

        subgraph BEADS["📿 BEADS FRAGILITY"]
            B1["SQLite locks under<br/>concurrent access"]
            B2["routes.jsonl paths<br/>can drift from reality"]
            B3["Redirect chains<br/>cause resolution failures"]
        end
    end

    CLAUDE --> D3
    TMUX --> D3
    D3 --> STUCK["🛑 WORK STALLS"]
    BEADS --> STUCK

    style ROOT fill:#0d0d1a,stroke:#6666ff,stroke-width:3px,color:#fff
    style CLAUDE fill:#1a1a0d,stroke:#ffcc00,color:#fff
    style TMUX fill:#0d1a1a,stroke:#00ccff,color:#fff
    style DAEMON fill:#1a0d0d,stroke:#ff6666,color:#fff
    style BEADS fill:#0d1a0d,stroke:#66ff66,color:#fff
    style STUCK fill:#4a0000,stroke:#ff0000,stroke-width:3px,color:#fff

What Each Script Catches

Failure mayor-patrol deadman-switch gupp-health
Stuck refinery ✅
Queue backup ✅ ✅ (>20)
Daemon down ✅
Mayor down ✅
Orphaned tasks ✅
DB sync issues ✅
Stale heartbeat ✅
Route misconfig ✅
Multi-task polecat (manual)

Note: Multi-task polecat crashes require manual detection. Always use --create when slinging.

Beads Routing Issues

A common but subtle failure: beads routing misconfiguration causes "bead not found" errors or sends work to the wrong rigs.

flowchart TD
    subgraph ROUTING["🔀 BEADS ROUTING ARCHITECTURE"]
        direction TB

        subgraph HQ["📍 HQ (.beads/routes.jsonl)"]
            R1["Route: hq-* → /Users/ec2-user/gt"]
            R2["Route: px-* → /Users/ec2-user/gt/kalshi"]
            R3["Route: ho-* → /Users/ec2-user/gt/horizon"]
        end

        subgraph RESOLVE["🔍 Resolution Path"]
            Q1["bd show px-abc123"]
            Q2["Check routes.jsonl"]
            Q3["Find matching prefix"]
            Q4["Follow to target .beads/"]

            Q1 --> Q2 --> Q3 --> Q4
        end
    end

    subgraph FAILURES["❌ ROUTING FAILURES"]
        direction TB

        subgraph F1["Dead Target"]
            F1A["Route: px-* → /old/path"]
            F1B["Path doesn't exist"]
            F1C["'bead not found'"]
            F1A --> F1B --> F1C
        end

        subgraph F2["Redirect Chain"]
            F2A["A/.beads/redirect → B"]
            F2B["B/.beads/redirect → C"]
            F2C["Resolution fails"]
            F2A --> F2B --> F2C
        end

        subgraph F3["Actual Routing Bug"]
            F3A["Route px-* points to<br/>/old/deleted/path"]
            F3B["bd show px-task fails"]
            F3C["'bead not found'"]
            F3A --> F3B --> F3C
        end
    end

    subgraph CHECKS["✅ GUPP-HEALTH ROUTING CHECKS"]
        direction TB
        C1["Validate route targets exist"]
        C2["Detect redirect chains"]
        C3["Log warnings for misconfigs"]
    end

    ROUTING --> FAILURES
    FAILURES --> CHECKS

    style ROUTING fill:#0d1a2d,stroke:#4a90d9,stroke-width:3px,color:#fff
    style HQ fill:#1a2d1a,stroke:#66ff66,color:#fff
    style RESOLVE fill:#1a1a2e,stroke:#9966ff,color:#fff
    style FAILURES fill:#2d0d0d,stroke:#ff6666,stroke-width:3px,color:#fff
    style F1 fill:#1a0a0a,stroke:#ff4444,color:#fff
    style F2 fill:#1a0a0a,stroke:#ff4444,color:#fff
    style F3 fill:#1a0a0a,stroke:#ff4444,color:#fff
    style CHECKS fill:#0d2d1a,stroke:#00ff00,stroke-width:3px,color:#fff
    style F1C fill:#4a0000,stroke:#ff0000,color:#fff
    style F2C fill:#4a0000,stroke:#ff0000,color:#fff
    style F3C fill:#4a0000,stroke:#ff0000,color:#fff

What gupp-health-check.sh validates:

| Check | What It Does | Warning Triggered |
|---|---|---|
| Route targets | Validates each route's target path exists | ROUTE WARNING: Target path does not exist |
| Redirect chains | Detects A → B → C redirect loops | REDIRECT CHAIN in <rig> |
| Route count | Reports how many routes configured | Info: Found N routes |

Debug routing manually:

BD_DEBUG_ROUTING=1 bd show <bead-id>

Check routes config:

cat ~/gt/.beads/routes.jsonl

Common issues:

  • Route paths don't match actual .beads/ locations
  • Redirect chains (A → B → C) cause resolution failures
  • Prefix mismatches (e.g., hq-* vs px-* issues)

The gupp-health-check.sh now actively validates routing configuration and warns about misconfigurations. Complex redirect chains are detected automatically.
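
For checking a single rig by hand, the two routing checks can be approximated as below. This is a sketch, not the code in gupp-health-check.sh: the function names are hypothetical, and it assumes a .beads/redirect file holds one absolute path on its first line, which I have not confirmed against the beads source.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the routing checks; not the real gupp-health-check.sh.

# Succeeds if a route's target directory exists.
route_target_ok() {
    [ -d "$1" ]
}

# Prints how many redirects get followed starting from a rig directory.
# Assumes <dir>/.beads/redirect holds one absolute path on its first line.
# Stops after max hops (default 10) so a redirect loop cannot spin forever.
redirect_hops() {
    local dir="$1" max="${2:-10}" hops=0 next
    while [ -f "$dir/.beads/redirect" ] && [ "$hops" -lt "$max" ]; do
        next=""
        # read fails on a file without a trailing newline but still fills next
        IFS= read -r next < "$dir/.beads/redirect" || true
        [ -n "$next" ] || break
        dir="$next"
        hops=$((hops + 1))
    done
    echo "$hops"
}

# Two or more hops is the A -> B -> C chain that breaks resolution.
has_redirect_chain() {
    [ "$(redirect_hops "$1")" -ge 2 ]
}
```

A single redirect (one hop) is treated as fine here; only chains are flagged, matching the table above.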

GUPP Gap Deep Dive (Code-Level Analysis)

The daemon's checkGUPPViolations() function (in internal/daemon/lifecycle.go) only fires when:

  1. hook_bead != "" (agent has work assigned)
  2. AND agent hasn't updated in > 30 minutes

The gap: When polecats COMPLETE their work:

  • Their hook_bead becomes empty
  • No GUPP violation is detected
  • They sit at prompts asking "what should I do?"
  • But there's ready work (bd ready) they SHOULD pick up

// What the daemon checks:
if agent.HookBead == "" {
    continue // No hooked work - no GUPP violation possible  ← THE GAP
}

# Evidence from a real session:
Capable polecat output:
"My hook is empty - Would you like me to:
1. Continue working on my current branch task?
2. Wait for the sling to be re-sent?"
→ It's ASKING instead of DECIDING

Daemon log:
"Witness for kalshi already running, skipping spawn"
→ Session exists, so no new polecat spawned

bd list agents:
No agents with hook_beads
→ All hooks empty, so no GUPP violation detected

The intended flow vs actual behavior:

| Step | Intended | Actual |
|---|---|---|
| Polecat completes molecule | Calls gt done → session dies | Asks user "what next?" |
| Witness sees no polecat | Spawns new polecat for ready work | "already running, skipping" |
| Ready work piles up | Gets assigned to new polecats | Sits unworked |

Why patrol scripts fix this: They provide external GUPP nudges that bypass all the internal checks. A simple "check for work" message triggers the propulsion principle directly.
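
To make the gap concrete, here are the two conditions side by side as shell predicates. This is a restatement of the logic described in this section, not the daemon's code: the function names are hypothetical, and the inputs (hook bead, idle minutes, counts) are assumed to be gathered by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical restatement of the GUPP rule and the gap it leaves.

# The daemon rule as described above: flag only when work is hooked
# AND the agent has been quiet for more than 30 minutes.
daemon_flags_violation() {
    local hook_bead="$1" idle_min="$2"
    [ -n "$hook_bead" ] && [ "$idle_min" -gt 30 ]
}

# The gap: nothing hooked anywhere, yet ready work exists.
idle_with_ready_work() {
    local hooked_count="$1" ready_count="$2"
    [ "$hooked_count" -eq 0 ] && [ "$ready_count" -gt 0 ]
}
```

An agent with an empty hook never satisfies the first predicate, however long it sits idle. The second predicate is the case an external nudge is meant to cover.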

Gas Town's Boot, Deacon, and Daemon should handle recovery. If they don't for you, it might be:

  1. A bug (report it upstream) ← Issue #918 documents this exact gap
  2. Environmental (your tmux/shell/machine)
  3. User error (hi, that's me)

Installation

# Clone
git clone https://github.com/24601/gastown-patrol.git
cd gastown-patrol

# Copy scripts (adjust GT_ROOT in each script if not ~/gt)
cp scripts/*.sh ~/gt/scripts/
chmod +x ~/gt/scripts/*.sh

# Create log directory
mkdir -p ~/gt/logs

# Add to crontab
crontab -e

Add these lines:

*/15 * * * * /path/to/gt/scripts/mayor-patrol.sh >> /path/to/gt/logs/mayor-patrol.log 2>&1
*/30 * * * * /path/to/gt/scripts/deadman-switch.sh >> /path/to/gt/logs/deadman.log 2>&1
*/15 * * * * /path/to/gt/scripts/gupp-health-check.sh >> /path/to/gt/logs/gupp-health.log 2>&1

Logs

# Patrol activity
tail -f ~/gt/logs/mayor-patrol.log

# System health
tail -f ~/gt/logs/deadman.log

# Critical alerts only
cat ~/gt/logs/deadman-alert.txt

Docs (Assistant Crib Notes)

The docs/ folder contains notes I created for the agents themselves.

Why Crib Notes?

flowchart LR
    subgraph WITHOUT["❌ WITHOUT CRIB NOTES"]
        direction TB
        W1["🤖 New session starts"]
        W2["🤔 No memory of past work"]
        W3["🔄 Rediscover same things"]
        W4["😤 Repeat same mistakes"]
        W1 --> W2 --> W3 --> W4
    end

    subgraph WITH["✅ WITH CRIB NOTES"]
        direction TB
        C1["🤖 New session starts"]
        C2["📚 Read accumulated notes"]
        C3["💡 Know where to look"]
        C4["🎯 Debug faster, fewer mistakes"]
        C1 --> C2 --> C3 --> C4
    end

    WITHOUT ~~~ WITH

    style WITHOUT fill:#2d0d0d,stroke:#ff4444,stroke-width:3px,color:#fff
    style WITH fill:#0d2d0d,stroke:#44ff44,stroke-width:3px,color:#fff
    style W4 fill:#4a0000,stroke:#ff0000,color:#fff
    style C4 fill:#004a00,stroke:#00ff00,color:#fff

The notes capture:

  • Where things are in the codebase
  • Debug flags and environment variables
  • Common failure modes and their fixes
  • Architectural decisions and why they exist

How to Use

Reference them in your project's CLAUDE.md:

## Gas Town Reference

When debugging Gas Town issues, consult:
**Location:** `~/path/to/gastown-patrol/docs/`

Key files:
- `GASTOWN_COMPREHENSIVE_NOTES.md` - Complete knowledge base
- `CODE_ARCHITECTURE.md` - Code structure deep dive

Now when mayor (or any agent) hits a problem, it can check the notes instead of fumbling through the same discovery process you already did.

Contents

  • GASTOWN_COMPREHENSIVE_NOTES.md - Everything I've learned (patterns, gotchas, debug techniques)
  • CODE_ARCHITECTURE.md - How the code is structured (where to look for what)
  • QUICK_REFERENCE.md - Commands cheat sheet (copy-paste debugging)
  • INDEX.md - Navigation and version highlights

Contributing

If you're hitting similar issues, PRs welcome. If you know why this happens and I'm just doing it wrong, issues very welcome. I'd love to fix the root cause instead of coping around it.

License

MIT. Use at your own risk. This is cope, not a solution.
