The live wire for Tampa Bay.
A unified guide to live music, festivals, food, and family fun across the Tampa Bay area — Tampa, St. Petersburg, Clearwater, Brandon, Bradenton, Safety Harbor, Dunedin, and an Other catch-all for edge cases. Listings are aggregated daily from multiple sources, deduplicated, and ranked for readability. Curated Places (beaches, venues, food, and similar) ship on /places via a separate discovery pipeline. Lives at baywire.app.
Baywire runs 18 event adapters (src/lib/scrapers/index.ts). The daily matrix includes only sources that are enabled in the database and resolve through the adapter registry (scripts/ci/scrape-matrix.ts). Each job tries structured data first (JSON-LD, Tribe REST, Ticketmaster Discovery API, iCal) and falls back to OpenAI extraction when needed; venue rows are upserted into Places as events are processed. Taxonomy (config/taxonomy.json) drives tags, vibes, discovery verticals, and editorial re-classification — see ARCHITECTURE.md for the full ingestion design.
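The structured-first behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `RawEvent`, `Extractor`, and `extractStructuredFirst` are hypothetical names, and the real adapters in `src/lib/scrapers/` may be organized quite differently.

```typescript
// Hypothetical shapes for illustration; none of these names come from the Baywire codebase.
interface RawEvent {
  title: string;
  startsAt: string; // ISO 8601 timestamp
  url: string;
}

type Extractor = (html: string) => Promise<RawEvent[]>;

interface ExtractionResult {
  events: RawEvent[];
  via: string; // name of the extractor that produced the events
}

// Try each structured extractor in order. The LLM fallback runs only when
// every structured extractor returns nothing or throws.
async function extractStructuredFirst(
  html: string,
  structured: Array<{ name: string; run: Extractor }>,
  llmFallback: Extractor,
): Promise<ExtractionResult> {
  for (const { name, run } of structured) {
    try {
      const events = await run(html);
      if (events.length > 0) return { events, via: name };
    } catch {
      // A failing structured parser is not fatal; move on to the next one.
    }
  }
  return { events: await llmFallback(html), via: "llm" };
}
```

Ordering the cheap parsers first means the paid extraction call is only reached for sources that expose no usable structured data.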
Vercel hosts the read-only Next.js app — scrapes, discovery, and backfill run on GitHub Actions, not on Vercel.
taxonomy.json ──publish──▶ DB taxonomy + backfill_jobs
                                  │
GHA scrape (daily) ───────────────┼──▶ events / canonical_events
GHA discover (weekly) ────────────┼──▶ places
GHA backfill (every 6h) ──────────┘ (classify, sanitize, new profiles)
                                  │
                                  ▼
                    Prisma Postgres + Accelerate
                                  │
                                  ▼
                     Vercel Next.js (read-only)
- Next.js 16 (App Router, React 19, RSC by default) + Serwist (`@serwist/turbopack`) for the PWA shell
- `proxy.ts`: anonymous guest profile cookie bootstrap
- Tailwind CSS v4 + custom coastal palette
- Prisma ORM + Prisma Postgres + Prisma Accelerate; URL in `prisma.config.ts` for Prisma 7 CLI
- OpenAI `gpt-4.1-mini` with Zod-typed structured outputs (`OPENAI_BASE_URL` for compatible proxies)
- Google Places API (New) + Vercel Blob for place discovery imagery
- Stytch for SMS sign-in (optional locally)
- `cheerio`, `p-limit`, Playwright for browser/WAF adapters
- GitHub Actions: scrape matrix, places discovery, taxonomy backfill, cleanup
| Slug | Site | Path | Notes |
|---|---|---|---|
| `eventbrite` | eventbrite.com | JSON-LD | Geo-search across metro cities, 2 pages each |
| `ticketmaster` | ticketmaster.com/discover/tampa | Discovery API | DMA 635 (Tampa-St. Pete-Sarasota) |
| `visit_tampa_bay` | visittampabay.com/events | JSON-LD | Official tourism |
| `visit_st_pete_clearwater` | visitstpeteclearwater.com | JSON-LD | /events + /events-festivals |
| `tampa_gov` | tampa.gov/calendar | JSON-LD + ICS | City calendar |
| `ilovetheburg` | ilovetheburg.com | Tribe REST API | St. Pete blog |
| `thats_so_tampa` | thatssotampa.com | Tribe REST API | Tampa-side blog |
| `tampa_bay_times` | tampabay.com/things-to-do | HTML + LLM | Editorial weekend picks |
| `tampa_bay_markets` | tampabaymarkets.com | HTML + LLM | Farmers' markets |
| `safety_harbor` | cityofsafetyharbor.com | RSS + LLM | CivicPlus feed |
| `side_splitters` | sidesplitterscomedy.com | HTML + LLM | Comedy club |
| `dont_tell_comedy` | donttellcomedy.com | HTML + LLM | Pop-up comedy |
| `funny_bone_tampa` | tampa.funnybone.com | HTML + LLM | DataDome; optional cookie secret in CI |
| `straz_center` | strazcenter.org | HTML + LLM | Playwright / Incapsula |
| `tampa_theatre` | tampatheatre.org | HTML + LLM | Live events + detail pages |
Browser-powered sources: dunedin_gov, unation, feverup, straz_center, funny_bone_tampa, visit_tampa_bay (listing) — see ARCHITECTURE.md.
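The sources above feed the daily scrape matrix, which includes only sources that are enabled in the database and resolve through the adapter registry. A minimal sketch of that filter, assuming a hypothetical `SourceRow` shape and a registry modelled as a set of slugs (the real logic lives in `scripts/ci/scrape-matrix.ts` and may differ):

```typescript
// Hypothetical row shape; the actual database columns are not shown in this README.
interface SourceRow {
  slug: string;
  enabled: boolean;
}

// GitHub Actions accepts a matrix object with an `include` list.
// The `source` key name is an assumption for this sketch.
interface ScrapeMatrix {
  include: Array<{ source: string }>;
}

// Keep a source only if it is enabled in the DB AND an adapter is registered for its slug.
function buildScrapeMatrix(rows: SourceRow[], registry: ReadonlySet<string>): ScrapeMatrix {
  const include = rows
    .filter((row) => row.enabled && registry.has(row.slug))
    .map((row) => ({ source: row.slug }));
  return { include };
}
```

Printing `JSON.stringify(buildScrapeMatrix(rows, registry))` would yield a value a workflow could plausibly read with `fromJSON`, so a disabled or unregistered source never gets a job.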
npm install
cp .env.example .env.local
# Required: DATABASE_URL, OPENAI_API_KEY
# Optional: TICKETMASTER_API_KEY, GOOGLE_MAPS_API_KEY, BLOB_READ_WRITE_TOKEN, STYTCH_*, CRON_SECRET
npm run db:migrate:dev # or db:push for a quick schema sync
npm run ingestion:taxonomy-publish # seed taxonomy tables from config/taxonomy.json
npm run ingestion:scrape # full scrape (or: npm run ingestion:scrape -- eventbrite)
# Optional: ASYNC_CLASSIFY=1 to enqueue editorial instead of running inline (CI scrape uses this)
npm run dev

Open http://localhost:3000.
- Create a database at console.prisma.io.
- Set `DATABASE_URL` to the `prisma+postgres://accelerate...` URL.
- `npm run db:migrate:dev` (or `db:push` in early dev).
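Before running migrations or a scrape, it can help to confirm that the environment matches the setup steps above. The helper below is a hypothetical sketch, not part of the repository; it checks the two required variables and flags a `DATABASE_URL` that does not use the `prisma+postgres://` scheme. Whether the project accepts other schemes is not stated here, so the scheme check is phrased as a warning.

```typescript
// Required variables as listed in the quick start.
const REQUIRED_ENV = ["DATABASE_URL", "OPENAI_API_KEY"] as const;

// Returns a list of human-readable problems; an empty list means the basics look fine.
function checkEnv(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  for (const name of REQUIRED_ENV) {
    if (!env[name]) problems.push(`${name} is not set`);
  }
  const dbUrl = env.DATABASE_URL;
  if (dbUrl && !dbUrl.startsWith("prisma+postgres://")) {
    // Assumption: the Accelerate URL is expected; other schemes may still work locally.
    problems.push("warning: DATABASE_URL does not start with prisma+postgres://");
  }
  return problems;
}

// In a Node script this would be called as checkEnv(process.env).
```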
| Command | What it does |
|---|---|
| `npm run dev` | Next.js dev server |
| `npm run build` | Production build (postinstall runs `prisma generate`) |
| `npm run typecheck` | `tsc --noEmit` |
| `npm run lint` | ESLint |
| `npm run db:migrate:dev` | Create/apply a dev migration |
| `npm run db:migrate` | Apply migrations (production) |
| `npm run db:studio` | Prisma Studio |
| `npm run ingestion:scrape [-- <slug>]` | Event scrape (one source or all enabled) |
| `npm run ingestion:discover` | Google Places discovery (`--help`) |
| `npm run ingestion:suggest` | Suggest a place or event (`--help`) |
| `npm run ingestion:refresh` | Re-verify existing discovery places |
| `npm run ingestion:backfill` | Drain backfill queue (`--limit`, `--kind`) |
| `npm run ingestion:taxonomy-publish` | Publish `config/taxonomy.json` → DB + enqueue diff jobs |
| `npm run ingestion:cleanup` | Delete stale events / places (`--skip-places`) |
| `npm run ingestion:matrix` | Emit GHA scrape matrix JSON (CI only) |
| `npm run ops:blob-purge` | Purge Vercel Blob prefix (`--execute` to delete) |
Edit config/taxonomy.json — terms, aliases, discovery profiles, prompt bundles, rankingGuides. Bump version (and promptRevision when prompts change), then:
npm run ingestion:taxonomy-publish
npm run ingestion:backfill -- --limit 500

- Inline classify (default): scrape/discover/refresh score rows before exit; rankings only show classified places.
- `ASYNC_CLASSIFY=1`: enqueue `classify_*` jobs instead (scheduled scrape workflow; drain with `ingestion:backfill`).
Details: ARCHITECTURE.md — Taxonomy and Classification fingerprints.
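As a rough illustration of why bumping `version` or `promptRevision` leads to re-classification, a fingerprint can be modelled as a combination of the content hash, the taxonomy version, and the prompt revision. The field names and string types below are assumptions made for this sketch; the actual fingerprint is defined in the ingestion kernel.

```typescript
// Hypothetical input shape; field names and types are assumptions.
interface FingerprintInput {
  contentHash: string; // hash of the upstream payload, computed elsewhere
  taxonomyVersion: string; // `version` from config/taxonomy.json
  promptRevision: string; // `promptRevision` from config/taxonomy.json
}

// JSON-encode the tuple so that no separator character can cause two
// different inputs to produce the same fingerprint.
function classificationFingerprint(input: FingerprintInput): string {
  return JSON.stringify([input.contentHash, input.taxonomyVersion, input.promptRevision]);
}

// A row needs (re-)classification when it has no stored fingerprint
// or when any of the three components has changed.
function needsClassification(stored: string | null, input: FingerprintInput): boolean {
  return stored !== classificationFingerprint(input);
}
```

Under this model an unchanged event with a bumped taxonomy version gets a new fingerprint, so it is re-classified without fetching the source again.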
Production splits Vercel (HTTP) from GitHub Actions (scheduled writes).
- Import repo; set `DATABASE_URL` (and optional `CRON_SECRET`, Stytch, Blob, Google keys).
- No Vercel crons for scrapes: `vercel.json` is empty.
See .github/workflows/README.md for the full index.
| Workflow | Schedule (UTC) | Command |
|---|---|---|
| ingestion-scrape.yml | Daily 12:00 | ingestion:scrape (matrix) |
| ingestion-discover.yml | Sun 08:00 | ingestion:discover |
| ingestion-suggest.yml | Manual | ingestion:suggest |
| ingestion-backfill.yml | Every 6h | ingestion:backfill |
| ingestion-taxonomy-publish.yml | Push config/taxonomy.json → main | ingestion:taxonomy-publish |
| ingestion-cleanup.yml | Sun 09:00 | ingestion:cleanup |
Scrape secrets: DATABASE_URL, OPENAI_API_KEY; optional OPENAI_BASE_URL, OPENAI_EXTRACT_MODEL, TICKETMASTER_API_KEY, FUNNYBONE_SCRAPE_COOKIE.
Discover secrets: add GOOGLE_MAPS_API_KEY, BLOB_READ_WRITE_TOKEN.
Suggest secrets: same as discover (DB, OpenAI, Google Maps, Blob). Run manually via workflow_dispatch.
Backfill secrets: DATABASE_URL, OPENAI_API_KEY (for classify jobs).
- `workflow_dispatch` on scrape workflow with optional `source` slug.
- `POST /api/cron/scrape` with `Authorization: Bearer $CRON_SECRET` (202 + background `after`).
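The bearer check on the cron endpoint can be sketched as a pure function. This is an illustration only: `isAuthorizedCron` and `cronResponseStatus` are hypothetical names, and the 401 status for a rejected request is an assumption. In a real route handler, the accepted request would presumably return 202 immediately and schedule the scrape with Next.js `after`.

```typescript
// Returns true only when a secret is configured and the header matches exactly.
function isAuthorizedCron(authorization: string | null, cronSecret: string | undefined): boolean {
  // Fail closed: with no secret configured, nobody is authorized.
  if (!cronSecret) return false;
  // Plain comparison keeps the sketch short; a production handler might
  // prefer a constant-time comparison.
  return authorization === `Bearer ${cronSecret}`;
}

// 202 Accepted for an authorized trigger; 401 otherwise (assumed status).
function cronResponseStatus(authorization: string | null, cronSecret: string | undefined): 202 | 401 {
  return isAuthorizedCron(authorization, cronSecret) ? 202 : 401;
}
```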
ARCHITECTURE.md          Ingestion, taxonomy, pipelines (this doc)
config/taxonomy.json     Taxonomy draft (publish to DB)
proxy.ts                 Guest profile cookie bootstrap
src/
  app/                   Next.js routes, UI, metrics, cron API
  ingestion/             Taxonomy, backfill queue, pipeline entrypoints, adapters
    taxonomy/            Snapshot, publish, diff, sanitize, validate
    queue/               enqueue + process backfill jobs
    pipelines/           events/scrape, places/discover|refresh, maintenance
    adapters/            resolveAdapter (tribe, jsonld, custom, …)
    kernel/              classification fingerprint
  lib/
    pipeline/            Scrape, canonicalize, editorial orchestration
    scrapers/            Per-source adapters
    extract/             OpenAI extraction + editorial
    db/                  Prisma client + queries (+ queriesTaxonomy)
    places/              Google Places + discovery helpers
prisma/schema.prisma     baywire schema
.github/workflows/       Scheduled ingestion — see .github/workflows/README.md
.github/actions/         Composite steps (install, playwright-chromium)
scripts/                 CLI entrypoints — see scripts/README.md
  ingestion/             scrape, discover, refresh, backfill, taxonomy-publish
  maintenance/           cleanup-expired
  ci/                    scrape-matrix (GHA)
  ops/                   blob utilities
  _lib/                  shared runCli + arg helpers
- Per-host pacing ~1 req / 1.1s; extraction concurrency 4.
- Structured-first adapters avoid LLM calls when JSON-LD/ICS/API succeeds.
- Content-hash skips re-extraction when upstream payload unchanged.
- Classification fingerprint skips editorial only when content + taxonomy version + prompt revision match — bumping taxonomy version triggers re-classify via backfill (not a full re-scrape).
- Reduced HTML capped at 16k chars before `gpt-4.1-mini`.
- Read helpers use Accelerate `cacheStrategy` where noted in query modules.
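The per-host pacing in the list above (roughly one request every 1.1 s) could be implemented in several ways. The sketch below shows one simple approach, a per-host slot reservation. `HostPacer` is a hypothetical class written for illustration and does not describe the project's actual scheduler; time is passed in explicitly so the logic stays deterministic.

```typescript
// Reserves request slots per host so that consecutive requests to the
// same host are spaced at least `intervalMs` apart.
class HostPacer {
  private readonly intervalMs: number;
  private readonly nextSlot = new Map<string, number>();

  constructor(intervalMs: number = 1100) {
    this.intervalMs = intervalMs;
  }

  // Returns how many milliseconds the caller should wait before
  // requesting `host`, and books the slot after it.
  reserve(host: string, nowMs: number): number {
    const slot = Math.max(nowMs, this.nextSlot.get(host) ?? 0);
    this.nextSlot.set(host, slot + this.intervalMs);
    return slot - nowMs;
  }
}
```

A caller would sleep for the returned delay before each fetch, for example `await new Promise((resolve) => setTimeout(resolve, pacer.reserve(host, Date.now())))`. Hosts are tracked independently, so a slow source does not hold back the others.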
See CONTRIBUTING.md for setup, conventions, and how to open a PR.
Baywire's source code is MIT licensed. Event and place data comes from third-party sources and is not necessarily redistributable under that license. The Baywire name and branding are not covered by the MIT license.
This project respects each source's robots.txt and only fetches public listing pages. Event cards link to originals; the footer lists enabled sources from the database. For removal requests, disable the adapter or open an issue.