A Cloudflare Workers API (Hono) that scrapes Investing.com’s economic calendar and returns structured events.
- Exposes HTTP endpoints via Hono on Cloudflare Workers
- Calls Investing.com (session bootstrap + filtered POST) and paginates through results
- Parses the returned HTML into JSON events
- Cleans non-breaking spaces and nested markup
- Returns a simplified event shape tuned for programmatic use
- Entry: `src/index.ts`
- Routes: `src/routes/*`
  - `health.ts`: `GET /health`
  - `calendar.ts`: `GET /economic-calendar`
- Service: `src/services/investing.ts`
  - Session init, headers, `URLSearchParams`, pagination via `limit_from`, aggregate HTML
- Parser lib: `src/lib/parser.ts`
  - `parseEconomicCalendar(html)`: HTML → `EconomicEvent[]`
  - `stripTags()` / `cleanHtmlLight()` for fast cleaning
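The two cleaners named above might look roughly like the sketch below. This is an illustration, not the code in `src/lib/parser.ts`: the function bodies, the `cellText` helper, and the sample class names (`act`, `fore`) are assumptions, and only non-breaking spaces and `&amp;` are decoded to keep the cleaning light.

```typescript
// Hypothetical sketches; the real implementations in src/lib/parser.ts may differ.

/** Remove every tag, leaving only text content. */
function stripTags(html: string): string {
  return html.replace(/<[^>]*>/g, "");
}

/** Normalize non-breaking spaces, decode &amp;, and collapse whitespace runs. */
function cleanHtmlLight(text: string): string {
  return text
    .replace(/&nbsp;|\u00a0/g, " ")
    .replace(/&amp;/g, "&")
    .replace(/\s+/g, " ")
    .trim();
}

/**
 * Capture a table cell's inner HTML with ([\s\S]*?), then strip and clean it.
 * `cellClass` is assumed to be a plain class name with no regex metacharacters.
 */
function cellText(rowHtml: string, cellClass: string): string {
  const re = new RegExp(
    `<td[^>]*class="[^"]*\\b${cellClass}\\b[^"]*"[^>]*>([\\s\\S]*?)</td>`,
    "i",
  );
  const inner = re.exec(rowHtml)?.[1] ?? "";
  return cleanHtmlLight(stripTags(inner));
}
```

The lazy `([\s\S]*?)` capture stops at the first closing `</td>`, so nested `<span>` or `<a>` markup inside a cell is captured whole and then removed by `stripTags()`.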
Directory snapshot
src/
  index.ts          // mounts routes
  routes/
    health.ts       // /health
    calendar.ts     // /economic-calendar
  services/
    investing.ts    // fetch + pagination (HTML accumulation)
  lib/
    parser.ts       // HTML → events, cleaners
wrangler.toml       // main = src/index.ts
{
  "success": true,
  "count": 96,
  "from_date": "01/12/2024",
  "to_date": "31/01/2025",
  "timezone": "GMT",
  "events": [
    {
      "id": "511639",
      "timestamp": 1733137200,
      "event": "ISM Manufacturing PMI (Nov)",
      "actual": "48.4",
      "forecast": "47.7",
      "previous": "46.5"
    }
  ]
}

Notes:
- `timestamp` is UTC seconds. Prefer `data-event-datetime`; fall back to the day-header epoch.
- `actual`/`forecast`/`previous` may be empty for holidays or if upstream lacks values.
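The timestamp rule in the notes can be sketched as a small resolver. The function name `resolveTimestamp` is hypothetical, and the sketch assumes that `data-event-datetime` has the form `YYYY/MM/DD HH:mm:ss` and is already in UTC; if upstream uses a different format, the regular expression would need adjusting.

```typescript
/**
 * Resolve an event's UTC timestamp in seconds.
 * ASSUMPTION: data-event-datetime looks like "2024/12/02 11:00:00" and is
 * already expressed in UTC.
 */
function resolveTimestamp(
  eventDatetime: string | undefined,
  dayEpoch: number | undefined,
): number | null {
  const m = /^(\d{4})\/(\d{2})\/(\d{2}) (\d{2}):(\d{2}):(\d{2})$/.exec(
    eventDatetime?.trim() ?? "",
  );
  if (m) {
    // Preferred source: the per-event attribute, interpreted as UTC.
    return (
      Date.UTC(
        Number(m[1]), Number(m[2]) - 1, Number(m[3]),
        Number(m[4]), Number(m[5]), Number(m[6]),
      ) / 1000
    );
  }
  // Fallback: the day-header row's epoch (seconds), e.g. for holiday rows.
  return dayEpoch ?? null;
}
```

Under these assumptions, `resolveTimestamp("2024/12/02 11:00:00", undefined)` yields `1733137200`, the value shown in the sample response.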
- `GET /` → metadata
- `GET /health` → `{ status: 'OK', timestamp }`
- `GET /economic-calendar?from_date=DD/MM/YYYY&to_date=DD/MM/YYYY`
  - Filters applied: USA (`country[]=5`), high importance (`importance[]=3`), timezone UTC (`timeZone=0`)
  - Pagination: iterates `limit_from` until last id repeats or no rows
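One way the route could check `from_date` and `to_date` before calling upstream is sketched below. The helper names `parseDmy` and `validateRange` are hypothetical, and treating both parameters as required is an assumption; the returned message would be the body of a validation error response.

```typescript
/**
 * Parse a DD/MM/YYYY string into a UTC Date, or return null if it is
 * malformed or not a real calendar date (e.g. 31/02/2025).
 */
function parseDmy(value: string): Date | null {
  const m = /^(\d{2})\/(\d{2})\/(\d{4})$/.exec(value);
  if (!m) return null;
  const day = Number(m[1]);
  const month = Number(m[2]);
  const year = Number(m[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC rolls invalid dates over, so confirm the parts round-trip.
  const ok =
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return ok ? date : null;
}

/** Return an error message for a validation failure, or null if the range is valid. */
function validateRange(
  fromDate: string | undefined,
  toDate: string | undefined,
): string | null {
  const from = fromDate ? parseDmy(fromDate) : null;
  const to = toDate ? parseDmy(toDate) : null;
  if (!from || !to) return "from_date and to_date must be DD/MM/YYYY";
  if (from.getTime() > to.getTime()) return "from_date must not be after to_date";
  return null;
}
```

For example, `validateRange("01/12/2024", "31/01/2025")` returns `null`, while a swapped range or an ISO-style date returns a message.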
Examples
curl "http://localhost:8787/health"
curl "http://localhost:8787/economic-calendar?from_date=01/12/2024&to_date=31/01/2025"

Prereqs: Node 18+ (or Bun), Cloudflare Wrangler.
npm install
npm run dev
# or
bun run dev

Local URL: http://localhost:8787
npm run deploy
# or
bun run deploy

Wrangler config (`wrangler.toml`) points `main` to `src/index.ts`.
- Routing
  - Add a sub-router in `src/routes/<feature>.ts` and export a `Hono` instance
  - Mount in `src/index.ts` via `app.route('/path', router)`
- Services
  - Keep remote calls and pagination in `src/services`
  - Routes should validate inputs and shape responses only
- Parsing & cleaning
  - Keep HTML parsing in `src/lib/parser.ts`
  - Use `stripTags()` + `cleanHtmlLight()`; avoid heavyweight HTML decoders
  - For table cells, capture `([\s\S]*?)` then strip tags
- Time handling
  - Always output a UTC `timestamp` (seconds)
- Pagination policy
  - Iterate pages with `limit_from`; stop on repeated last id, `rows_num == 0`, or empty HTML; cap pages with `maxPages`
- TypeScript
  - Use simple domain types (e.g., `EconomicEvent`)
  - Guard optional values with `?.`, `??`, and length checks
- Errors
  - Return 400 for validation, 500 for unhandled exceptions
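The pagination policy above could be sketched as follows. This is a minimal illustration rather than the service's actual code: the `Page` shape, the function names, the default `maxPages` of 20, and the assumption that `limit_from` advances by one per page are all hypothetical. The page fetcher is injected so the loop can be exercised without network access.

```typescript
// ASSUMPTION: the service extracts these three values from each upstream
// response; the field names are illustrative.
interface Page {
  html: string;          // HTML fragment containing the table rows
  rowsNum: number;       // upstream rows_num
  lastId: string | null; // id of the last event row on the page
}

/** Decide whether to stop after receiving `page`. */
function shouldStop(page: Page, previousLastId: string | null): boolean {
  if (page.rowsNum === 0) return true;          // upstream reports no rows
  if (page.html.trim() === "") return true;     // empty HTML
  if (page.lastId === null) return true;        // no event row could be identified
  return page.lastId === previousLastId;        // same page served again
}

/** Fetch pages until a stop condition fires or maxPages is reached. */
async function collectHtml(
  fetchPage: (limitFrom: number) => Promise<Page>,
  maxPages = 20,
): Promise<string> {
  let html = "";
  let previousLastId: string | null = null;
  // ASSUMPTION: limit_from is a zero-based page index.
  for (let limitFrom = 0; limitFrom < maxPages; limitFrom++) {
    const page = await fetchPage(limitFrom);
    if (shouldStop(page, previousLastId)) break;
    html += page.html;
    previousLastId = page.lastId;
  }
  return html;
}
```

Keeping the stop decision in a pure function makes each condition easy to test on its own, and the `maxPages` cap bounds the loop even if upstream never signals an end.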
- Add `/economic-calendar/raw` to return upstream HTML for diagnostics (new route + service reuse)
- Add query params (countries, importance) by passing through to the service’s `URLSearchParams`
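The second idea, passing filters through, might be sketched like this. The query parameter names `countries` and `importance` come from the list above, but the comma-separated format, the helper names, and the fallback behaviour are assumptions; the defaults mirror the filters the service currently applies. Date fields and any other upstream form fields are left out of the sketch.

```typescript
/** Split "5,72" into ["5", "72"], keeping only numeric ids. */
function parseIdList(raw: string | undefined, fallback: string[]): string[] {
  const ids = (raw ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => /^\d+$/.test(s));
  return ids.length > 0 ? ids : fallback;
}

/**
 * Build the filter part of the upstream form body.
 * ASSUMPTION: `countries` and `importance` arrive as comma-separated ids.
 */
function buildUpstreamParams(
  query: { countries?: string; importance?: string },
  limitFrom: number,
): URLSearchParams {
  const params = new URLSearchParams();
  // Repeated keys express multi-value filters, e.g. country[]=5&country[]=72.
  for (const id of parseIdList(query.countries, ["5"])) {
    params.append("country[]", id);
  }
  for (const level of parseIdList(query.importance, ["3"])) {
    params.append("importance[]", level);
  }
  params.set("timeZone", "0");
  params.set("limit_from", String(limitFrom));
  return params;
}
```

Falling back to the existing defaults when a parameter is missing or contains no numeric ids keeps current callers of `/economic-calendar` unaffected.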
MIT