This project is a full-stack data engineering and web application designed to analyze and visualize the public transportation network of Budapest (BKK). The system serves a dual purpose:
- Accessibility Analytics: Extracting, processing, and analyzing static and dynamic transit data to calculate the wheelchair accessibility footprint across tram, bus, and metro routes.
- Real-Time Monitoring: Providing an interactive, web-based map that displays live vehicle locations, station data, and the aggregated accessibility analytics via a responsive UI.
The architecture is designed for high-throughput data ingestion, spatial querying, and low-latency client updates.
- Frontend / Client: Next.js, React, Mapbox GL JS (or Leaflet) for interactive mapping.
- Backend API & Services: Python (FastAPI), WebSockets for real-time client communication.
- Data Ingestion & Streaming: Apache Kafka (for handling BKK GTFS-Realtime streams).
- Database: PostgreSQL with the PostGIS extension (for spatial data and station coordinates).
- Infrastructure & Deployment: Docker, Docker Compose (containerizing the backend, database, and message brokers).
- BKK GTFS (Static): Used to map out routes, trips, stops, and base station metadata. Contains the core accessibility flags (e.g., `wheelchair_boarding` in `stops.txt` and `wheelchair_accessible` in `trips.txt`).
- BKK GTFS-Realtime (Dynamic): Consumed to track live vehicle positions, trip updates, and service alerts.
- Extraction: Scheduled cron jobs or Python workers pull the latest static GTFS zip file from the BKK Open Data portal.
- Transformation:
  - Parse relational CSVs (stops, routes, trips).
  - Join `trips` with `routes` to categorize transit types (tram, bus, metro).
  - Normalize spatial coordinates into PostGIS geometry types.
- Loading: Upsert processed data into the PostgreSQL database.
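
The transformation steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual pipeline: the function names, the inline sample rows, and the `stops` table in the SQL string are all assumptions made for the example. The `route_type` codes (0 = tram, 1 = metro, 3 = bus) are the basic values from the GTFS specification.

```python
import csv
import io

# Basic GTFS route_type codes; anything else is grouped as "other" in this sketch.
ROUTE_TYPE_NAMES = {"0": "tram", "1": "metro", "3": "bus"}

# Made-up sample rows standing in for files from the static GTFS zip.
ROUTES_TXT = "route_id,route_short_name,route_type\nR4,4,0\nM2,M2,1\nB7,7,3\n"
TRIPS_TXT = "route_id,trip_id,wheelchair_accessible\nR4,t1,1\nM2,t2,1\nB7,t3,2\n"
STOPS_TXT = (
    "stop_id,stop_name,stop_lat,stop_lon,wheelchair_boarding\n"
    "S1,Example stop,47.4979,19.0402,1\n"
)

# Hypothetical upsert statement; the table and column names are assumptions.
UPSERT_STOP_SQL = """
INSERT INTO stops (stop_id, stop_name, wheelchair_boarding, geom)
VALUES (%s, %s, %s, ST_GeomFromText(%s, 4326))
ON CONFLICT (stop_id) DO UPDATE
SET stop_name = EXCLUDED.stop_name,
    wheelchair_boarding = EXCLUDED.wheelchair_boarding,
    geom = EXCLUDED.geom
"""


def read_gtfs_table(text):
    """Parse the text of one GTFS CSV file into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def join_trips_with_routes(trips, routes):
    """Attach a transit mode to every trip by looking up its route_id."""
    mode_by_route = {
        r["route_id"]: ROUTE_TYPE_NAMES.get(r["route_type"], "other")
        for r in routes
    }
    return [{**t, "mode": mode_by_route.get(t["route_id"], "unknown")} for t in trips]


def stop_to_wkt(stop):
    """Render stop coordinates as WKT; PostGIS expects longitude first."""
    return f"POINT({float(stop['stop_lon'])} {float(stop['stop_lat'])})"


if __name__ == "__main__":
    routes = read_gtfs_table(ROUTES_TXT)
    trips = join_trips_with_routes(read_gtfs_table(TRIPS_TXT), routes)
    for trip in trips:
        print(trip["trip_id"], trip["mode"])
    print(stop_to_wkt(read_gtfs_table(STOPS_TXT)[0]))
```

In a real loader, each stop row and its WKT string would be passed as parameters to a statement like `UPSERT_STOP_SQL` through a PostgreSQL driver. The driver choice is left open here.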
The analysis engine calculates the following key performance indicators (KPIs):
- Global Network Share: Percentage of total active stops equipped for wheelchair boarding.
- Route-Specific Accessibility: Ratio of accessible trips to total trips per route (highlighting routes with partial low-floor vehicle deployment).
- Modal Breakdown: Comparative accessibility scores between the Metro network (mostly accessible via elevators), Trams (mixed, reliant on low-floor CAF/Combino units), and Buses.
- Spatial Coverage: Geographic density of accessible stops, identifying "transit deserts" for users with reduced mobility.
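
As a rough sketch of how the first two KPIs could be computed from parsed GTFS rows, the example below makes a few assumptions. The function names are hypothetical, every stop passed in is treated as active, and a missing or `0` flag (no information in GTFS) is counted as not accessible. Only the value `1` counts as accessible.

```python
from collections import defaultdict


def global_network_share(stops):
    """Percentage of stops whose wheelchair_boarding flag is 1 (accessible)."""
    if not stops:
        return 0.0
    accessible = sum(1 for s in stops if s.get("wheelchair_boarding") == "1")
    return 100.0 * accessible / len(stops)


def route_accessibility(trips):
    """Ratio of accessible trips to total trips, keyed by route_id."""
    totals = defaultdict(int)
    accessible = defaultdict(int)
    for trip in trips:
        totals[trip["route_id"]] += 1
        if trip.get("wheelchair_accessible") == "1":
            accessible[trip["route_id"]] += 1
    return {route_id: accessible[route_id] / totals[route_id] for route_id in totals}


if __name__ == "__main__":
    # Made-up rows for illustration only.
    sample_stops = [
        {"stop_id": "S1", "wheelchair_boarding": "1"},
        {"stop_id": "S2", "wheelchair_boarding": "2"},
        {"stop_id": "S3", "wheelchair_boarding": "1"},
        {"stop_id": "S4", "wheelchair_boarding": "1"},
    ]
    sample_trips = [
        {"route_id": "4", "wheelchair_accessible": "1"},
        {"route_id": "4", "wheelchair_accessible": "2"},
        {"route_id": "M4", "wheelchair_accessible": "1"},
    ]
    print(global_network_share(sample_stops))
    print(route_accessibility(sample_trips))
```

A ratio strictly between 0 and 1 marks a route with partial low-floor deployment. The modal breakdown follows the same pattern if trips are grouped by transit mode instead of `route_id`.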
- Interactive Map Component: Renders the map of Budapest. Includes toggleable layers for Routes, Stops, and Live Vehicles.
- Accessibility Dashboard: A side-panel or modal system displaying the analytical KPIs (charts and percentage metrics) generated in Phase 1.
- Visual Coding: Stations and routes are color-coded based on their accessibility status (e.g., Green for fully accessible, Yellow for partial, Red for non-accessible).
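
One way the color coding might be derived is on the backend, so that the map client only has to read a ready-made color from each feature's properties. The sketch below is an assumption about how "partial" could be defined: any ratio strictly between 0 and 1. The function name and thresholds are hypothetical.

```python
def accessibility_color(ratio):
    """Map an accessibility ratio in [0, 1] to a display color name."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if ratio == 1.0:
        return "green"   # fully accessible
    if ratio > 0.0:
        return "yellow"  # partially accessible
    return "red"         # not accessible
```

For a single stop, the ratio is simply 1.0 or 0.0 depending on its `wheelchair_boarding` flag. For a route, it is the share of accessible trips.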
To ensure the map reflects current transit conditions without overwhelming the BKK API or the client:
- Ingestion: A backend Python worker continuously polls the BKK GTFS-Realtime endpoint and pushes updates to a Kafka topic.
- Processing: A consumer service filters the raw protobuf data, extracting only necessary fields (Vehicle ID, Lat/Lon, Route ID, Delay).
- Broadcasting: The FastAPI backend maintains active WebSocket connections with the Next.js clients. It broadcasts lightweight JSON payloads containing state changes (deltas) rather than the entire vehicle array, minimizing bandwidth.
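
The delta idea in the broadcasting step can be illustrated with plain dictionaries. This is a minimal sketch under assumptions: vehicle snapshots are keyed by vehicle ID, the field names (`lat`, `lon`, `route_id`, `delay`) mirror the fields listed above but are not taken from the actual codebase, and `compute_delta` is a hypothetical helper.

```python
import json


def compute_delta(previous, current):
    """Compare two snapshots keyed by vehicle ID and return only the changes."""
    # A vehicle is "updated" if it is new or any of its fields changed.
    updated = {
        vehicle_id: state
        for vehicle_id, state in current.items()
        if previous.get(vehicle_id) != state
    }
    # A vehicle is "removed" if it disappeared from the feed.
    removed = [vehicle_id for vehicle_id in previous if vehicle_id not in current]
    return {"updated": updated, "removed": removed}


if __name__ == "__main__":
    # Made-up positions for illustration only.
    before = {
        "v1": {"lat": 47.4979, "lon": 19.0402, "route_id": "4", "delay": 0},
        "v2": {"lat": 47.5000, "lon": 19.0500, "route_id": "6", "delay": 30},
    }
    after = {
        "v1": {"lat": 47.4979, "lon": 19.0402, "route_id": "4", "delay": 0},
        "v2": {"lat": 47.5010, "lon": 19.0510, "route_id": "6", "delay": 45},
    }
    # Only v2 changed, so only v2 is serialized for the clients.
    print(json.dumps(compute_delta(before, after)))
```

The serialized delta is what would be sent over each open WebSocket connection. A client that has just connected would still need one full snapshot first, since a delta is only meaningful relative to a known previous state.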
- `GET /api/v1/stats/accessibility`: Returns JSON payload of network-wide accessibility percentages.
- `GET /api/v1/routes/{route_id}`: Returns GeoJSON line-strings for map rendering and specific route accessibility data.
- `GET /api/v1/stops`: Returns GeoJSON points for all stations with `wheelchair_boarding` status.
- `ws://{host}/ws/vehicles`: Subscribes the client to the live vehicle location stream.
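
To make the stops response more concrete, the sketch below builds a GeoJSON `FeatureCollection` from parsed stop rows. The exact property names returned by the real endpoint are not specified in this document, so the ones used here are assumptions, and `stops_to_geojson` is a hypothetical helper. GeoJSON positions are ordered longitude first, then latitude.

```python
import json


def stops_to_geojson(stops):
    """Convert stop rows (dicts with GTFS column names) to a FeatureCollection."""
    features = []
    for stop in stops:
        features.append(
            {
                "type": "Feature",
                "geometry": {
                    "type": "Point",
                    # GeoJSON order is [longitude, latitude].
                    "coordinates": [float(stop["stop_lon"]), float(stop["stop_lat"])],
                },
                "properties": {
                    "stop_id": stop["stop_id"],
                    "stop_name": stop["stop_name"],
                    # An empty or missing flag is treated as 0 (no information).
                    "wheelchair_boarding": int(stop.get("wheelchair_boarding") or 0),
                },
            }
        )
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Made-up stop row for illustration only.
    sample = [
        {
            "stop_id": "S1",
            "stop_name": "Example stop",
            "stop_lat": "47.4979",
            "stop_lon": "19.0402",
            "wheelchair_boarding": "1",
        }
    ]
    print(json.dumps(stops_to_geojson(sample), indent=2))
```

A FastAPI route handler could return this dictionary directly, since FastAPI serializes plain dictionaries to JSON.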
The application runs as a set of isolated containers orchestrated with Docker Compose, which keeps the development and production environments consistent.
- `db-service`: PostgreSQL + PostGIS image.
- `kafka-broker` & `zookeeper`: For managing the real-time message stream.
- `ingestion-worker`: Python container dedicated to fetching and parsing BKK data.
- `api-service`: Python FastAPI container serving REST endpoints and managing WebSocket connections.
- `web-frontend`: Next.js node container serving the application UI.
- Historical Analysis: Storing real-time delay data to analyze punctuality correlations with accessibility (e.g., do low-floor deployments impact dwell times?).
- Routing Engine: Implementing pgRouting to allow users to generate custom A-to-B navigation paths that strictly utilize wheelchair-accessible nodes.
- Alert Integration: Incorporating BKK service alerts to notify users in real time when an elevator at a metro station goes out of service.