Production-ready ETL system for weather data using Open-Meteo API, PostgreSQL, and Streamlit
A complete ETL pipeline that extracts real-time weather data from the Open-Meteo API, transforms it using Polars DataFrames, loads it into PostgreSQL, and provides an interactive Streamlit dashboard for visualization.
┌─────────────────────────────────────────────────────────────────┐
│                      Weather Data Pipeline                      │
└─────────────────────────────────────────────────────────────────┘
  EXTRACT              TRANSFORM             LOAD               VISUALIZE
     │                     │                   │                    │
┌────▼────┐           ┌────▼────┐         ┌────▼────┐         ┌─────▼────┐
│ Open-   │   JSON    │ Polars  │  Batch  │Postgre- │  Query  │Streamlit │
│ Meteo   │──────────►│ Engine  │────────►│ SQL 15  │────────►│Dashboard │
│ API     │           │         │         │         │         │          │
└─────────┘           └─────────┘         └─────────┘         └──────────┘
• Retry logic         • Validation        • Connection pool   • Plotly charts
• Rate limiting       • Type safety       • Idempotent writes • Smart caching
- High Performance: Polars DataFrames process data 5-10x faster than pandas
- Reliable: Automatic retry logic with exponential backoff
- Secure: Parameterized queries prevent SQL injection
- Scalable: Connection pooling supports 100+ cities
- Interactive Dashboard: Three-page Streamlit app with filtering and visualizations
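To illustrate what parameterized, idempotent writes look like, here is a small sketch. It uses Python's built-in `sqlite3` as a stand-in so it runs without a database server; the pipeline itself targets PostgreSQL, where the same `ON CONFLICT ... DO UPDATE` clause applies but the placeholder style depends on the driver (for example `%s` instead of `?`). The table name, columns, and sample rows are made up for the example.

```python
import sqlite3

# Hypothetical schema: one row per (city, observed_at) reading.
SCHEMA = """
CREATE TABLE IF NOT EXISTS weather_readings (
    city          TEXT NOT NULL,
    observed_at   TEXT NOT NULL,
    temperature_c REAL,
    PRIMARY KEY (city, observed_at)
)
"""

# Values are bound through placeholders and never formatted into the SQL text,
# so a hostile city name is stored as plain data instead of being executed.
# ON CONFLICT makes the write idempotent: a re-run updates rows, it does not duplicate them.
UPSERT = """
INSERT INTO weather_readings (city, observed_at, temperature_c)
VALUES (?, ?, ?)
ON CONFLICT (city, observed_at)
DO UPDATE SET temperature_c = excluded.temperature_c
"""


def load_readings(conn: sqlite3.Connection, rows: list[tuple]) -> None:
    """Insert or update a batch of readings in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT, rows)


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Sample values invented for the demonstration.
rows = [
    ("Cairo", "2024-01-01T12:00", 21.5),
    ("London", "2024-01-01T12:00", 7.0),
]
load_readings(conn, rows)
load_readings(conn, rows)  # running the same batch again adds no rows

print(conn.execute("SELECT COUNT(*) FROM weather_readings").fetchone()[0])  # prints 2
```

Keying the conflict target on the natural identity of a reading (city plus timestamp here) is what lets the pipeline be re-run safely after a partial failure.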
- Python 3.11+
- Docker and Docker Compose
- Clone and install dependencies:
git clone <repository-url>
cd weather-data-pipeline
# With uv (recommended)
uv sync
# Or with pip
python -m venv .venv
source .venv/bin/activate
pip install -e .
- Start the database:
docker-compose up -d
- Run the pipeline:
uv run python src/pipeline.py
- Launch the dashboard:
uv run streamlit run dashboard/app.py
Tip: Access the dashboard at http://localhost:8501 and pgAdmin at http://localhost:5050
The pipeline fetches weather data for 5 cities by default:
- Cairo, London, Tokyo, New York, Sydney
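As a rough sketch of how a request for one of these cities might be built, the snippet below assembles an Open-Meteo forecast URL from a city name. The coordinates are approximate values chosen for illustration, and `CITIES`, `CURRENT_FIELDS`, and `build_request_url` are hypothetical names; the project's own city list and field selection may differ.

```python
from urllib.parse import urlencode

# Open-Meteo forecast endpoint; the project reads its base URL from API_BASE_URL.
API_BASE_URL = "https://api.open-meteo.com/v1/forecast"

# Approximate coordinates (latitude, longitude), assumed for this example.
CITIES = {
    "Cairo": (30.04, 31.24),
    "London": (51.51, -0.13),
    "Tokyo": (35.68, 139.69),
    "New York": (40.71, -74.01),
    "Sydney": (-33.87, 151.21),
}

# Current-weather variables matching the metrics the dashboard displays.
CURRENT_FIELDS = [
    "temperature_2m",
    "relative_humidity_2m",
    "wind_speed_10m",
    "precipitation",
]


def build_request_url(city: str) -> str:
    """Return the forecast URL for a known city; raises KeyError otherwise."""
    latitude, longitude = CITIES[city]
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "current": ",".join(CURRENT_FIELDS),
    }
    # safe="," keeps the comma-separated field list readable in the URL.
    return f"{API_BASE_URL}?{urlencode(params, safe=',')}"


print(build_request_url("Cairo"))
```

Adding a city then amounts to adding one entry to the mapping, which is one way a pipeline like this can grow from 5 cities to many more without code changes elsewhere.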
Default values work for local development. Create a .env file if needed:
cp .env.example .env
Key variables:
- DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD: Database connection
- API_BASE_URL: Open-Meteo API endpoint (default provided)
- DASHBOARD_PORT: Streamlit port (default: 8501)
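One plausible way to read these variables with fallbacks is sketched below. Only the 8501 dashboard port comes from this README; every other default, along with the `Settings` and `load_settings` names, is a placeholder assumed for the example, so check `.env.example` for the real values.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    db_password: str
    api_base_url: str
    dashboard_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, falling back to defaults.

    All fallbacks except DASHBOARD_PORT are assumed placeholders.
    """
    return Settings(
        db_host=env.get("DB_HOST", "localhost"),
        db_port=int(env.get("DB_PORT", "5432")),
        db_name=env.get("DB_NAME", "weather"),
        db_user=env.get("DB_USER", "postgres"),
        db_password=env.get("DB_PASSWORD", "postgres"),
        api_base_url=env.get("API_BASE_URL", "https://api.open-meteo.com/v1/forecast"),
        dashboard_port=int(env.get("DASHBOARD_PORT", "8501")),
    )


settings = load_settings({"DB_HOST": "db", "DB_PORT": "6543"})
print(settings.db_host, settings.db_port, settings.dashboard_port)  # prints: db 6543 8501
```

Converting ports to `int` at load time means a malformed value fails immediately at startup rather than later inside a connection call.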
The Streamlit dashboard provides three pages:
| Page | Description |
|---|---|
| Current Conditions | Real-time weather with temperature, humidity, wind, precipitation |
| Historical Trends | Time-series charts over custom date ranges |
| City Comparison | Side-by-side metrics across multiple cities |
Interactive Controls:
- Multi-city selection filter
- Date range picker
- Temperature unit toggle (°C / °F)
- 5-minute automatic data caching
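The unit toggle and the 5-minute cache can be illustrated with plain Python. This is a sketch of the ideas only: Streamlit offers `st.cache_data(ttl=...)` for time-based caching, and whether the dashboard uses it is an assumption. The helper names below (`format_temperature`, `ttl_cache`) are hypothetical.

```python
import functools
import time
from typing import Any, Callable

CACHE_TTL_SECONDS = 5 * 60  # the 5-minute window described above


def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32


def format_temperature(celsius: float, unit: str = "C") -> str:
    """Render a stored Celsius value in the unit selected by the toggle."""
    value = celsius_to_fahrenheit(celsius) if unit == "F" else celsius
    return f"{value:.1f} °{unit}"


def ttl_cache(ttl: float, clock: Callable[[], float] = time.monotonic):
    """Cache results per argument tuple and recompute once they are older than ttl."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        store: dict[tuple, tuple[float, Any]] = {}

        @functools.wraps(func)
        def wrapper(*args: Any) -> Any:
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now, value)
            return value

        return wrapper

    return decorator


print(format_temperature(20.0, "F"))  # prints: 68.0 °F
print(format_temperature(21.5))       # prints: 21.5 °C
```

Storing temperatures in one unit and converting only at display time keeps the database consistent regardless of which unit a viewer picks.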
Detailed guides available in docs/:
- Setup Guide - Installation, configuration, troubleshooting
- Architecture - System design and data flow
- Performance - Benchmarks and optimization strategies
- API Reference - Developer guide for extensions
| Component | Technology |
|---|---|
| Language | Python 3.11+ |
| Data Processing | Polars 0.20+ |
| Database | PostgreSQL 15 |
| Dashboard | Streamlit 1.35+ |
| Visualization | Plotly |
| Containerization | Docker Compose |
Database connection failed
docker-compose ps # Check container status
docker-compose logs db # View error logs
Pipeline fails with API errors
curl -I https://api.open-meteo.com # Check connectivity
Dashboard shows no data
uv run python src/pipeline.py # Run pipeline first
Built by Eslam Mohamed