Geospatial Site Suitability Assessment Using Foundation Model Embeddings
O3 EartH uses frozen OlmoEarth embeddings with lightweight classifiers to score infrastructure site suitability — no GPU needed at inference.
Ziming Qi | Northeastern University
Sentinel-2 (12 bands, multi-temporal) → OlmoEarth (frozen, 97M params) → 768-dim embedding → XGBoost → Score
- Multi-temporal: 4 seasonal scenes for solar/wind/hydro, single scene for geothermal
- Extract once: OlmoEarth converts satellite patches into 768-dim landscape fingerprints
- Score instantly: XGBoost classifies on CPU in milliseconds
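The extract-once, score-many pattern in the bullets above can be illustrated without any ML dependencies. This is a minimal sketch, not the repository's code: `EmbeddingCache`, `fake_encoder`, and `mean_scorer` are hypothetical names, with `fake_encoder` standing in for the frozen OlmoEarth encoder and `mean_scorer` standing in for the trained XGBoost classifier.

```python
from typing import Callable, Dict, List, Tuple

Embedding = List[float]
Location = Tuple[float, float]  # (latitude, longitude)


class EmbeddingCache:
    """Keeps one embedding per location so the expensive encoder runs once."""

    def __init__(self, encode: Callable[[Location], Embedding]) -> None:
        self._encode = encode
        self._store: Dict[Location, Embedding] = {}
        self.encoder_calls = 0  # counts how often the encoder actually ran

    def get(self, loc: Location) -> Embedding:
        if loc not in self._store:
            self._store[loc] = self._encode(loc)
            self.encoder_calls += 1
        return self._store[loc]


def fake_encoder(loc: Location) -> Embedding:
    """Hypothetical stand-in for the frozen encoder: returns 768 values in [0, 1)."""
    lat, lon = loc
    return [((lat * 31 + lon * 17 + i) % 7) / 7.0 for i in range(768)]


def mean_scorer(embedding: Embedding) -> float:
    """Hypothetical stand-in for the classifier: any cheap function of the embedding."""
    return sum(embedding) / len(embedding)


cache = EmbeddingCache(fake_encoder)
site = (42.34, -71.09)
first = mean_scorer(cache.get(site))   # encoder runs here
second = mean_scorer(cache.get(site))  # served from the cache, encoder not called
```

Because the encoder is frozen, cached embeddings never go stale, which is what allows repeated scoring to stay on the CPU.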
| Method | AUC |
|---|---|
| Geography only (lat/lon) | 0.579 |
| OlmoEarth T=1 (single scene) | 0.907 |
| OlmoEarth T=multi (seasonal) | 0.924 |
| Spatial CV (leave-one-country-out, 63 countries) | 0.904 |
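For readers unfamiliar with the spatial CV row, leave-one-country-out means every fold holds out all samples from one country and trains on the rest, so the model is always tested on a country it has not seen. The sketch below shows only the split logic, under the assumption that each sample carries a country label; the function name is hypothetical, and how per-fold scores are aggregated into the reported AUC is not stated in this README.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_country_out(
    countries: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_country, train_indices, test_indices), one fold per country."""
    by_country: Dict[str, List[int]] = defaultdict(list)
    for idx, country in enumerate(countries):
        by_country[country].append(idx)

    for held_out in sorted(by_country):
        test_idx = by_country[held_out]
        train_idx = sorted(
            i
            for country, indices in by_country.items()
            if country != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx


# Toy example: five samples from three countries give three folds.
sample_countries = ["US", "US", "DE", "BR", "DE"]
folds = list(leave_one_country_out(sample_countries))
```

Splitting by country rather than at random prevents nearby, near-duplicate patches from appearing on both sides of the split.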
| Energy Type | AUC |
|---|---|
| Solar | 0.959 |
| Geothermal | 0.930 |
| Hydro | 0.918 |
| Wind | 0.898 |
8,000 samples across 212 countries, 4 energy types. Full validation in VALIDATION.md.
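As a reminder of what the AUC columns measure: AUC is the probability that a randomly chosen positive site receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it directly from that definition. It is a reference illustration, not the evaluation code used for the tables above, and `roc_auc` is a name chosen for this example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

A value of 0.5 corresponds to random ranking, which puts the 0.579 latitude/longitude baseline only slightly above chance.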
Web application with three components:
| Page | What it does |
|---|---|
| AI Chat | NVIDIA NIM LLM with full system knowledge |
| Site Selection | Map → pick location → Factor Engine + ML scores |
| Climate Risk | NASA POWER data + IPCC AR6 SSP projections |
MCP tools available for programmatic access. Details in PLATFORM.md.
| Source | Data | Auth |
|---|---|---|
| NASA POWER | Solar GHI, wind speed, temperature, cloud, precipitation | None |
| Open-Elevation | Terrain slope and gradient | None |
| Open-Meteo Flood | River discharge | None |
| USGS Earthquake | Seismic activity | None |
| EIA API v2 | US power plant data | API key |
| Planetary Computer | Sentinel-2 imagery | None |
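To show what a keyless request to one of these sources looks like, the sketch below builds a point query against the public NASA POWER climatology endpoint. The endpoint and the parameter codes (`ALLSKY_SFC_SW_DWN` for solar GHI, `WS10M` for 10 m wind speed, `T2M` for 2 m temperature) come from NASA POWER itself; whether the project's own client in `src/data_clients/` uses this endpoint or these parameters is an assumption, and the function names are hypothetical.

```python
import json
from typing import Sequence
from urllib.parse import urlencode
from urllib.request import urlopen

POWER_CLIMATOLOGY_URL = "https://power.larc.nasa.gov/api/temporal/climatology/point"


def power_climatology_url(
    lat: float,
    lon: float,
    parameters: Sequence[str] = ("ALLSKY_SFC_SW_DWN", "WS10M", "T2M"),
) -> str:
    """Build a NASA POWER climatology query URL for a single point."""
    if not -90 <= lat <= 90 or not -180 <= lon <= 180:
        raise ValueError("latitude/longitude out of range")
    query = {
        "parameters": ",".join(parameters),
        "community": "RE",  # POWER's renewable energy community
        "latitude": lat,
        "longitude": lon,
        "format": "JSON",
    }
    # safe="," keeps the comma-separated parameter list readable in the URL
    return f"{POWER_CLIMATOLOGY_URL}?{urlencode(query, safe=',')}"


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Fetch and decode a JSON response (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


url = power_climatology_url(42.34, -71.09)
# data = fetch_json(url)  # uncomment to query the live API
```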
git clone https://github.com/2imi9/O3earth.git
cd O3earth
cp platform/.env.example platform/.env
# Edit with your keys (optional; Site Selection works without them)

| Variable | Required For | Get One |
|---|---|---|
| EIA_API_KEY | US power plant data | eia.gov/opendata |
| NVIDIA_API_KEY | AI Chat (cloud) | build.nvidia.com |
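Since both keys are optional, it can help to check which features a given environment will enable before starting the services. The sketch below encodes the key-to-feature mapping described in this README; `available_features` is a hypothetical helper, not part of the platform code.

```python
import os
from typing import Dict, Mapping


def available_features(env: Mapping[str, str]) -> Dict[str, bool]:
    """Report which features are usable given the configured API keys."""

    def has(name: str) -> bool:
        return bool(env.get(name, "").strip())

    return {
        "Site Selection": True,  # works without any keys
        "Climate Risk": True,    # works without any keys
        "US power plant data": has("EIA_API_KEY"),
        "AI Chat (cloud)": has("NVIDIA_API_KEY"),
    }


if __name__ == "__main__":
    for feature, enabled in available_features(os.environ).items():
        print(f"{feature}: {'enabled' if enabled else 'missing key'}")
```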
Docker (recommended):
cd platform
docker compose up --build

Manual (no Docker):
pip install -r requirements.txt
# Terminal 1: API
cd platform
uvicorn api.main:app --port 8000
# Terminal 2: UI
cd platform
streamlit run ui/app.py --server.port 8501

Go to localhost:8501. Site Selection and Climate Risk work immediately, with no GPU needed. AI Chat requires NVIDIA_API_KEY.
Dataset and pre-trained models on HuggingFace: 2imi9/O3earth
O3earth/
├── src/
│ ├── factors/ # 19 configurable scoring factors
│ ├── scoring/ # Suitability engine
│ ├── data_clients/ # Real-time API clients
│ ├── mcp/ # MCP tools + handlers
│ └── llm/ # NVIDIA NIM
├── platform/
│ ├── api/ # FastAPI backend
│ └── ui/ # Streamlit frontend
├── scripts/ # Data pipeline + training
├── data/ # Datasets + embeddings
└── results/ # Trained models + metrics
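The README does not describe how the configurable factors in `src/factors/` are combined by the suitability engine. One common way to build such an engine is a weighted average of normalized factor scores, sketched below purely as an illustration. The `Factor` class, the factor names, the ranges, and the weights are all hypothetical and should not be read as the project's actual configuration.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class Factor:
    """Hypothetical factor: maps a raw measurement onto a 0..1 score."""

    name: str
    weight: float
    worst: float  # raw value that scores 0
    best: float   # raw value that scores 1 (may be below `worst`)

    def normalize(self, raw: float) -> float:
        fraction = (raw - self.worst) / (self.best - self.worst)
        return min(1.0, max(0.0, fraction))  # clamp out-of-range inputs


def suitability(factors: Sequence[Factor], raw_values: Mapping[str, float]) -> float:
    """Weighted average over the factors that have a raw value available."""
    used = [f for f in factors if f.name in raw_values]
    total_weight = sum(f.weight for f in used)
    if total_weight <= 0:
        raise ValueError("no weighted factors available for this site")
    weighted = sum(f.weight * f.normalize(raw_values[f.name]) for f in used)
    return weighted / total_weight


# Illustrative values only: higher GHI is better, lower slope is better.
example_factors = [
    Factor("solar_ghi", weight=2.0, worst=2.0, best=6.0),
    Factor("slope_deg", weight=1.0, worst=30.0, best=0.0),
]
score = suitability(example_factors, {"solar_ghi": 4.0, "slope_deg": 15.0})
```

Renormalizing by the weights actually used keeps the score on a 0 to 1 scale when a data source is unavailable for a given site.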
- OlmoEarth — Allen Institute geospatial foundation model
- TIML (Tseng et al.) — methodological precedent for embedding + classifier approach
- SatCLIP (Klemmer et al.) — location embeddings from satellite imagery
MIT