O3 EartH

Geospatial Site Suitability Assessment Using Foundation Model Embeddings

O3 EartH uses frozen OlmoEarth embeddings with lightweight classifiers to score infrastructure site suitability — no GPU needed at inference.

Ziming Qi | Northeastern University

How It Works

Sentinel-2 (12 bands, multi-temporal) → OlmoEarth (frozen, 97M params) → 768-dim embedding → XGBoost → Score
  • Multi-temporal: 4 seasonal scenes for solar/wind/hydro, single scene for geothermal
  • Extract once: OlmoEarth converts satellite patches into 768-dim landscape fingerprints
  • Score instantly: XGBoost classifies on CPU in milliseconds
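A minimal sketch of how the per-scene embeddings could be turned into a single classifier input. The scene counts and the 768-dim width come from the list above; the mean pooling step, the `SCENES_PER_TYPE` table, and the `build_feature_vector` helper are assumptions made for illustration, not the repository's confirmed method.

```python
from statistics import fmean

EMBED_DIM = 768  # embedding width stated in the pipeline above

# Scene counts per energy type, as listed above.
SCENES_PER_TYPE = {"solar": 4, "wind": 4, "hydro": 4, "geothermal": 1}


def build_feature_vector(energy_type, scene_embeddings):
    """Collapse per-scene embeddings into one EMBED_DIM-long feature vector.

    ASSUMPTION: scenes are combined by element-wise mean. The actual
    pipeline may fuse the time steps differently (for example inside
    the encoder itself).
    """
    expected = SCENES_PER_TYPE[energy_type]
    if len(scene_embeddings) != expected:
        raise ValueError(
            f"{energy_type} expects {expected} scene(s), got {len(scene_embeddings)}"
        )
    for emb in scene_embeddings:
        if len(emb) != EMBED_DIM:
            raise ValueError(f"expected {EMBED_DIM}-dim embedding, got {len(emb)}")
    # zip(*...) walks the embeddings column by column.
    return [fmean(column) for column in zip(*scene_embeddings)]
```

The resulting vector is the kind of fixed-length input a tree-based classifier such as XGBoost would consume, which is why the expensive encoder only has to run once per location.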

Key Results

| Method | AUC |
|---|---|
| Geography only (lat/lon) | 0.579 |
| OlmoEarth T=1 (single scene) | 0.907 |
| OlmoEarth T=multi (seasonal) | 0.924 |
| Spatial CV (leave-one-country-out, 63 countries) | 0.904 |

| Energy Type | AUC |
|---|---|
| Solar | 0.959 |
| Geothermal | 0.930 |
| Hydro | 0.918 |
| Wind | 0.898 |

8,000 samples across 212 countries, 4 energy types. Full validation in VALIDATION.md.
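To make the metrics above concrete, here is a small dependency-free sketch of the two ideas involved: AUC as a ranking statistic, and leave-one-country-out splitting, where every sample from one country is held out together so the model is always tested on a country it never saw. The function names are hypothetical, and the repository's own evaluation code may differ (see VALIDATION.md for the actual protocol).

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """AUC as the chance a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for illustration but slower than a rank-based version.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_country_out(countries):
    """Yield (held_out_country, train_idx, test_idx) once per distinct country."""
    by_country = defaultdict(list)
    for i, country in enumerate(countries):
        by_country[country].append(i)
    for country, test_idx in by_country.items():
        train_idx = [i for i, c in enumerate(countries) if c != country]
        yield country, train_idx, test_idx
```

An AUC of 0.5 corresponds to random ranking, so the gap between the geography-only baseline and the embedding-based rows is the quantity of interest.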

Platform

Web application with three components:

| Page | What it does |
|---|---|
| AI Chat | NVIDIA NIM LLM with full system knowledge |
| Site Selection | Map → pick location → Factor Engine + ML scores |
| Climate Risk | NASA POWER data + IPCC AR6 SSP projections |

MCP tools available for programmatic access. Details in PLATFORM.md.

Data Sources

| Source | Data | Auth |
|---|---|---|
| NASA POWER | Solar GHI, wind speed, temperature, cloud, precipitation | None |
| Open-Elevation | Terrain slope and gradient | None |
| Open-Meteo Flood | River discharge | None |
| USGS Earthquake | Seismic activity | None |
| EIA API v2 | US power plant data | API key |
| Planetary Computer | Sentinel-2 imagery | None |
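As an illustration of what a keyless request to one of these sources can look like, the sketch below builds a NASA POWER climatology query for a single point. The endpoint and parameter codes follow the public POWER API; which endpoint and parameters the clients in `src/data_clients/` actually use is an assumption here, and `power_url` and `fetch_power` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

POWER_BASE = "https://power.larc.nasa.gov/api/temporal/climatology/point"

# POWER parameter codes: solar GHI, 10 m wind speed, 2 m temperature,
# cloud amount, and bias-corrected precipitation.
POWER_PARAMS = ("ALLSKY_SFC_SW_DWN", "WS10M", "T2M", "CLOUD_AMT", "PRECTOTCORR")


def power_url(lat, lon, parameters=POWER_PARAMS):
    """Build a POWER climatology URL for one latitude/longitude point."""
    if not -90 <= lat <= 90 or not -180 <= lon <= 180:
        raise ValueError("latitude/longitude out of range")
    query = urlencode(
        {
            "parameters": ",".join(parameters),
            "community": "RE",  # renewable-energy community
            "latitude": lat,
            "longitude": lon,
            "format": "JSON",
        },
        safe=",",  # keep the comma-separated parameter list readable
    )
    return f"{POWER_BASE}?{query}"


def fetch_power(lat, lon, timeout=30):
    """Fetch and decode the JSON response (requires network access)."""
    with urlopen(power_url(lat, lon), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `power_url(42.34, -71.09)` only builds the request string, so it can be inspected or logged without touching the network.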

Quick Start

1. Clone

git clone https://github.com/2imi9/O3earth.git
cd O3earth

2. Configure API keys (optional)

cp platform/.env.example platform/.env
# Edit with your keys (optional; Site Selection works without them)

| Variable | Required For | Get One |
|---|---|---|
| EIA_API_KEY | US power plant data | eia.gov/opendata |
| NVIDIA_API_KEY | AI Chat (cloud) | build.nvidia.com |
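The effect of the two optional keys can be summarized with a short check like the one below. It is a sketch only: the variable names come from the table above, while the `available_features` helper and its feature labels are hypothetical and not part of the platform code.

```python
import os


def available_features(env=None):
    """Report which platform features the current environment can support.

    `env` defaults to the process environment; pass a dict to test
    a configuration without exporting anything.
    """
    env = os.environ if env is None else env

    def has(name):
        # Treat missing and blank values the same way.
        return bool(env.get(name, "").strip())

    return {
        "site_selection": True,  # works without any keys
        "climate_risk": True,    # NASA POWER needs no auth
        "us_power_plant_data": has("EIA_API_KEY"),
        "ai_chat": has("NVIDIA_API_KEY"),
    }
```

With an empty environment, only the two keyless features report as available.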

3. Run

Docker (recommended):

cd platform
docker compose up --build

Manual (no Docker):

pip install -r requirements.txt

# Terminal 1 — API
cd platform
uvicorn api.main:app --port 8000

# Terminal 2 — UI
cd platform
streamlit run ui/app.py --server.port 8501

4. Open

Go to localhost:8501. Site Selection and Climate Risk work immediately — no GPU needed. AI Chat requires NVIDIA_API_KEY.

Dataset and pre-trained models on HuggingFace: 2imi9/O3earth

Project Structure

O3earth/
├── src/
│   ├── factors/           # 19 configurable scoring factors
│   ├── scoring/           # Suitability engine
│   ├── data_clients/      # Real-time API clients
│   ├── mcp/               # MCP tools + handlers
│   └── llm/               # NVIDIA NIM
├── platform/
│   ├── api/               # FastAPI backend
│   └── ui/                # Streamlit frontend
├── scripts/               # Data pipeline + training
├── data/                  # Datasets + embeddings
└── results/               # Trained models + metrics

References

  • OlmoEarth — Allen Institute geospatial foundation model
  • TIML (Tseng et al.) — methodological precedent for embedding + classifier approach
  • SatCLIP (Klemmer et al.) — location embeddings from satellite imagery

License

MIT
