24 changes: 23 additions & 1 deletion README.md
@@ -6,6 +6,8 @@ This repository uses [Docusaurus](https://docusaurus.io/) to publish the documen

- `blog/`: Entries for the Overture engineering blog available at docs.overturemaps.org/blog
- `community/`: The community page that showcases Overture data being used in the wild.
  - `community-projects.json` - source data for all community project cards
  - `og-image-cache.json` - cached `og:image` URLs for entries without an explicit `image` field (see [OG Image Cache](#og-image-cache) below)
- `docs/`: The main documentation pages available at docs.overturemaps.org/. The sidebar for these pages is manually curated in the `sidebars.js` file.
- `release-blog/`: Release notes for every Overture data release. The latest release is always available at <https://docs.overturemaps.org/release/latest/>
- Notice there is no `schema reference` folder. See below.
@@ -41,11 +43,31 @@ Now navigate to <http://localhost:3000> to see the live preview.
- `npm run build` - Build the production site (also shows locale/translation warnings and broken link checks)
- `npm run serve` - Serve the built site locally
- `npm run deploy` - Deploy the site
- `npm run clear` - Clear the Docusaurus cache
- `npm run fetch-og` - Fetch and cache `og:image` metadata for community project entries (see [OG Image Cache](#og-image-cache) below)
- `npm run swizzle` - Customize Docusaurus components by "ejecting" them for modification
- `npm run write-translations` - Generate translation files for internationalization
- `npm run write-heading-ids` - Auto-generate heading IDs for better linking

## OG Image Cache

The community page displays project cards with images. Each entry in `community/community-projects.json` can include an optional `"image"` field. For entries without one, the site falls back to a cached `og:image` fetched from the project's URL.

The cache lives in `community/og-image-cache.json` and is committed to the repository so CI builds never make external HTTP requests.

**When to run it:** after adding or updating entries in `community-projects.json`.

```shell
npm run fetch-og
```

The script (`scripts/fetch-og-images.mjs`):
1. Skips entries that already have an explicit `"image"` field
2. Re-validates previously cached non-empty URLs with a HEAD request (the response must be OK with `Content-Type: image/*`) and resets invalid ones to an empty string
3. Fetches the HTML for entries that have no key in the cache, extracts `og:image`, and validates the URL before writing it to the cache; pages without a usable `og:image` are cached as an empty string and are not retried
4. Is idempotent - safe to re-run at any time
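
The first three steps amount to sorting entries into groups. A minimal sketch of that partitioning, assuming entries shaped like `{ url, image? }` and a cache object keyed by URL; the helper name `classifyEntries` and the `example` URLs are hypothetical and not part of the script:

```javascript
// Hypothetical helper: split entries by what the fetch script would do with them.
function classifyEntries(entries, cache) {
  const skipped = [];    // explicit "image" field: never touched
  const toValidate = []; // non-empty cached URL: re-checked via HEAD
  const toFetch = [];    // URL has no key in the cache: HTML is fetched
  const knownEmpty = []; // cached as "": left alone on later runs

  for (const entry of entries) {
    if (entry.image) {
      skipped.push(entry.url);
    } else if (cache[entry.url]) {
      toValidate.push(entry.url);
    } else if (!Object.hasOwn(cache, entry.url)) {
      toFetch.push(entry.url);
    } else {
      knownEmpty.push(entry.url);
    }
  }
  return { skipped, toValidate, toFetch, knownEmpty };
}

const groups = classifyEntries(
  [
    { url: 'https://a.example', image: '/img/a.png' },
    { url: 'https://b.example' },
    { url: 'https://c.example' },
    { url: 'https://d.example' },
  ],
  { 'https://b.example': 'https://b.example/og.png', 'https://c.example': '' },
);
console.log(groups);
```

One consequence of this grouping: an entry cached as an empty string is never fetched again, so retrying it would mean deleting its key from `og-image-cache.json` before re-running `npm run fetch-og`.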

Cards with no image (neither explicit nor cached) display a branded gradient placeholder.
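
The fallback order for a card image can be sketched as a small lookup. This is an illustration only: the component that renders the community page is not part of this change, and the function name `resolveCardImage` is hypothetical.

```javascript
// Hypothetical lookup mirroring the fallback order described above.
function resolveCardImage(entry, cache) {
  if (entry.image) return entry.image; // 1. explicit "image" field wins
  const cached = cache[entry.url];
  if (cached) return cached;           // 2. cached og:image, if non-empty
  return null;                         // 3. caller renders the gradient placeholder
}

const cache = { 'https://b.example': 'https://b.example/og.png', 'https://c.example': '' };
console.log(resolveCardImage({ url: 'https://a.example', image: '/img/a.png' }, cache));
console.log(resolveCardImage({ url: 'https://b.example' }, cache));
console.log(resolveCardImage({ url: 'https://c.example' }, cache));
```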

## LLM-Friendly Content

Each production build generates [llmstxt.org](https://llmstxt.org)-standard files for use with LLMs and AI tools:
42 changes: 42 additions & 0 deletions community/og-image-cache.json
@@ -0,0 +1,42 @@
{
"https://city2graph.net/examples/morphological_graph_from_overturemaps.html": "",
"https://docs.fused.io/blog/overture-tiles/": "https://fused-magic.s3.us-west-2.amazonaws.com/blog-assets/social_jennings.png",
"https://tech.marksblogg.com/asian-building-footprints-from-google-maps.html": "",
"https://www.crunchydata.com/blog/postgis-meets-duckdb-crunchy-bridge-for-analytics-goes-spatial": "https://imagedelivery.net/lPM0ntuwQfh8VQgJRu0mFg/5c4288f4-e78e-4a78-d288-17bbe1902300/public",
"https://chatgpt.com/g/g-onSLtzQQB-overture-maps-gpt": "",
"https://github.com/Krizz/fetch_overture": "https://opengraph.githubassets.com/737d862ed4e51eb12aadd77180cbd831171b2aac59a1858dd8d6bc1f39ce0667/Krizz/fetch_overture",
"https://overture-maps-docs.vercel.app/zh-Hant": "",
"https://whosonfirst.org/blog/2024/08/16/dedupe/": "https://www.whosonfirst.org/blog/2024/08/16/dedupe/images/219609_e312862475b94323_b.jpg",
"https://github.com/arthurgailes/overtureR": "https://opengraph.githubassets.com/2be9ce4e69cf7ff1541efa31dfbb438e1e631890a4ee47ddd431adac3b099bf3/arthurgailes/overtureR",
"https://supabase.com/blog/postgis-generate-vector-tiles": "https://supabase.com/images/blog/postgis_vector_tiles/overture_postgis_mvt.png",
"https://www.dbreunig.com/2024/06/25/using-duckdb-spatial-joins-to-map-overture-gers-ids-to-us-census-fips-codes.html": "https://www.dbreunig.com/img/denver_building_og.jpg",
"https://github.com/denironyx/overturemapsr": "https://opengraph.githubassets.com/4192b579bd027d1997292924c496b2183ab5367a828d3aab0809f163d079d41e/denironyx/overturemapsr",
"https://walker-data.com/posts/overture-buildings/": "",
"https://www.openstreetmap.org/user/Kshitijraj%20Sharma/diary": "",
"https://developmentseed.org/lonboard/latest/examples/overture-maps/": "https://developmentseed.org/lonboard/latest/assets/images/social/examples/overture-maps.png",
"https://carto.com/blog/overture-maps-data-now-on-the-cloud-use-it-with-carto": "https://carto.com/cdn.prod.website-files.com/63483ad423421bd16e7a7ae7/662fd131b47e1840271ad569_Overture%20Maps%20data%20now%20on%20the%20cloud%20how%20to%20use%20it%20with%20CARTO.webp",
"https://wherobots.com/overture-maps-data-cloud-native-geoparquet-apache-sedona/": "https://wherobots.com/wp-content/uploads/2024/04/Screenshot-2024-01-22-at-1.20.58PM.jpg",
"https://pypi.org/project/overturemapsdownloader/": "",
"https://docs.fused.io/basics/tutorials/overture/": "",
"https://community.esri.com/t5/arcgis-data-interoperability-blog/go-cloud-native-overture-geoparquet-from-object/ba-p/1371965": "",
"https://python.plainenglish.io/downloading-overture-map-foundations-buildings-data-using-apache-sedona-with-docker-python-and-473f5175f241": "",
"https://tech.marksblogg.com/tokyo-walking-tour-guide.html": "",
"https://msbarry.github.io/planetiler-overture-demo/#13.99/42.35625/-71.06989": "",
"https://engineering.tomtom.com/overture-transportation-network-linear-referencing/": "",
"https://www.spatialnode.net/articles/how-to-query-overture-maps-foundation-data-in-arcgis-pro-with-duck-dbc094f9": "https://storage.googleapis.com/spatialnodefiles/article_covers/7f690560-9989-4b73-a895-7a6b66c4d84fgroup35701.jpg",
"https://tech.marksblogg.com/overture-gis-data.html": "",
"https://www.esri.com/arcgis-blog/products/arcgis-online/mapping/enriching-overture-data-with-gers/": "",
"https://github.com/bdon/overture-tiles": "https://opengraph.githubassets.com/d81965506e1f642724e79393fcce9866185a4343922ea38910cbcdbf5af1c956/OvertureMaps/overture-tiles",
"https://www.openstreetmap.org/user/mikelmaron/diary/402600": "https://www.openstreetmap.org/assets/osm_logo_256-ed028f90468224a272961c380ecee0cfb73b8048b34f4b4b204b7f0d1097875d.png",
"https://community.esri.com/t5/geoanalytics-engine-blog/using-overture-maps-data-in-geoanalytics-engine/ba-p/1341493": "",
"https://open.gishub.org/open-buildings/": "",
"https://medium.com/@singh.tanya3298/lets-explore-overture-maps-3209c25d6c97": "",
"https://lyonwj.com/blog/importing-overture-maps-neo4j-aws-athena-spatial-sql-query": "https://lyonwj.com/static/images/overture-graph/import4.png",
"https://shi-works.github.io/Overture-Maps-Data-for-GIS/#16.18/35.680945/139.767552/-12.7/60": "",
"https://observablehq.com/d/9847c08c46f56ed6": "https://static.observableusercontent.com/thumbnail/93483d37715016640ac96554c8483a7b904059fe3f626729efb9e9d80603365b.jpg",
"https://medium.com/@dr.jiayu/harnessing-overture-maps-data-apache-sedonas-journey-from-parquet-to-geoparquet-d99f7767a499": "",
"https://www.postholer.com/articles/Overature-Cheat-Sheet": "",
"https://feyeandal.me/blog/access_overture_data_using_athena": "",
"https://til.simonwillison.net/overture-maps/overture-maps-parquet": "https://s3.amazonaws.com/til.simonwillison.net/41a6a07bd194e630fb59d653871c103a.jpg",
"https://beta.source.coop/repositories/cholmes/overture/description/": ""
}
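
Many keys in this cache map to an empty string, meaning no usable `og:image` was found for that page and the card falls back to the placeholder. A small sketch for listing those URLs, for example to decide which entries deserve an explicit `"image"` field; the helper name `summarizeCache` is hypothetical and the sample object reuses two keys from the file above:

```javascript
// Hypothetical helper: report which cached URLs have no og:image.
function summarizeCache(cache) {
  const urls = Object.keys(cache);
  const missing = urls.filter((url) => cache[url] === '');
  return {
    total: urls.length,
    withImage: urls.length - missing.length,
    missing,
  };
}

// Sample input: two keys taken from og-image-cache.json.
const sample = {
  'https://supabase.com/blog/postgis-generate-vector-tiles':
    'https://supabase.com/images/blog/postgis_vector_tiles/overture_postgis_mvt.png',
  'https://walker-data.com/posts/overture-buildings/': '',
};
const summary = summarizeCache(sample);
console.log(summary);
```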
1 change: 1 addition & 0 deletions package.json
@@ -3,6 +3,7 @@
"version": "0.2.1",
"private": true,
"scripts": {
"fetch-og": "node scripts/fetch-og-images.mjs",
"docusaurus": "docusaurus",
"start": "npm run docusaurus start",
"build": "npm run docusaurus build",
120 changes: 120 additions & 0 deletions scripts/fetch-og-images.mjs
@@ -0,0 +1,120 @@
#!/usr/bin/env node
/**
 * Fetches og:image metadata for community project entries that have no
 * explicit `image` field and writes the results to community/og-image-cache.json.
 *
 * Run manually after adding new entries:
 *   npm run fetch-og
 *
 * The cache file is committed so CI builds never need to hit external URLs.
 */

import { readFileSync, writeFileSync } from 'node:fs';
import { resolve, dirname } from 'node:path';
import { fileURLToPath } from 'node:url';

const __dirname = dirname(fileURLToPath(import.meta.url));
const root = resolve(__dirname, '..');
const entriesPath = resolve(root, 'community/community-projects.json');
const cachePath = resolve(root, 'community/og-image-cache.json');

const FETCH_TIMEOUT_MS = 10_000;
const DELAY_BETWEEN_REQUESTS_MS = 300;

const entries = JSON.parse(readFileSync(entriesPath, 'utf8'));

let cache = {};
try {
  cache = JSON.parse(readFileSync(cachePath, 'utf8'));
} catch {
  // cache doesn't exist yet (or could not be parsed): start fresh
}

// Re-validate previously cached non-empty values + fetch missing ones
const needsValidation = entries.filter((e) => !e.image && cache[e.url]);
const needsFetch = entries.filter((e) => !e.image && !Object.hasOwn(cache, e.url));

if (needsValidation.length === 0 && needsFetch.length === 0) {
  console.log('og-image cache is up to date. Nothing to fetch.');
  process.exit(0);
}

if (needsValidation.length > 0) {
  console.log(`Validating ${needsValidation.length} cached entries…`);
  for (const entry of needsValidation) {
    process.stdout.write(` ${entry.url} … `);
    const valid = await isImageUrl(cache[entry.url]);
    if (!valid) {
      cache[entry.url] = '';
      console.log('✗ invalid, clearing');
    } else {
      console.log('✓');
    }
    await sleep(DELAY_BETWEEN_REQUESTS_MS);
  }
}

// Only announce the fetch pass when there is something to fetch
if (needsFetch.length > 0) {
  console.log(`Fetching og:image for ${needsFetch.length} entries…`);
}

function extractOgImage(html) {
  // Match <meta property="og:image" content="..."> in any attribute order
  const match = html.match(
    /<meta[^>]+property=["']og:image["'][^>]+content=["']([^"']+)["']/i,
  ) ?? html.match(
    /<meta[^>]+content=["']([^"']+)["'][^>]+property=["']og:image["']/i,
  );
  return match?.[1] ?? null;
}

async function fetchOgImage(url) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
  try {
    const res = await fetch(url, {
      signal: controller.signal,
      headers: { 'User-Agent': 'OvertureMaps-docs-og-fetcher/1.0' },
    });
    if (!res.ok) return null;
    const html = await res.text();
    const imgUrl = extractOgImage(html);
    if (!imgUrl) return null;
    return (await isImageUrl(imgUrl)) ? imgUrl : null;
  } catch {
    return null;
  } finally {
    clearTimeout(timer);
  }
}

async function isImageUrl(url) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), FETCH_TIMEOUT_MS);
  try {
    const res = await fetch(url, {
      method: 'HEAD',
      signal: controller.signal,
      headers: { 'User-Agent': 'OvertureMaps-docs-og-fetcher/1.0' },
    });
    const ct = res.headers.get('content-type') ?? '';
    return res.ok && ct.startsWith('image/');
  } catch {
    return false;
  } finally {
    clearTimeout(timer);
  }
}

function sleep(ms) {
  return new Promise((r) => setTimeout(r, ms));
}

for (const entry of needsFetch) {
  process.stdout.write(` ${entry.url} … `);
  const img = await fetchOgImage(entry.url);
  cache[entry.url] = img ?? '';
  console.log(img ? '✓' : '(no og:image)');
  await sleep(DELAY_BETWEEN_REQUESTS_MS);
}

writeFileSync(cachePath, JSON.stringify(cache, null, 2) + '\n', 'utf8');
console.log(`\nCache written to community/og-image-cache.json`);
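
To make the behaviour of `extractOgImage` concrete, here is a small offline example. The function body is copied from the script; the sample HTML strings, the `example.com` URLs, and the `resolveOgImage` helper are assumptions added for illustration. The helper sketches one possible hardening (resolving a relative `og:image` against the page URL with the standard `URL` constructor) and is not something the script currently does.

```javascript
// Copied from scripts/fetch-og-images.mjs so the example runs on its own.
function extractOgImage(html) {
  const match = html.match(
    /<meta[^>]+property=["']og:image["'][^>]+content=["']([^"']+)["']/i,
  ) ?? html.match(
    /<meta[^>]+content=["']([^"']+)["'][^>]+property=["']og:image["']/i,
  );
  return match?.[1] ?? null;
}

// Hypothetical addition (not in the script): turn a relative og:image value
// into an absolute URL before it is validated with a HEAD request.
function resolveOgImage(raw, pageUrl) {
  try {
    return new URL(raw, pageUrl).href;
  } catch {
    return null; // malformed value
  }
}

const pageUrl = 'https://example.com/post/';
const propertyFirst = '<meta property="og:image" content="https://example.com/a.png">';
const contentFirst = '<meta content="/img/b.png" property="og:image" />';
const nameAttr = '<meta name="og:image" content="https://example.com/c.png">';

console.log(extractOgImage(propertyFirst)); // absolute URL, returned as-is
console.log(extractOgImage(contentFirst));  // relative path, returned as-is
console.log(extractOgImage(nameAttr));      // null: only property= is matched
console.log(resolveOgImage(extractOgImage(contentFirst), pageUrl));
```

As written, the script passes the extracted value straight to `isImageUrl`, so a relative path like the second case makes `fetch` throw, the `catch` returns `false`, and the entry is cached as an empty string.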