Vocabularies to support the SESAR sample registration system. The authoritative source files are SKOS RDF vocabularies serialized using Turtle syntax, located in the vocabulary/ directory.
An HTML view of the vocabularies is available at https://geosamples.github.io/vocabularies.
| File | Description | ConceptScheme URI |
|---|---|---|
| SESAR_materials_extension.ttl | Lithology/material types extending the iSamples material vocabulary | sesrs:materialvocabulary |
| SESAR_sampled_feature_extension.ttl | Geologic, physiographic, and environmental features extending iSamples Sampled Feature Type | sessf:sfvocabulary |
| SamplingMethods.ttl | Material sample collection methods (154 concepts) | secm:samplingmethods |
| SESAR_material_object_type_extension.ttl | Material sample object types extending iSamples | sesot:objecttypevocabulary |
| mineralSKOS.ttl | Mineral species vocabulary (6,299 concepts) with links to Mindat, RRUFF, Wikidata | gcmin:conceptScheme |

| File | Description | ConceptScheme URI |
|---|---|---|
| samplePreparation_GSQ.ttl | Queensland Geological Survey sample preparation methods | http://linked.data.gov.au/def/sample-preparation-methods |
| sampling-method_GAv1-0.ttl | Geoscience Australia sampling methods | http://pid.geoscience.gov.au/def/voc/ga/samplingmethod |
| SamplePreparationSeaDataNet.ttl | SeaDataNet sample preparation device categories | http://vocab.nerc.ac.uk/scheme/SDNDEV/current (top concept: ICAT02) |
| SampleCollectionSeaDataNet.ttl | SeaDataNet sample collection device categories | http://vocab.nerc.ac.uk/scheme/SDNDEV/current (top concept: ICAT01) |

The repository doubles as a GitHub Action that converts SKOS Turtle files to HTML documentation:
Turtle (.ttl) → SQLite (navocab/rdflib-sqlalchemy) → Markdown (vocab2mdCacheV2.py) → HTML (Quarto)
Each vocabulary is processed independently with a fresh SQLite database to prevent data from different vocabularies interfering with each other. This is important for files like the SeaDataNet vocabularies that share the same ConceptScheme URI but have different top concepts.
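
The effect of per-vocabulary isolation can be illustrated with a minimal sketch that uses only Python's standard `sqlite3` module. This is not the repository's code, which goes through navocab and rdflib-sqlalchemy; the helper names, the single-table layout, and the abbreviated triples are assumptions made for illustration.

```python
import sqlite3

SDN_SCHEME = "http://vocab.nerc.ac.uk/scheme/SDNDEV/current"


def load_isolated(triples):
    """Load one vocabulary's triples into its own fresh in-memory database."""
    conn = sqlite3.connect(":memory:")  # a new, empty database per call
    conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    conn.commit()
    return conn


def top_concepts(conn, scheme_uri):
    """Return the objects of skos:hasTopConcept for a scheme, sorted."""
    rows = conn.execute(
        "SELECT o FROM triples "
        "WHERE s = ? AND p = 'skos:hasTopConcept' ORDER BY o",
        (scheme_uri,),
    )
    return [row[0] for row in rows]


# Hypothetical, heavily abbreviated stand-ins for the two SeaDataNet files.
preparation = [(SDN_SCHEME, "skos:hasTopConcept", "ICAT02")]
collection = [(SDN_SCHEME, "skos:hasTopConcept", "ICAT01")]

# One database per vocabulary: each sees only its own top concept.
prep_tops = top_concepts(load_isolated(preparation), SDN_SCHEME)
coll_tops = top_concepts(load_isolated(collection), SDN_SCHEME)

# A shared database mixes the two, because the scheme URI is identical.
shared_tops = top_concepts(load_isolated(preparation + collection), SDN_SCHEME)

print(prep_tops, coll_tops, shared_tops)
```

In the shared database the scheme ends up with both top concepts, so a page generated for either file would include the other file's hierarchy.
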
The vocabularies in this repository use several different SKOS patterns. The processing tools handle all of these:
- Standard hierarchy: Concepts use `skos:topConceptOf`, `skos:inScheme`, and `skos:broader` to establish hierarchy within a single ConceptScheme.
- Inverse top concept declaration: Some vocabularies (GSQ, SeaDataNet) declare top concepts using `skos:hasTopConcept` on the ConceptScheme rather than `skos:topConceptOf` on the concepts.
- No `skos:inScheme`: Some vocabularies (GSQ) don't use `skos:inScheme` on individual concepts. The code falls back to following `skos:broader` relationships regardless of scheme membership.
- Cross-scheme children: SeaDataNet concepts may have `skos:inScheme` pointing to a different ConceptScheme than their parent. The broader-only fallback handles this.

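
The patterns above can be sketched in plain Python over `(subject, predicate, object)` tuples. This is an illustrative approximation, not the logic in `tools/navocab/`; the function names and the `ex:` identifiers are hypothetical, and prefixed names stand in for full URIs.

```python
from collections import defaultdict

TOP_OF = "skos:topConceptOf"
HAS_TOP = "skos:hasTopConcept"
BROADER = "skos:broader"
IN_SCHEME = "skos:inScheme"


def find_top_concepts(triples, scheme):
    """Collect top concepts declared in either direction."""
    tops = set()
    for s, p, o in triples:
        if p == TOP_OF and o == scheme:      # declared on the concept
            tops.add(s)
        elif p == HAS_TOP and s == scheme:   # declared on the ConceptScheme
            tops.add(o)
    return sorted(tops)


def build_tree(triples, scheme):
    """Map each reachable concept to its narrower concepts.

    Only skos:broader is followed; skos:inScheme is ignored, so concepts
    that lack it or point at another scheme are still included.
    """
    children = defaultdict(list)
    for s, p, o in triples:
        if p == BROADER:
            children[o].append(s)

    tree = {}
    stack = find_top_concepts(triples, scheme)
    while stack:
        node = stack.pop()
        if node in tree:
            continue  # already visited; also guards against cycles
        tree[node] = sorted(children.get(node, []))
        stack.extend(tree[node])
    return tree


triples = [
    ("ex:scheme", HAS_TOP, "ex:A"),        # inverse top concept declaration
    ("ex:B", TOP_OF, "ex:scheme"),         # standard declaration
    ("ex:A1", BROADER, "ex:A"),            # no skos:inScheme at all
    ("ex:A2", BROADER, "ex:A"),
    ("ex:A2", IN_SCHEME, "ex:other"),      # cross-scheme child
    ("ex:B1", BROADER, "ex:B"),
    ("ex:B1", IN_SCHEME, "ex:scheme"),
]

tree = build_tree(triples, "ex:scheme")
print(tree)
```

Because traversal starts from the scheme's top concepts and descends through `skos:broader` alone, `ex:A1` and `ex:A2` appear under `ex:A` even though neither is declared as a member of `ex:scheme`.
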
| Workflow | Trigger | Description |
|---|---|---|
| Process vocabularies (process_vocab.yml) | Manual | Generates HTML for all vocabularies except mineralSKOS |
| Process mineralSKOS (process_mineralSKOS.yml) | Manual | Generates HTML for mineralSKOS only (~20 min due to size) |
| build (integration.yml) | Push to main/develop | Docker build verification and smoke test |
| Test vocabularies (testdocs_process_vocab.yml) | Manual | Tests pipeline with a small vocabulary subset |

- Add the `.ttl` file to `vocabulary/` (or `vocabulary/samplingMethods/`)
- Add its filename (without `.ttl` extension) and ConceptScheme URI to the pipe-delimited `inputttl` and `inputvocaburi` lists in `process_vocab.yml`. Subdirectory paths are supported (e.g., `samplingMethods/myVocab`). Order must match between the two lists.
- Add a section to `docs/readme.md` with links to the HTML page and RDF source file
- Trigger the "Process vocabularies" workflow from the Actions tab
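
Since the two lists are matched by position, a quick check before triggering the workflow can catch a missing or misplaced entry. The sketch below shows one way the pairing might work; `pair_inputs`, the `vocabulary/<name>.ttl` path construction, and the `example.org` URIs are assumptions for illustration, not the action's actual parsing code.

```python
def pair_inputs(inputttl, inputvocaburi):
    """Pair pipe-delimited file names with ConceptScheme URIs by position."""
    names = [n.strip() for n in inputttl.split("|") if n.strip()]
    uris = [u.strip() for u in inputvocaburi.split("|") if u.strip()]
    if len(names) != len(uris):
        raise ValueError(
            f"{len(names)} file names but {len(uris)} ConceptScheme URIs"
        )
    # Assumed layout: names are relative to vocabulary/ and omit ".ttl".
    return [(f"vocabulary/{name}.ttl", uri) for name, uri in zip(names, uris)]


# Hypothetical values in the same shape as the workflow inputs.
pairs = pair_inputs(
    "myVocab|samplingMethods/myOtherVocab",
    "https://example.org/vocab/a|https://example.org/vocab/b",
)
for path, uri in pairs:
    print(path, uri)
```

A length mismatch raises `ValueError`, but a swapped order cannot be detected this way, so the two lists still need to be edited together.
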
- Docker image: `python:3.12-slim` with Quarto, rdflib, rdflib-sqlalchemy
- Entry point: `.github/actions/github_action_main.py`
- Tools: `tools/vocab.py` (CLI for DB loading), `tools/vocab2mdCacheV2.py` (markdown generation), `tools/navocab/` (SKOS/rdflib wrapper)
- Dependencies: `setuptools<81` is pinned because `rdflib-sqlalchemy` 0.5.4 requires `pkg_resources`
- GitHub Pages: Serves from the `/docs` directory on the `gh-pages` branch