This project addresses the fragmentation of agricultural crop data within the Swiss federal administration, where essential systems [1] each use their own, non-harmonized crop terminology. The lack of a "single source of truth" creates significant integration hurdles for digital tools.
We propose a unified master data system for crops and crop-related objects. The repository implements a sustainable solution using a dedicated RDF ontology (the crops ontology) and a graph database on LINDAS. The approach first connects (or "maps") the crop terms from the different systems, producing a machine-readable graph that can be queried centrally. This graph not only allows complex queries across formerly siloed data but also provides the stable, versioned foundation for the long-term, step-by-step harmonization of crop data across the Swiss agricultural sector. Click here to search for crops in the graph.
Warning
This project is still a work in progress.
The general data model is documented here. Note that SHACL Play! reads the data from rdf/shape/data-model.ttl on the main branch.
You may inspect the crop taxonomy/ontology using WebVOWL here or read its Turtle file here.
Alternatively, we have built a hierarchy viewer that allows you to visually inspect the hierarchical relationships of the crops.
Note
You may find more information on the repository wiki.
- /data: source data files
- /docs: (static) HTML documents, rendered as a GitHub page
- /rdf: all RDF (Turtle) files
  - /data: tabular data
  - /ontology: core vocabulary, crop taxonomy
  - /processed: any automatically written Turtle files; do not change these manually
  - /shape: dedicated files for SHACL shapes
- /src: source code
- /tests: pytest files
The data integration pipeline uses all the R and Python scripts in the /scripts folder. To run the entire pipeline:
1. Add variables to .env:

       USER=lindas-foag
       PASSWORD=********
       GRAPH=https://lindas.admin.ch/foag/crops
       ENDPOINT=https://stardog.cluster.ldbar.ch/lindas
       EPPO=********

2. Start a virtual environment and install libraries:

       python -m venv venv
       source venv/bin/activate  # On Windows use: venv\Scripts\activate
       pip install -r src/python/requirements.txt

3. Run the ETL pipeline:

       sh scripts/graph-processing.sh

4. Choose whether or not to generate and upload geodata.ttl, which enables queries and depiction of crop areas.

5. Make sure you pass all tests with:

       pytest tests

6. Check out the results on LINDAS.
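Before running the pipeline, it can help to confirm that the .env file from the first step is complete. The following is a minimal sketch, not part of the repository: the helper names `parse_env`, `missing_keys` and `check_env_file` are hypothetical, and it assumes the file holds plain KEY=VALUE lines. How the pipeline scripts themselves read these variables is not covered here.

```python
from __future__ import annotations

from pathlib import Path

# Variable names taken from the .env step above.
REQUIRED_KEYS = ("USER", "PASSWORD", "GRAPH", "ENDPOINT", "EPPO")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def check_env_file(path: str = ".env") -> list[str]:
    """Return the missing keys for the file at `path` (all keys if it is absent)."""
    env_path = Path(path)
    if not env_path.exists():
        return list(REQUIRED_KEYS)
    return missing_keys(parse_env(env_path.read_text(encoding="utf-8")))
```

Calling `check_env_file()` from the repository root returns an empty list when all five variables are set, and the names of the missing ones otherwise.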
You can query the crop master data system using SPARQL.
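For programmatic access, a query can be sent over HTTP using the SPARQL 1.1 Protocol. The sketch below uses only the Python standard library. The endpoint URL `https://lindas.admin.ch/query` is an assumption about the public read-only LINDAS endpoint and should be verified before use; the function names `build_request`, `rows_from_results` and `run_select` are hypothetical and not part of this repository.

```python
from __future__ import annotations

import json
import urllib.parse
import urllib.request

# Assumption: public read-only LINDAS SPARQL endpoint; verify before use.
QUERY_ENDPOINT = "https://lindas.admin.ch/query"


def build_request(query: str, endpoint: str = QUERY_ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol POST request with a form-encoded query."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def rows_from_results(payload: dict) -> list[dict[str, str]]:
    """Flatten SPARQL JSON results into one {variable: value} dict per row."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]


def run_select(query: str, endpoint: str = QUERY_ENDPOINT) -> list[dict[str, str]]:
    """Send the query and return the flattened rows (requires network access)."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as response:
        return rows_from_results(json.load(response))
```

Any SELECT query string can be passed to `run_select`; each returned row maps variable names to their values, with language tags and datatypes dropped for brevity.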
Here's an example SPARQL query that gets you all cultivation type URIs and labels in German:
PREFIX schema: <http://schema.org/>
PREFIX : <https://agriculture.ld.admin.ch/crops/>
SELECT *
WHERE {
?crop a :CultivationType .
?crop schema:name ?name .
FILTER(LANG(?name)="de")
}
ORDER BY ?name

Footnotes

[1] For example AGIS (direct payments), GRUD (fertilization), the PSM registry (plant protection), ProVar (varieties), PGREL-NIS (gene bank) and others.