The repository provides a dockerised executor of the sema.harvest module. This executor is used to dump the dataset catalogue from emobon.
It does so starting from https://data.emobon.embrc.eu/ and crawling the published triples to satisfy the instructions in config/harvest-emobon-dcat.yml.
The packaged artefacts from this work are available at https://github.com/orgs/emo-bon/packages.
To use this, one only needs:
- to decide what image to run
  - either a published release package
  - or a local build
- to set the i/o for the process
  - mainly the folder where output (dump file) needs to be written (mapped as docker-volume /resultsroot)
  - essential environment variables to pass
- to pass all of the above in a call to docker run
In detail:
$ version="latest" # or pick an available release tag from https://github.com/orgs/emo-bon/packages
# (optionally) verify availability by manual pull
$ docker pull ghcr.io/emo-bon/emobon_ddcat:${version} # should pull the image without errors
# variable setting to inject
$ out="./path_to/where/the_dump_result/should_be_written"
# typically just use cwd
$ out="."
# actually run it
$ docker run --rm --name "emo-bon_ddcat" --volume ${out}:/resultsroot ghcr.io/emo-bon/emobon_ddcat:${version}
For Windows users with a bash shell, the above command should be run as:
$ docker run --rm --name "emo-bon_ddcat" --volume /$PWD/${out}:/resultsroot ghcr.io/emo-bon/emobon_ddcat:${version}
To build your own local image, or to get involved in furthering this work, see the Contributors Guide.
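
For repeated use, the choices described above (image tag and output folder) could be wrapped in a small shell function. The following is only a sketch under stated assumptions: the function name run_emobon_ddcat and the DRY_RUN variable are hypothetical and not part of this repository; the image name, container name and /resultsroot mount point are taken from the commands above. It resolves the output folder to an absolute path before mounting it, and does not handle the Windows path form.

```shell
#!/bin/sh
# Hypothetical wrapper, not shipped with this repository.
# Usage: run_emobon_ddcat [output_folder] [image_tag]
# The body runs in a subshell "( ... )" so no variables leak into the caller.
run_emobon_ddcat() (
  out="${1:-.}"            # output folder, defaults to the cwd
  version="${2:-latest}"   # image tag, defaults to "latest"

  # Resolve the folder to an absolute path; stop if it does not exist.
  abs_out="$(cd "${out}" 2>/dev/null && pwd)" || {
    echo "output folder not found: ${out}" >&2
    exit 1
  }

  # Assemble the same docker call as shown above, one argument per word.
  set -- docker run --rm --name "emo-bon_ddcat" \
    --volume "${abs_out}:/resultsroot" \
    "ghcr.io/emo-bon/emobon_ddcat:${version}"

  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"   # DRY_RUN set (assumed convention): only print the command
  else
    "$@"        # otherwise actually run it
  fi
)
```

Calling the function with DRY_RUN=1 set in the environment prints the assembled command for inspection; calling it without DRY_RUN runs the container and writes the dump into the given folder.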