Introduction
A demonstration of federating file identification with a risk register. Wikidata links the PRONOM file identifiers reported by DROID to NARA Risk Register RDF data, delivering an automated risk assessment of a digital collection.
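As a rough illustration of the idea (not the tool's actual code), the join can be thought of as a lookup from each file's PRONOM identifier (PUID) to a risk level, followed by a count per level. The `RISK_BY_PUID` table, the `assess` function and the risk values below are hypothetical placeholders standing in for what Wikidata and the NARA Risk Register would supply.

```python
from collections import Counter

# Hypothetical lookup table: in the real tool this information would come
# from Wikidata and the NARA Risk Register, not a hard-coded dictionary.
# The risk values are placeholders, not NARA's actual assessments.
RISK_BY_PUID = {
    "fmt/18": "Low",
    "fmt/43": "Low",
    "x-fmt/111": "Moderate",
}


def assess(puids):
    """Count files per risk level; unmatched PUIDs fall under 'No Data'."""
    return Counter(RISK_BY_PUID.get(puid, "No Data") for puid in puids)


if __name__ == "__main__":
    # One PUID per file, as they might be read from a DROID report.
    sample = ["fmt/18", "fmt/18", "x-fmt/111", "fmt/9999", ""]
    for level, count in assess(sample).most_common():
        print(level, count)
```

Files whose identifier has no match are counted rather than dropped, so gaps in coverage stay visible in the result.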
Acronyms
- DROID is a file identification tool from the UK National Archives.
- NARA is the US National Archives and Records Administration.
- PRONOM is a file format registry, also from the UK National Archives.
- RDF (Resource Description Framework) is a data model proposed by W3C as the key technology standard behind Linked Open Data. An example of the NARA Risk Register data in this form can be seen here.
- Wikidata is not an acronym! It is the largest open knowledge graph in the world and contains CC0 data from all disciplines including, relevant to this tool, digital preservation identifiers.
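To make the Wikidata side more concrete, here is a minimal sketch of looking up items by PUID through the public Wikidata SPARQL endpoint. It relies on Wikidata property P2748 (PRONOM file format identifier). The function names and the User-Agent string are hypothetical, and the query this tool actually issues may well differ.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"


def build_puid_query(puids):
    """Build a SPARQL query that finds Wikidata items for the given PUIDs."""
    # json.dumps gives each PUID as a double-quoted string literal.
    values = " ".join(json.dumps(puid) for puid in puids)
    return (
        "SELECT ?item ?itemLabel ?puid WHERE {\n"
        f"  VALUES ?puid {{ {values} }}\n"
        "  ?item wdt:P2748 ?puid .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


def fetch_items(puids):
    """Run the query against the public endpoint (needs network access)."""
    params = urllib.parse.urlencode(
        {"query": build_puid_query(puids), "format": "json"}
    )
    # Hypothetical User-Agent; Wikimedia asks clients to identify themselves.
    request = urllib.request.Request(
        WIKIDATA_SPARQL + "?" + params,
        headers={"User-Agent": "puid-lookup-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Batching PUIDs into a single VALUES clause keeps the number of requests low, which matters when a collection contains many distinct formats.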
Prerequisites
Running this tool requires access to a valid DROID report.
An example report derived from the iPRES 2025 Pantry has been provided; a different report can be generated from any media collection.
Java, which DROID needs in order to run, can be installed on an Ubuntu machine using the following command:
sudo apt install default-jre
This install can be verified with java -version.
DROID can be downloaded from the relevant GitHub repository.
wget https://github.com/digital-preservation/droid/releases/download/6.9.4/droid-binary-6.9.4-bin.zip
This package can be unzipped with unzip droid-binary-6.9.4-bin.zip.
Command-line DROID can be tested with the following command:
./droid.sh --help
A DROID report can then be generated using the following command:
java -jar ~/droid/droid-command-line-6.9.4.jar /mnt/pantry -R > ~/droid_report.csv
This tool has been tested with Python 3.13, and has been built with uv as the preferred package manager.
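Before running the tool, it can be useful to check what a DROID report contains. The sketch below tallies PUIDs from a DROID CSV export using only the standard library. It assumes the export has the usual TYPE and PUID columns, and the `count_puids` helper and the "unidentified" label are illustrative rather than part of this tool.

```python
import csv
from collections import Counter


def count_puids(lines):
    """Tally PUIDs for non-folder rows in a DROID CSV export.

    `lines` is any iterable of CSV text lines, such as an open file.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        # Folders carry no format identification, so skip them.
        if row.get("TYPE") == "Folder":
            continue
        # Rows with an empty PUID are files DROID could not identify.
        counts[row.get("PUID") or "unidentified"] += 1
    return counts
```

Passing a file opened with newline="" and encoding="utf-8" (for example the droid_report.csv generated above) returns a Counter whose most_common() method lists the dominant formats in the collection.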
Deploy
The tool can be run with the command uv run main.py.
This should return a report which looks similar to:
RISK_LEVEL
No Data 2140
Low 1482
Moderate 382
High 2
License