[DRAFT] Add all importable RDFLib namespaces and ones needed for EU project #144

Closed
amy-jenn wants to merge 13 commits into VisualMeaning:master from amy-jenn:prefixes

@amy-jenn amy-jenn commented Jun 2, 2025

Jira ticket: SMP-3243

I have no idea what I am doing. This is a best guess at making the existing shape apply to what I believe are new things, so someone needs to review this as though my whole premise and understanding may be flawed.

We need to refer to the following ontologies in the EU project:

  1. https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/54i - ECCF
  2. https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/euvoc - Currencies
  3. https://op.europa.eu/en/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/cdm - Treaties
  4. https://op.europa.eu/web/eu-vocabularies/dataset/-/resource?uri=http://publications.europa.eu/resource/dataset/budget-ontology - Projects and funding

In turn, those ontologies refer to the following prefixes:

  1. @prefix eccf: <http://data.europa.eu/54i#> .
  2. @prefix eubud: <http://data.europa.eu/3rx/ontology/budget#> .
  3. @prefix adms: <http://www.w3.org/ns/adms#> .
  4. @prefix adms1: <http://purl.org/adms#> .
  5. @prefix dc: <http://purl.org/dc/elements/1.1/> .
  6. @prefix dcat: <http://www.w3.org/ns/dcat#> .
  7. @prefix dcterms: <http://purl.org/dc/terms/> .
  8. @prefix dg: <https://w3id.org/dingo#> .
  9. @prefix eurio: <http://data.europa.eu/s66#> .
  10. @prefix fabio: <http://purl.org/spar/fabio/> .
  11. @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  12. @prefix frapo: <http://purl.org/cerif/frapo/> .
  13. @prefix locn: <http://www.w3.org/ns/locn/> .
  14. @prefix org: <http://www.w3.org/ns/org#> .
  15. @prefix owl: <http://www.w3.org/2002/07/owl#> .
  16. @prefix patent: <http://data.epo.org/linked-data/def/patent/> .
  17. @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  18. @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  19. @prefix schema: <http://schema.org/> .
  20. @prefix skos: <http://www.w3.org/2004/02/skos/core#> .
  21. @prefix turtle: <http://www.semanticweb.org/owl/owlapi/turtle#> .
  22. @prefix vann: <http://purl.org/vocab/vann/> .
  23. @prefix xml: <http://www.w3.org/XML/1998/namespace> .
  24. @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

Further, RDFLib refers to several of the above as "Importable Namespaces". I have tried to link in explanations, but I am not sure I have correctly identified everything:

  1. BRICK - A Uniform Metadata Schema for Buildings
  2. CSVW - creating Metadata descriptions for Tabular Data
  3. DC - Dublin Core Metadata Element Set - set of fifteen "core" elements (properties) for describing resources
  4. DCAT - Data Catalog Vocabulary - designed to facilitate interoperability between data catalogs published on the Web
  5. DCMITYPE - DCMI Type Vocabulary - classes for the nature or genre of a resource (e.g. Dataset, Image, Text) - maybe it should be excluded
  6. DCTERMS - fifteen terms of the Dublin Core™ Metadata Element Set (also known as "the Dublin Core") plus several dozen properties, classes, datatypes, and vocabulary encoding schemes.
  7. DCAM - DCMI Abstract Model vocabulary - a small set of terms (e.g. dcam:memberOf, dcam:VocabularyEncodingScheme) used in defining DCMI metadata terms
  8. DOAP - RDF Schema and XML vocabulary to describe software projects
  9. FOAF - already present
  10. ODRL2 - policy expression language that provides a flexible and interoperable information model, vocabulary, and encoding mechanisms for representing statements about the usage of content and services
  11. ORG - Core organization ontology - already being used in projects
  12. OWL - already present
  13. PROF - Profiles Vocabulary - describing relationships between standards/specifications, profiles of them and supporting artifacts such as validating resources
  14. PROV - W3C PROVenance Interchange Ontology - represent and interchange provenance information generated in different systems and under different contexts
  15. QB - Vocabulary for multi-dimensional (e.g. statistical) data publishing
  16. RDF - we're already familiar with this
  17. RDFS - we're already familiar with this
  18. SDO - Schema.org vocabulary; SDO is the name RDFLib gives to the schema.org namespace
  19. SH - Shapes Constraint Language (SHACL) Vocabulary
  20. SKOS - we're already familiar with this
  21. SOSA - Sensor, Observation, Sample, and Actuator (SOSA) Ontology - the lightweight core that SSN builds on, so related to SSN but not the same
  22. SSN - Semantic Sensor Network Ontology
  23. TIME - OWL-Time: ontology of temporal concepts, for describing the temporal properties of resources in the world or described in Web pages
  24. VANN - A vocabulary for annotating vocabulary descriptions
  25. VOID - Vocabulary of Interlinked Datasets (VoID) - RDF Schema vocabulary for expressing metadata about RDF datasets
  26. WGS - Basic Geo (WGS84 lat/long) Vocabulary
  27. XSD - W3C XML Schema Definition Language - facilities for describing the structure and constraining the contents of XML documents

Lastly, the sm_platform makes reference to the following prefixes:
dc: "http://purl.org/dc/elements/1.1/", - included in RDFLib section above
dcat: "http://www.w3.org/ns/dcat#", - included in RDFLib section above
dcterms: "http://purl.org/dc/terms/", - included in RDFLib section above
gist: "https://ontologies.semanticarts.com/gist/", - already included in sheet-to-triples
foaf: "http://xmlns.com/foaf/0.1/", - already included in sheet-to-triples
oa: "http://www.w3.org/ns/oa#", - not included above
org: "http://www.w3.org/ns/org#", - included in RDFLib section above
owl: "http://www.w3.org/2002/07/owl#", - already included in sheet-to-triples
rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#", - included in RDFLib section above
rdfs: "http://www.w3.org/2000/01/rdf-schema#", - included in RDFLib section above
skos: "http://www.w3.org/2004/02/skos/core#", - included in RDFLib section above
vm: "http://visual-meaning.com/rdf/", - already included in sheet-to-triples
webprotege: "http://webprotege.stanford.edu/", - not included above

These prefixes aren't included in our sheet-to-triples, and I suspect probably should be. I could be wrong here.
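
A minimal sketch of how that suspicion could be checked, assuming the annotations in the list above are accurate. ALREADY_BOUND is copied by hand from the entries marked "already included in sheet-to-triples" and is not read from the sheet-to-triples code; the helper name missing_prefixes is hypothetical.

```python
# Prefixes that sm_platform refers to (copied from the list above).
SM_PLATFORM_PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "gist": "https://ontologies.semanticarts.com/gist/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "oa": "http://www.w3.org/ns/oa#",
    "org": "http://www.w3.org/ns/org#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "vm": "http://visual-meaning.com/rdf/",
    "webprotege": "http://webprotege.stanford.edu/",
}

# Assumption: the prefixes annotated above as already in sheet-to-triples.
ALREADY_BOUND = {"gist", "foaf", "owl", "vm"}


def missing_prefixes(required, bound):
    """Return the required {prefix: IRI} pairs whose prefix is not bound yet."""
    return {prefix: iri for prefix, iri in required.items() if prefix not in bound}


missing = missing_prefixes(SM_PLATFORM_PREFIXES, ALREADY_BOUND)
for prefix, iri in sorted(missing.items()):
    print(f"{prefix}: {iri}")
```

With a real graph, the bound set could instead be built from the prefixes yielded by g.namespaces(), which would avoid relying on the hand-copied annotations.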

@AchilleSalaun AchilleSalaun left a comment

Please make sure there are no type or syntax errors. In particular, please check the automated GitHub checks: they seem to have failed.

The other comments are suggestions to improve the code and its maintainability. They are not mandatory, but the sooner, the better :)

g.bind('vann', rdflib.namespace.VANN)
g.bind('void', rdflib.namespace.VOID)
g.bind('wgs', rdflib.namespace.WGS)
g.bind('xsd', rdflib.namespace.XSD)
Contributor

This may be more concise:

import re
import rdflib

# g is the rdflib.Graph being populated by the surrounding code.
# vars(module) exposes every name defined in a Python module.
for name, namespace in vars(rdflib.namespace).items():
    # Convention: the predefined namespaces have names made only of upper-case
    # letters and digits. Filtering by type ran into special cases.
    if re.fullmatch('[A-Z0-9]+', name):
        g.bind(name.lower(), namespace)

Author

I will leave the more explicit solution as is, as discussed.

g.bind('patent', rdflib.namespace.('http://data.epo.org/linked-data/def/patent/'))
g.bind('schema', rdflib.namespace.('http://schema.org/'))
g.bind('turtle', rdflib.namespace.('http://www.semanticweb.org/owl/owlapi/turtle/'))
g.bind('xml', rdflib.namespace.('http://www.w3.org/XML/1998/namespace'))
Contributor

@AchilleSalaun AchilleSalaun Jun 2, 2025

For readability, I would introduce a dictionary:

prefix2namespace = {
    # Custom namespaces (not predefined in rdflib)
    'gist': 'https://ontologies.semanticarts.com/gist/',

    # Custom namespaces for the EU project
    'adms': 'http://www.w3.org/ns/adms#',
    # ...
}

for prefix, namespace in prefix2namespace.items():
    g.bind(prefix, rdflib.Namespace(namespace))

The advantage is that it brings the code one step closer to having these namespaces defined in a dedicated JSON file.

Additionally, using a loop reduces the risk of copy-paste errors like the one mentioned below:

Author

I think I've done this, but partially incorrectly - I need to check this with Achille

#custom defined rdflib namespaces
g.bind('gist', rdflib.Namespace('https://ontologies.semanticarts.com/gist/'))
#custom defined rdflib namespaces for eu project
g.bind('adms', rdflib.namespace.('http://www.w3.org/ns/adms/'))
Contributor

This should fail, in one of two ways:

  Cell In[54], line 2
    rdflib.namespace.(namespace)
                     ^
SyntaxError: invalid syntax

or

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[52], line 2
      1 namespace= 'https://ontologies.semanticarts.com/gist/'
----> 2 rdflib.namespace(namespace)

TypeError: 'module' object is not callable

Explanation:
A dot . sneaked in between namespace and the opening parenthesis, which causes the SyntaxError.

Additionally, Namespace with a capital letter is the class constructor, so it is what you want. namespace all in lower case refers to a module instead, so calling it triggers the TypeError.

The example you should follow is given on line 144:

g.bind('gist', rdflib.Namespace('https://ontologies.semanticarts.com/gist/'))

Contributor

@AchilleSalaun AchilleSalaun Jun 2, 2025

Additional note: the syntax rdflib.namespace.FOAF is correct because this refers to a stored namespace in the module rdflib.namespace. You can see it as a sort of dictionary. In your case, you want the class constructor rdflib.Namespace because you want to build a new namespace from a string.

Author

Thanks - I couldn't figure this out - I've addressed this

@alightwing
Contributor

As discussed, this change is not required. If a specific set of prefixes/bindings is required, simply create a .ttl file with your preferred prefix definitions and add it via --add-graph as the last graph. If it is added last, the prefixes from that file will override any previous definitions for the same IRI prefix.
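
For illustration, a prefixes-only Turtle file of that kind could be generated along these lines. This is a sketch under assumptions: the file name eu_prefixes.ttl and the helper prefixes_to_turtle are hypothetical, and the three IRIs are taken from the prefix list in the PR description. It only writes the file; passing it to --add-graph is left to the existing tooling.

```python
import tempfile
from pathlib import Path

# A few of the EU project prefixes from the PR description.
EU_PREFIXES = {
    "eccf": "http://data.europa.eu/54i#",
    "eubud": "http://data.europa.eu/3rx/ontology/budget#",
    "adms": "http://www.w3.org/ns/adms#",
}


def prefixes_to_turtle(prefixes):
    """Render {prefix: IRI} pairs as Turtle @prefix declarations, one per line."""
    lines = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in sorted(prefixes.items())]
    return "\n".join(lines) + "\n"


turtle_text = prefixes_to_turtle(EU_PREFIXES)

# Written to a temporary directory here; a real file would sit next to the other graphs.
with tempfile.TemporaryDirectory() as tmp:
    ttl_path = Path(tmp) / "eu_prefixes.ttl"  # hypothetical file name
    ttl_path.write_text(turtle_text, encoding="utf-8")
    written_text = ttl_path.read_text(encoding="utf-8")
```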

@alightwing alightwing closed this Jun 6, 2025
3 participants