CCO provides a robust way of modeling information with details that may not always be needed when used with less structured data. It would be easier to write and read (both for humans and machines) if more shortcuts were provided, which could expand to the full details when necessary.
For example, suppose we have data about someone's age in years and want to reason about whether or not they can vote, or require a guardian's consent in some context, e.g. healthcare. As far as I can tell, an age is a measurement of a temporal region of someone's lifetime. This makes the example a little more convoluted than other types of information, like a person's name. (Also, in most use cases, age should be derived from a birthdate anyway.) But I'm using it to highlight the difference between a simple representation like "Bob is 18 years old" and the full representation in CCO. There are 11 RDF triples here:
```turtle
# The phrase "Bob is 18 years old"
:BobsAgeInYears a cco:DocumentField ; # a cco:InformationBearingEntity
    cco:has_integer_value 18 ;
    cco:uses_measurement_unit cco:YearMeasurementUnit ;
    bfo:0000101 :BobsAgeInformation . # is carrier of (at some time)

# The proposition expressed by "Bob is 18 years old" (or "Bob is 216 months old")
:BobsAgeInformation a cco:MeasurementInformationContentEntity ;
    cco:is_a_measurement_of :BobsLifetime .

# Bob's lifetime
:BobsLifetime a cco:MultiYearTemporalInterval ;
    cco:is_temporal_region_of :BobsLife .

# Bob's life
:BobsLife a bfo:0000015 . # process

# Bob himself
:Bob a cco:Person ;
    bfo:0000056 :BobsLife . # participates in (at some time)

# Note: replace `bfo:0000101` with `ro:0010002` ("is carrier of"), or `bfo:0000056` with `ro:0000056` ("participates in") if needed.
```
The point here is that a lot of these details aren't necessary for common use cases of age data.
The annotation property cco:is_tokenized_by provides one kind of shortcut in cases where we don't need to model extra details about information bearers, such as their provenance. However, some of these details, such as units, need to be added to the token (e.g. "10 years"). Worse, the annotation property can't be used in logical axioms. For example, I can't express that someone must be over 18 years old to vote using the "is_tokenized_by" annotation. But it's still useful as a shortcut.
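As a rough sketch of what that shortcut might look like for the age example (the prefixes are assumed to be the same undeclared ones used in the full representation, and folding the unit into the string is my reading of the "10 years" remark, not something CCO prescribes):

```turtle
# Sketch only: ':' and 'cco:' are assumed to be declared as in the full example.
# The information bearer (:BobsAgeInYears) is dropped; its value and unit are
# folded into a plain string on the information content entity.
:BobsAgeInformation a cco:MeasurementInformationContentEntity ;
    cco:is_a_measurement_of :BobsLifetime ;
    cco:is_tokenized_by "18 years" . # annotation value, opaque to a reasoner
```

This replaces the four triples about the document field with a single annotation, but "18 years" is then only a string, so a numeric comparison against 18 is not available to a reasoner.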
This graph shows what all of this looks like and how many "hops" are involved. I added more classes for context:
With the age example, I find it more useful to model the information bearer than the information content. Information content about age is only useful if we want to represent, for example, that the phrase "18 years" means the same as "216 months" -- i.e. they bear/carry the same information content. Otherwise, we typically just want to reason about the integers and units associated with information bearers, in which case the content is metaphysical baggage.
A property chain axiom attached to something like "carries information about" could provide a shortcut between a carrier of information and what that information is about. Similarly, we could make more specific subproperties like "carries measurement of" which would fit the example better.
This would only be useful in reasoning if we already had a complex chain of relations as in the example above, which would entail the shortcut, but not the other way around. (To get the opposite of a shortcut, something like a SPARQL CONSTRUCT query could be used to expand these shortcut relations into the complex chain of relations, involving the information contents, bearers, etc.)
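To make the direction of the entailment concrete, here is a small sketch. It assumes :carries_information_about is declared with the property chain (is carrier of, cco:is_about) described above, and that cco:is_a_measurement_of is a subproperty of cco:is_about, which I believe is the case in CCO but have not verified here:

```turtle
# Sketch only: ':', 'cco:' and 'bfo:' are assumed to be declared as in the full example.

# Asserted (full representation):
:BobsAgeInYears bfo:0000101 :BobsAgeInformation .            # is carrier of
:BobsAgeInformation cco:is_a_measurement_of :BobsLifetime .  # assumed to imply cco:is_about

# Entailed by an OWL 2 reasoner via the property chain:
:BobsAgeInYears :carries_information_about :BobsLifetime .
```

Starting from only the last triple, a reasoner cannot recover :BobsAgeInformation, which is why the expansion has to be done outside of OWL reasoning.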
In any case, the shortcut is at least somewhat interoperable with the full CCO representation.
Using this in the previous example simplifies things a bit:
```turtle
:BobsAgeInYears a cco:DocumentField ; # a cco:InformationBearingEntity
    cco:has_integer_value 18 ;
    cco:uses_measurement_unit cco:YearMeasurementUnit ;
    :carries_information_about :BobsLifetime .

:BobsLifetime a cco:MultiYearTemporalInterval ;
    cco:is_temporal_region_of :BobsLife .

:BobsLife a bfo:0000015 . # process

:Bob a cco:Person ;
    bfo:0000056 :BobsLife . # participates in (at some time)
```
Simplifying further, we get something that looks a little more familiar to someone working with more common data formats. This is probably what we want unless we need to reason about someone's life or the temporal region it occupies. More likely, we just need to reason about the integer value of their age and associate it with them.
```turtle
:BobsAgeInYears a cco:DocumentField ; # a cco:InformationBearingEntity
    cco:has_integer_value 18 ;
    cco:uses_measurement_unit cco:YearMeasurementUnit ;
    :carries_information_about :Bob .

:Bob a cco:Person .
```
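With the integer value kept as a data property (rather than an annotation string), the voting question from the start can in principle be expressed as a class axiom. The following is only a sketch under several assumptions: :VotingAgeField and :PersonOfVotingAge are hypothetical names, the threshold is read as "at least 18", and nothing here distinguishes an age field from any other year-valued field about the same person.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
# ':' and 'cco:' are left undeclared, as in the examples above.

# Hypothetical class: fields whose value is an integer >= 18, measured in years.
:VotingAgeField a owl:Class ;
    owl:equivalentClass [
        a owl:Class ;
        owl:intersectionOf (
            cco:DocumentField
            [ a owl:Restriction ;
              owl:onProperty cco:has_integer_value ;
              owl:someValuesFrom [
                  a rdfs:Datatype ;
                  owl:onDatatype xsd:integer ;
                  owl:withRestrictions ( [ xsd:minInclusive 18 ] )
              ] ]
            [ a owl:Restriction ;
              owl:onProperty cco:uses_measurement_unit ;
              owl:hasValue cco:YearMeasurementUnit ]
        )
    ] .

# Hypothetical class: persons that some such field carries information about.
:PersonOfVotingAge a owl:Class ;
    owl:equivalentClass [
        a owl:Class ;
        owl:intersectionOf (
            cco:Person
            [ a owl:Restriction ;
              owl:onProperty [ owl:inverseOf :carries_information_about ] ;
              owl:someValuesFrom :VotingAgeField ]
        )
    ] .
```

Given the simplified triples above, a reasoner that supports OWL 2 datatype restrictions should be able to classify :Bob as a :PersonOfVotingAge. A more specific property such as the "carries measurement of" idea, or a dedicated age field class, would be needed to avoid false matches.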
We could use cco:is_subject_of_field as a more specific subproperty of the inverse and use an anonymous node to get something that looks even more familiar. Now we have 5 triples instead of 11.
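A sketch of what that five-triple version might look like, assuming cco:is_subject_of_field is used as described (a subproperty of the inverse of :carries_information_about) and the same undeclared prefixes as before:

```turtle
# Sketch only: ':' and 'cco:' are assumed to be declared as in the examples above.
:Bob a cco:Person ;
    cco:is_subject_of_field [        # anonymous (blank) node for the age field
        a cco:DocumentField ;
        cco:has_integer_value 18 ;
        cco:uses_measurement_unit cco:YearMeasurementUnit
    ] .
```

The five triples are Bob's type, the link to the blank node, and the blank node's type, value, and unit.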
Graph of the full representation, with extra classes added for context:

```mermaid
graph
    IBE[Information Bearing Entity]:::type -.-> DocumentField
    DocumentField:::type -.-> BobsAgeInYears{Bob's Age\nIn Years}:::instance
    Process:::type -.-> BobsLife{Bob's Life}:::instance
    MYTI[MultiYearTemporalInterval]:::type -.-> BobsLifetime{Bob's Lifetime}:::instance
    1DTR[1-Dimensional\nTemporal Region]:::type -.-> MYTI:::type
    Person:::type -.-> Bob{Bob}:::instance
    MICE[Measurement Information\nContent Entity]:::type -.-> BobsAgeMeasurement{Bob's Age\nMeasurement}:::instance
    YearMeasurementUnit{Year Measurement Unit}:::instance
    AgeValue{{18}}:::datatype

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    BobsAgeMeasurement --> |"is tokenized by (shortcut)"| AgeValue
    Bob -->|participates in| BobsLife
    BobsLifetime -->|temporal region of| BobsLife
    %%% BobsLifetime --> |process started by| BobsBirth
    BobsAgeMeasurement -->|measurement of| BobsLifetime
    BobsAgeInYears -->|carrier of| BobsAgeMeasurement
    BobsAgeInYears -->|uses measurement unit| YearMeasurementUnit
    BobsAgeInYears -->|has integer value| AgeValue

    classDef instance fill:#914585, color: white;
    classDef type fill:#d6a500, color:white;
    classDef datatype fill:#bb2f42, color: white;
    linkStyle default stroke:#0079c0
    linkStyle 7 stroke:#cb6b00
    classDef dataProperty fill:#00a53c
```

Suggested shortcuts
A property chain axiom for the "carries information about" shortcut could be declared like this:

```turtle
:carries_information_about rdf:type owl:ObjectProperty ;
    owl:inverseOf :is_subject_of_carrier ;
    rdfs:domain bfo:0000004 ; # independent continuant
    rdfs:range bfo:0000001 ; # entity
    owl:propertyChainAxiom (
        bfo:0000101 # is carrier of (at some time)
        cco:is_about
    ) .
```
The same example as a graph, using the shortcut:

```mermaid
graph
    IBE[Information Bearing Entity]:::type -.-> DocumentField
    DocumentField:::type -.-> BobsAgeInYears{Bob's Age\nIn Years}:::instance
    Process:::type -.-> BobsLife{Bob's Life}:::instance
    MYTI[MultiYearTemporalInterval]:::type -.-> BobsLifetime{Bob's Lifetime}:::instance
    1DTR[1-Dimensional\nTemporal Region]:::type -.-> MYTI:::type
    Person:::type -.-> Bob{Bob}:::instance
    YearMeasurementUnit{Year Measurement Unit}:::instance
    AgeValue{{18}}:::datatype

    Bob -->|participates in| BobsLife
    BobsLifetime -->|is temporal region of| BobsLife
    BobsAgeInYears -->|uses measurement unit| YearMeasurementUnit
    BobsAgeInYears --> |"carries information about (shortcut)"| BobsLifetime
    BobsAgeInYears -->|has integer value| AgeValue

    classDef instance fill:#914585, color: white;
    classDef type fill:#d6a500, color:white;
    classDef datatype fill:#bb2f42, color: white;
    classDef objectProperty fill:#0079c0
    classDef dataProperty fill:#00a53c
    linkStyle default stroke:#0079c0
```
Please let me know if something is missing here or if I'm misunderstanding anything.