Reading @lb42's question about prefixDef on TEI-L, I remembered a lengthy discussion about this and found that “prefixDef” is something of a misnomer which, in that discussion, led to quite a bit of confusion. The other party in the discussion took the term “prefixDef” to mean that it designates a namespace prefix.
Of course, we are not talking about a prefix but about a locally defined, private URI scheme. The GL, though, are less specific, which created the confusion: section 16.2.3 initially uses the term “private URI scheme” or simply “scheme”, but the short definitions of the two elements introduce the term “prefixing scheme” – possibly trying to reflect prefixDef – and the prose section mixes in “prefix” quite frequently.
The definitions and prose should not include the word “prefix” when referring to the scheme, as RFC 3986 does not use the term “prefix” in a well-defined manner. Hence, we should stick to what is well-defined in the RFC, which is the term “scheme”.
It should be made explicit that in the GL’s example of "psn:fred", “psn” is a URI scheme and “fred” a URI path.
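To illustrate the reading of "psn:fred" as scheme plus path, here is a minimal Python sketch. It uses the generic URI-splitting regular expression given in RFC 3986, Appendix B; the function name `split_uri` is only an illustrative assumption, not anything defined by TEI.

```python
import re

# Generic URI-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Return (scheme, authority, path); a missing component is None.

    Hypothetical helper for illustration only.
    """
    m = URI_RE.match(uri)
    return m.group(2), m.group(4), m.group(5)


# "psn" ends up in the scheme slot, "fred" in the path slot,
# and there is no authority component at all.
print(split_uri("psn:fred"))  # ('psn', None, 'fred')
```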
While I don’t suggest we rename prefixDef/listPrefixDef (maybe keep that for P6 😉), we should adjust the wording of their short definition to use “private URI scheme” instead of “prefixing scheme”.
--
Another problem comes up when looking at an expansion of the example I gave on TEI-L, which was of the form
which may have 2 expansions, one for a (project- or repository-)local resolver and one for the defining authority’s endpoint (in this example, the Gemeinsame Normdatei of the German National Library, so the expanded URL would be https://d-nb.de/gnd/1234567).
Now this is easy to resolve but has one logical problem: “gnd”, strictly speaking, is not a scheme but actually an authority, for which there is a dedicated place in a URI:
per being the (private) scheme denoting the type of entity;
gnd being the authority;
1234567 being the path, here the identifier within the authority.
This follows RFC 3986 even more closely than a simple “gnd:1234567” would, as “gnd” is indeed the authority defining the identifier given in the path component. But it has a downside, too: parseability into the 2 different expansions (one being a local expansion that returns information based on the string “gnd:1234567”, the other being the authority’s URL).
Expansion 1 is easy to achieve from both short and long versions, but the second, generic expansion will not be possible:
The prefixDef mechanism treats everything after the (first) colon as one undifferentiated string, while RFC 3986 actually sees 2 different components here. I don’t think we urgently need to change anything here, but we should mention this in the GL.
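The difference can be sketched in Python as follows. This is only an illustration under assumptions: `after_first_colon` mimics the "split at the first colon" view described above, `rfc_components` applies the RFC 3986 Appendix B regular expression, and the long form `per://gnd/1234567` is my assumed spelling of a pointer with scheme "per" and authority "gnd".

```python
import re

# Generic URI-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def after_first_colon(pointer):
    """Hypothetical helper: the scheme, then everything after the first colon."""
    scheme, _, rest = pointer.partition(":")
    return scheme, rest


def rfc_components(pointer):
    """Hypothetical helper: (scheme, authority, path) per RFC 3986."""
    m = URI_RE.match(pointer)
    return m.group(2), m.group(4), m.group(5)


short_form = "gnd:1234567"
long_form = "per://gnd/1234567"  # assumed long form, for illustration only

print(after_first_colon(short_form))  # ('gnd', '1234567')
print(after_first_colon(long_form))   # ('per', '//gnd/1234567')
print(rfc_components(long_form))      # ('per', 'gnd', '/1234567')
```

Note that under the RFC's generic syntax the path of the long form is `/1234567`, including the leading slash, so a resolver would have to strip that slash to get at the bare identifier.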
--
At this point, however, we come back to @lb42's question of providing some form of central registry or place for dealing with private URIs. RFC 3986 (page 19f.) actually allows for URI schemes to require a specific way of resolving (the registered name making up the host part of) the authority component of the URI.
So there actually is a basis for providing a lookup, though that might require some work to enable the prefixDef mechanism to distinguish between the authority and path components and actually hint at resolution.
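Purely as a thought experiment, such a lookup could be keyed on the scheme and authority components together. In the Python sketch below, the `RESOLVERS` table, the `resolve` function and the long form `per://gnd/1234567` are all hypothetical; only the target URL pattern is taken from the GND example above.

```python
import re

# Generic URI-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

# Hypothetical registry: (scheme, authority) -> URL template.
RESOLVERS = {
    ("per", "gnd"): "https://d-nb.de/gnd/{id}",
}


def resolve(pointer):
    """Expand a private URI via the registry; return None if no entry matches."""
    m = URI_RE.match(pointer)
    scheme, authority, path = m.group(2), m.group(4), m.group(5)
    template = RESOLVERS.get((scheme, authority))
    if template is None:
        return None
    # The RFC path carries a leading slash; the identifier is what follows it.
    return template.format(id=path.lstrip("/"))


print(resolve("per://gnd/1234567"))  # https://d-nb.de/gnd/1234567
print(resolve("gnd:1234567"))        # None: no authority component to look up
```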
--
Sorry for this long one but I'm sure we'll have a fruitful discussion possibly leading to a few improvements in the GL.