URI scheme advice for an RDF schema from john.walker on 2014-07-23 (public-lod@w3.org from July 2014)

From: john.walker <john.walker@semaku.com>
Date: Wed, 23 Jul 2014 13:22:48 +0200 (CEST)
To: public-lod@w3.org
Message-ID: <1169571599.976185.1406114568492.open-xchange@oxweb03.eigbox.net>
Hi There,
 
There is plenty of advice/help out there regarding URI schemes for instance
data, for example the EC study on persistent URIs [1].
 
I was wondering if there are any similar studies or guidelines about URI schemes
for RDF schema (using this as catch all term for vocabulary, data dictionary,
schema, ontology).
 
The particular use case I have is an ISO 13584-compliant data dictionary with a
few hundred classes and over 1000 properties which I'd like to convert to RDF.
Everything in the dictionary (including the dictionary itself) is identified
with an IRDI [2].
 
Points to consider:

1. (I'll get this one out of the way first :) ) Hash vs. slash URIs: What's the
latest advice/pros/cons? Currently I am leaning towards slash URIs so the user
is not forced to download the entire schema in one file (of course we can always
provide a dump for those who want it). Any best practices here?
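
To make the trade-off concrete, here is a rough Python sketch of which document
a client ends up requesting in each case. The URIs are just example.com
placeholders and requested_document is a name I made up. With a hash URI the
fragment is stripped before the HTTP request, so every term resolves to the same
file:

```python
from urllib.parse import urldefrag

def requested_document(term_uri: str) -> str:
    """Return the URI a client actually sends over HTTP (fragment removed)."""
    return urldefrag(term_uri).url

# Placeholder term URIs for the same dictionary entry.
hash_uri = "http://example.com/myDictionary#c_abc123"
slash_uri = "http://example.com/myDictionary/c_abc123"

print(requested_document(hash_uri))   # the whole-dictionary document
print(requested_document(slash_uri))  # a per-term document
```

With a few hundred classes and over 1000 properties, the first case means one
large download per lookup, which is why I lean towards the second.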

2. URN or HTTP URI: A URN scheme for IRDIs has previously been mooted, but there
seems to be a distinct lack of progress. Following linked data principles I was
planning to use HTTP URIs instead. Would there be any advantage to using URNs?

3. Human-readable URIs: Many widely used schemas (e.g. Schema.org, FOAF) have a
human-readable component in the URI, typically a URI-friendly version of the
label. I can see this makes things a lot easier for human consumers when reading
raw Turtle or writing a SPARQL query. However the labels are subject to change
over time, are in multiple languages and are not unique. It is simple to define
a mapping from IRDI to URI. This would guarantee uniqueness and persistence, but
does not give a meaningful URI (e.g. http://example.com/myDictionary/c_abc123).
Given the opacity axiom [3] does this really matter? I could imagine that one
could allow the editor of the dictionary to define slugs that would be used to
build the URI rather than generating it from the IRDI. These could be
optional and you might only define such a slug for the most commonly used terms.
Alternatively one could define these as aliases with additional statements
defining some equivalence links (perhaps using owl:sameAs, owl:equivalentClass
and owl:equivalentProperty).

<http://example.com/myDictionary/c_abc123> owl:equivalentClass
<http://example.com/myDictionary/Person> .

Has anyone ever tried such an approach?
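
As a rough sketch of what I have in mind (Python, with hypothetical function
names; the character replacement rule is only an assumption, not something
defined by ISO 13584), the opaque URI is always generated and the alias
statement is only emitted when the editor has defined a slug:

```python
import re

BASE = "http://example.com/myDictionary/"  # placeholder base URI

def opaque_uri(identifier: str) -> str:
    """Build an opaque URI; characters outside [A-Za-z0-9_-] become '_'."""
    return BASE + re.sub(r"[^A-Za-z0-9_-]", "_", identifier)

def alias_statement(identifier: str, slug: str = None, kind: str = "class"):
    """Return a Turtle statement linking the opaque URI to its slug alias.

    Returns None when no slug was defined for the term.
    """
    if slug is None:
        return None
    predicate = {
        "class": "owl:equivalentClass",
        "property": "owl:equivalentProperty",
    }[kind]
    return f"<{opaque_uri(identifier)}> {predicate} <{BASE}{slug}> ."

print(alias_statement("c_abc123", "Person"))
print(alias_statement("c_def456"))  # no slug defined, so no alias
```

The first call reproduces the statement above; terms without a slug simply keep
their opaque URI.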

4. Versioning: The IRDI includes a version identifier where there are clearly
defined rules about what type of change can be done within a version (e.g.
editorial changes), what can be done as a version change (e.g. upward-compatible
change) and what requires a new identifier (breaking change). I was thinking of
excluding this version identifier from the URI, but perhaps (if needed) exposing
the different versions/states of the resource using Memento [4]. Any experiences
with using such an approach?
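
A minimal sketch of the two pieces, in Python. The function names are mine, and
the sample IRDI is only an illustrative value assuming the usual RAI#DI#VI
layout. The first function drops the version before the URI is minted; the
second builds the Accept-Datetime header a client would send to a Memento
TimeGate to ask for the state at a given time:

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def split_version(irdi: str):
    """Split 'RAI#DI#VI' into the unversioned part and the version identifier."""
    unversioned, sep, version = irdi.rpartition("#")
    if not sep or "#" not in unversioned:
        raise ValueError(f"expected RAI#DI#VI, got {irdi!r}")
    return unversioned, version

def memento_headers(when: datetime) -> dict:
    """Build the request header for datetime negotiation with a TimeGate."""
    if when.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    # Accept-Datetime uses the HTTP date format, always expressed in GMT.
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {"Accept-Datetime": stamp}

print(split_version("0173-1#02-AAO677#002"))
print(memento_headers(datetime(2014, 7, 23, 11, 23, 10, tzinfo=timezone.utc)))
```

How the server maps a requested datetime onto a particular IRDI version is left
open here; that would depend on recording when each version was published.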

5. Serving representations: Maybe this is a moot point, but I would consider the
'things' described in the dictionary to be abstract entities, so with slash URIs
the server would give a 303 response. The response would include a redirect to
the information resource, which would use conneg to serve the different
representations/states of that resource. However I do not see this practice
widely used for other RDF schemas. Any reason why?
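
To spell out the interaction I mean, here is a rough Python sketch of the two
responses. The /doc/ path, the file extensions and the function names are all
assumptions of mine, and the negotiation is deliberately naive (it takes the
first supported media type in the Accept header and ignores q-values):

```python
BASE = "http://example.com/myDictionary/"
DOC_BASE = BASE + "doc/"  # hypothetical location of the information resources

# Supported media types and an assumed file extension for each.
FORMATS = {
    "text/turtle": ".ttl",
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
}

def respond_term(term: str):
    """Term URIs name abstract things, so redirect to the describing document."""
    return 303, {"Location": DOC_BASE + term}

def respond_doc(term: str, accept: str = "text/html"):
    """Serve the document, choosing a representation from the Accept header."""
    chosen = "text/html"  # fallback when nothing in Accept is supported
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in FORMATS:
            chosen = media_type
            break
    return 200, {
        "Content-Type": chosen,
        "Content-Location": DOC_BASE + term + FORMATS[chosen],
        "Vary": "Accept",
    }

print(respond_term("c_abc123"))
print(respond_doc("c_abc123", "text/turtle, text/html;q=0.5"))
```

So a client always pays for two round trips per term, which may be part of the
answer to my own question.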
 
[1] http://philarcher.org/diary/2013/uripersistence/
[2] http://wiki.eclass.eu/wiki/IRDI
[3] http://www.w3.org/DesignIssues/Axioms.html#opaque
[4] http://mementoweb.org/
 
Regards,

John Walker
Received on Wednesday, 23 July 2014 11:23:10 UTC