Re: (Lost in the noise perhaps - so asking again) - Is a trailing slash 'better' than a trailing hash for vocabs namespace IRIs? from David Booth on 2022-11-16 (semantic-web@w3.org from November 2022)

From: David Booth <david@dbooth.org>
Date: Tue, 15 Nov 2022 21:31:15 -0500
To: semantic-web@w3.org
Message-ID: <8a9cf20f-0910-2f34-8ddd-e1a4b485b77d@dbooth.org>
On 11/15/22 10:53, Antoine Zimmermann wrote:
> I could very well edit the [wiki] page that you link to and say that 
> hash IRIs are not OK, then W3C wiki would suggest that *not* both are OK.

Please do, but inclusively and respectfully, and with the explanations 
that you gave.  That's what a wiki is for!

Thanks,
David Booth

> 
> --AZ
> 
> On 15/11/2022 at 16:32, Martynas Jusevičius wrote:
>> W3C wiki suggests that both approaches are OK:
>> https://www.w3.org/wiki/HashVsSlash
>>
>> On Tue, Nov 15, 2022 at 4:22 PM Antoine Zimmermann
>> <antoine.zimmermann@emse.fr> wrote:
>>>
>>> Dear semwebbers,
>>>
>>>
>>> Sorry to follow up on this conversation a little late but I'd like to
>>> add a few things, hopefully worth more than 2 cents.
>>>
>>> Overall, +1 to recommend, as a best practice, using a slash-based
>>> namespace for vocabs.
>>>
>>> A few comments regarding:
>>>    1. number of HTTP lookups
>>>    2. simplicity
>>>    3. "ontology terms don't mean much outside the context of the whole
>>> ontology"
>>>    4. Using hashes for other things
>>>    5. httpRange-14
>>>    6. modularisation
>>>
>>> Regarding 1., real-life experiments would have to be made because there
>>> are good reasons to think that, from a network perspective as a whole,
>>> slash IRIs are not an issue at all. In most cases, applications know
>>> what term to look for in data, and already know the ontologies that
>>> correspond to those terms. Only very rarely would an application crawl
>>> the Web from data documents to term documents to ontology documents.
>>> This would be very inefficient in most cases, and telling the world to
>>> use slashes rather than hashes (or single-term documentation rather than
>>> full ontology documentation) would only marginally affect this
>>> inefficiency, if at all (IMHO). With hash-based namespaces, there is
>>> also a potential for inefficient use of the network, e.g. when caching is
>>> not possible for some reason.
>>>
>>> Regarding 2., yes, hash-based namespaces are simpler to set up and
>>> publish. But they are difficult to work with in the long term. In
>>> professional projects, there are tons of things that would be cumbersome
>>> if applied to simple personal tasks, such as setting up a version control
>>> system for every piece of code or document one is authoring, or applying
>>> a full-fledged collaborative development methodology to your hobby short
>>> novel writing. The burden of setting up a server with proper URI
>>> redirection is minuscule if you think of ontology development as a type
>>> of professional software project. It seems to me that justifying
>>> hash-based namespaces by their simplicity is aiming at the lowest
>>> possible quality requirement.
>>>
>>> Regarding 3., why would this be a problem for ontologies and not for
>>> other kinds of linked data or knowledge graphs? In object oriented
>>> software development, a single class does not mean anything in
>>> isolation, yet most often each class is defined in a separate file. If
>>> you access these files on the Web (say, via GitHub), you don't have the
>>> context outside the class, and your class cannot even function without
>>> the other classes it relates to. If that is a problem for ontologies,
>>> why wouldn't it be a problem for code too? But in reality, it is not a
>>> problem because
>>> you can always download the whole package, the same as you can download
>>> the whole ontology from its ontology IRI. It should be rather easy to be
>>> directed to the whole ontology file when necessary, and yet allow one to
>>> simply get a documentation of a single term.
>>>
>>> Regarding 4., with slash-based namespaces, hashes can be used for other
>>> useful things. E.g., there could be a fragment of a term specification
>>> that provides usage examples like http://onto.org/Term#example (this is
>>> done in schema.org). There could be a section about history (e.g. when
>>> the term was added to the vocab, or version info about the term itself,
>>> at http://onto.org/Term#history) or about metadata (who created the
>>> term, related GitHub issues and discussions, etc., at
>>> http://onto.org/Term#meta).
>>>
>>> Regarding 5., if a term like http://myonto.org/Person denotes a class of
>>> people, then it certainly isn't an information resource. However, if GET
>>> http://myonto.org/Person responds with a 200 OK, then, by the httpRange-14
>>> resolution, the IRI must denote an information resource. A solution is
>>> to redirect to another IRI, say http://myonto.org/doc/Person, but this
>>> means yet another HTTP lookup. Instead, one could use
>>> http://myonto.org/Person# as an identifier for the term, and
>>> http://myonto.org/Person as an identifier for the RDF document that
>>> defines the term. Then it's using the best of both worlds: a slash-based
>>> namespace with a hash IRI.
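
A minimal sketch of the httpRange-14 point above, in Python with only the standard library. It shows that a client strips the fragment before sending the request, so the term IRI http://myonto.org/Person# and the document IRI http://myonto.org/Person share a single lookup, and it encodes what the httpRange-14 resolution lets a client infer from the response status. The function names are hypothetical and chosen only for illustration.

```python
from urllib.parse import urldefrag


def document_iri(iri: str) -> str:
    """Return the IRI actually sent in the HTTP request (fragment removed)."""
    return urldefrag(iri).url


def httprange14_inference(iri: str, status: int) -> str:
    """What a GET tells us about the resource named by `iri`, per httpRange-14.

    Hypothetical helper: the rule applies to the IRI that was requested, so a
    hash IRI (whose fragment is never sent to the server) stays unconstrained.
    """
    if "#" in iri:
        return "unconstrained"         # only the document IRI was dereferenced
    if 200 <= status < 300:
        return "information resource"  # 2xx: the IRI denotes a document
    if status == 303:
        return "any resource"          # 303 See Other: could denote anything
    return "unknown"                   # other statuses: nature not determined


term = "http://myonto.org/Person#"
print(document_iri(term))
print(httprange14_inference(term, 200))
print(httprange14_inference("http://myonto.org/Person", 200))
```

With this reading, a 200 OK on http://myonto.org/Person constrains only the document, and the class of people can be named by http://myonto.org/Person# without a redirect.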
>>>
>>> Regarding 6., with slash IRIs, the ontology can be modularised while
>>> preserving a single namespace. There can be modules
>>> http://myonto.org/module1 and http://myonto.org/module2 that each
>>> provide a distinct ontology but use the same namespace http://myonto.org/
>>> for all terms. For example, assuming a single slash-based namespace ont:
>>>
>>> #In ont:Term1 file:
>>> ont:Term1 rdfs:isDefinedBy ont:module1 .
>>>
>>> #In ont:Term2 file:
>>> ont:Term2 rdfs:isDefinedBy ont:module2 .
>>>
>>> It is also possible to redirect ont:Term1 to module1, and ont:Term2 to
>>> module2, if the ontology owner prefers to serve the whole module
>>> instead. Then there can be a global ontology document:
>>>
>>> ont: a owl:Ontology;
>>>    owl:imports ont:module1, ont:module2 .
>>>
>>> This last option was an idea by my colleague Maxime Lefrançois who
>>> implemented it in the Smart Energy Aware Systems ontology:
>>> https://w3id.org/seas/
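
A rough sketch of the modular layout described above, in plain Python. The term and module IRIs are the ones from the example. The helper names, and the choice to fall back to the global ontology document for a term with no recorded module, are assumptions made for illustration and are not taken from SEAS.

```python
ONT = "http://myonto.org/"

# rdfs:isDefinedBy statements from the example, as term -> module.
IS_DEFINED_BY = {
    ONT + "Term1": ONT + "module1",
    ONT + "Term2": ONT + "module2",
}


def redirect_target(term_iri: str) -> str:
    """Module document a term IRI would redirect to.

    Assumption: terms with no recorded module fall back to the global
    ontology document at the namespace IRI.
    """
    return IS_DEFINED_BY.get(term_iri, ONT)


def imported_modules(is_defined_by: dict) -> list:
    """Modules the global ontology document would list under owl:imports."""
    return sorted(set(is_defined_by.values()))


print(redirect_target(ONT + "Term1"))
print(imported_modules(IS_DEFINED_BY))
```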
>>>
>>>
>>> Given the many advantages I see, with tiny drawbacks, I can't understand
>>> how not recommending slash-based namespaces for vocabs could be a tenable
>>> position.
>>>
>>>
>>> Best,
>>> --AZ
>>>
>>> On 06/10/2022 at 16:10, Pat McBennett wrote:
>>>> So (I think!) I know all the pros and cons of using either a trailing
>>>> slash or a trailing hash for vocab namespace IRIs. Basically it boils
>>>> down to hashes meaning you'll always get info on all the terms in a
>>>> vocabulary, even if you only want info for one specific term, whereas
>>>> using a slash means I can always get just the info for any specific,
>>>> individual term I request.
>>>>
>>>> Note: using slashes provides the ability to get the best of both worlds -
>>>> i.e., small responses when explicitly asking for info on just one term,
>>>> but if you want info for all the terms in one HTTP response, then just
>>>> serve up that complete vocab response when the base namespace IRI itself
>>>> is dereferenced.
>>>>
>>>> Here's a nice simple illustration of the basic difference:
>>>> - Slash: QUDT's 'CurrencyUnit' term (i.e., click on
>>>> 'https://qudt.org/schema/qudt/CurrencyUnit') and you get a nice clean,
>>>> concise, and precise set of info on just the one term you asked for -
>>>> lovely!
>>>>
>>>> - Hash: DPV's 'JointDataControllers' (i.e., click on
>>>> 'https://w3id.org/dpv#JointDataControllers') and you get bombarded with
>>>> a huge document, with a daunting Table of Contents on the left, and info
>>>> on hundreds of other terms that I didn't ask for, and so had no interest
>>>> in whatsoever (don't get me wrong - this is fantastically detailed and
>>>> potentially very useful information, but it's simply not what I asked
>>>> for!).
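
The difference between the two examples above can be sketched in Python (standard library only). The fragment of a hash IRI is never sent to the server, so every term in a hash namespace resolves to the same document, while each slash IRI is requested on its own. The two "OtherTerm" IRIs are hypothetical placeholders added for illustration and have not been checked against either vocabulary.

```python
from collections import defaultdict
from urllib.parse import urldefrag


def group_by_request(term_iris):
    """Map each URL a client would actually request to the terms it covers."""
    requests = defaultdict(list)
    for iri in term_iris:
        # urldefrag drops the '#fragment' part, which is never sent over HTTP.
        requests[urldefrag(iri).url].append(iri)
    return dict(requests)


terms = [
    "https://w3id.org/dpv#JointDataControllers",
    "https://w3id.org/dpv#OtherTerm",           # hypothetical term name
    "https://qudt.org/schema/qudt/CurrencyUnit",
    "https://qudt.org/schema/qudt/OtherTerm",   # hypothetical term name
]

for url, covered in group_by_request(terms).items():
    print(url, "<-", len(covered), "term(s)")
```

Both hash terms collapse onto the single request for https://w3id.org/dpv, which is why the whole document comes back, whereas the slash terms each map to their own request.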
>>>>
>>>> So based on the greater flexibility and future-proofing of using slash
>>>> (i.e., it offers the best of both worlds, whereas hash is forever
>>>> limited), I've become firmly of the opinion that slashes are just
>>>> 'better' than hashes, and in fact are simply 'more correct' (i.e., all
>>>> IRIs should be uniquely dereferenceable).
>>>>
>>>> I also think the distinction is critically important when creating
>>>> vocabularies intended for widespread and long-lasting use (such as the
>>>> DPV vocab above). For throw-away or pet projects, sure, it doesn't really
>>>> matter (yet even then, I still think slashes are the 'more correct'
>>>> option).
>>>>
>>>> I know that the convention from the W3C has tended to be to use hashes,
>>>> but I think in hindsight that was a mistake, and that the advice from the
>>>> Semantic Web community as a whole should now be to adopt slashes
>>>> consistently for all new vocabularies. (And it's not like using slash has
>>>> no precedent - major 'authoritative' vocabs like QUDT, Schema.org, gist,
>>>> SOSA, SSN (and even the venerable FOAF!) all use slash).
>>>>
>>>> I'd love to hear this group's thoughts. (For reference, I did ask the
>>>> gist community if they recorded their discussions around their decision
>>>> (in 2019) to formally switch gist from hash to slash (here
>>>> <https://github.com/semanticarts/gist/issues/725>), but it seems they
>>>> weren't recorded, and I've also raised the issue with the DPV group
>>>> directly (here <https://github.com/w3c/dpv/issues/53>)).
>>>>
>>>> Cheers,
>>>>
>>>> Pat.
>>>>
>>>> *Pat McBennett*, Technical Architect
>>>>
>>>> Contact  | patm@inrupt.com
>>>>
>>>> Connect | WebID <http://pmcb55.inrupt.net/profile/card#me>, GitHub
>>>> <https://github.com/pmcb55>
>>>>
>>>> Explore  | www.inrupt.com
>>>>
>>>
>>> -- 
>>> Antoine Zimmermann
>>> École des Mines de Saint-Étienne
>>> 158 cours Fauriel
>>> CS 62362
>>> 42023 Saint-Étienne Cedex 2
>>> France
>>> Tél:+33(0)4 77 49 97 02
>>> http://www.emse.fr/~zimmermann/
>>>
>>
>
Received on Wednesday, 16 November 2022 02:31:33 UTC