Re: (Lost in the noise perhaps - so asking again) - Is a trailing slash 'better' than a trailing hash for vocabs namespace IRIs? from Antoine Zimmermann on 2022-11-16 (semantic-web@w3.org from November 2022)

From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Date: Wed, 16 Nov 2022 12:07:47 +0100
To: semantic-web@w3.org
Message-ID: <0a1cf00a-906c-42de-c678-a9fe9734fd52@emse.fr>
I updated the wiki page a bit with links to discussions and documents 
related to the topic.

--AZ

Le 16/11/2022 à 03:31, David Booth a écrit :
> On 11/15/22 10:53, Antoine Zimmermann wrote:
>> I could very well edit the [wiki] page that you link to and say that
>> hash IRIs are not OK; then the W3C wiki would suggest that *not* both
>> are OK.
> 
> Please do, but inclusively and respectfully, and with the explanations 
> that you gave.  That's what a wiki is for!
> 
> Thanks,
> David Booth
> 
>>
>> --AZ
>>
>> Le 15/11/2022 à 16:32, Martynas Jusevičius a écrit :
>>> W3C wiki suggests that both approaches are OK:
>>> https://www.w3.org/wiki/HashVsSlash
>>>
>>> On Tue, Nov 15, 2022 at 4:22 PM Antoine Zimmermann
>>> <antoine.zimmermann@emse.fr> wrote:
>>>>
>>>> Dear semwebbers,
>>>>
>>>>
>>>> Sorry to follow up on this conversation a little late but I'd like to
>>>> add a few things, hopefully worth more than 2 cents.
>>>>
>>>> Overall, +1 to recommend, as a best practice, using a slash-based
>>>> namespace for vocabs.
>>>>
>>>> A few comments regarding:
>>>>    1. number of HTTP lookups
>>>>    2. simplicity
>>>>    3. "ontology terms don't mean much outside the context of the whole
>>>> ontology"
>>>>    4. Using hashes for other things
>>>>    5. httpRange-14
>>>>    6. modularisation
>>>>
>>>> Regarding 1., real-life experiments would have to be run, because there
>>>> are good reasons to think that, from a network perspective as a whole,
>>>> slash IRIs are not an issue at all. In most cases, applications know
>>>> what term to look for in data, and already know the ontologies that
>>>> correspond to those terms. Only very rarely would an application crawl
>>>> the Web from data documents to term documents to ontology documents.
>>>> This would be very inefficient in most cases, and telling the world to
>>>> use slashes rather than hashes (or single-term documentation rather
>>>> than full ontology documentation) would only marginally affect this
>>>> inefficiency, if at all (IMHO). With hash-based namespaces, there is
>>>> also a potential for inefficient use of the network, e.g. when caching
>>>> is not possible for some reason.
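
To make the lookup-count question concrete, here is a minimal Python sketch. It assumes a client that caches whole documents, and the example.org IRIs and the function name are made up for illustration. It only counts distinct documents, and says nothing about response sizes or real network behaviour, which is exactly what the real-life experiments would have to measure.

```python
from urllib.parse import urldefrag

def documents_to_fetch(term_iris):
    """Return the distinct document URLs a caching client would request."""
    seen = []
    for iri in term_iris:
        # The fragment is never sent to the server, so strip it first.
        doc, _fragment = urldefrag(iri)
        if doc not in seen:
            seen.append(doc)
    return seen

# Hypothetical vocab published twice: once with a hash, once with a slash.
hash_terms = ["http://example.org/onto#Person", "http://example.org/onto#Agent"]
slash_terms = ["http://example.org/onto/Person", "http://example.org/onto/Agent"]

print(len(documents_to_fetch(hash_terms)))   # 1
print(len(documents_to_fetch(slash_terms)))  # 2
```

With caching, the hash variant costs one request however many terms are used. Without caching, every hash term triggers a download of the full document again.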
>>>>
>>>> Regarding 2., yes, hash-based namespaces are simpler to set up and
>>>> publish. But they are difficult to work with in the long term. In
>>>> professional projects, there are tons of things that are cumbersome if
>>>> applied to simple personal tasks, such as setting up a version control
>>>> system for every piece of code or document one is authoring, or
>>>> applying a full-fledged collaborative development methodology for your
>>>> hobby short
>>>> novel writing. The burden of setting up a server with proper URI
>>>> redirection is minuscule if you think of ontology development as a type
>>>> of professional software project. It seems to me that justifying
>>>> hash-based namespaces by their simplicity is aiming at the lowest
>>>> possible quality requirement.
>>>>
>>>> Regarding 3., why would this be a problem for ontologies and not for
>>>> other kinds of linked data or knowledge graphs? In object oriented
>>>> software development, a single class does not mean anything in
>>>> isolation, yet most often each class is defined in a separate file. If
>>>> you access these files on the Web (say, via GitHub), you don't have the
>>>> context outside the class, and your class cannot even function without
>>>> the other classes it relates to. If lack of context is a problem for
>>>> ontologies, why wouldn't it be a problem here too? In reality, it is
>>>> not a problem, because you can always download the whole package, just
>>>> as you can download the whole ontology from its ontology IRI. It should
>>>> be rather easy to be directed to the whole ontology file when
>>>> necessary, and yet still be able to get the documentation of a single
>>>> term.
>>>>
>>>> Regarding 4., with slash-based namespaces, hashes can be used for other
>>>> useful things. E.g., there could be a fragment of a term specification
>>>> that provides usage examples, like http://onto.org/Term#example (this is
>>>> done in schema.org). There could be a section about history, at
>>>> http://onto.org/Term#history (e.g. when the term was added to the
>>>> vocab, version info about the term itself), or about metadata, at
>>>> http://onto.org/Term#meta (who created the term, related GitHub issues
>>>> and discussions, etc.).
>>>>
>>>> Regarding 5., if a term like http://myonto.org/Person denotes a class
>>>> of people, then it certainly isn't an information resource. However, if
>>>> GET http://myonto.org/Person responds with a 200 OK, then, by the
>>>> httpRange-14 resolution, the IRI must denote an information resource. A
>>>> solution is to redirect (303 See Other) to another IRI, say
>>>> http://myonto.org/doc/Person, but this means yet another HTTP lookup.
>>>> Instead, one could use http://myonto.org/Person# as an identifier for
>>>> the term, and http://myonto.org/Person as an identifier for the RDF
>>>> document that defines the term. This gets the best of both worlds: a
>>>> slash-based namespace with a hash IRI.
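
The reasoning in point 5 can be sketched as a small Python function. This is a simplified reading of the httpRange-14 resolution, not a normative implementation: the function name and return strings are invented for illustration, and any status other than 2xx or 303 is lumped together as "unknown".

```python
def httprange14_reading(iri: str, status: int) -> str:
    """Rough reading of what `iri` may denote, given the status code of a
    GET on its fragment-less form (httpRange-14 resolution, simplified)."""
    _doc, sep, _frag = iri.partition("#")
    if sep:
        # Hash IRI: the GET targets the document IRI, so the response
        # places no constraint on what the hash IRI itself denotes.
        return "any resource"
    if 200 <= status < 300:
        # A 2xx answer means the IRI identifies an information resource.
        return "information resource"
    if status == 303:
        # A 303 redirect leaves the nature of the resource open.
        return "any resource"
    # Simplification: everything else is treated as undetermined.
    return "unknown"

print(httprange14_reading("http://myonto.org/Person", 200))   # information resource
print(httprange14_reading("http://myonto.org/Person", 303))   # any resource
print(httprange14_reading("http://myonto.org/Person#", 200))  # any resource
```

The last call is the "best of both worlds" case: http://myonto.org/Person can answer 200 OK as a document, while http://myonto.org/Person# stays free to denote the class.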
>>>>
>>>> Regarding 6., with slash IRIs, the ontology can be modularised while
>>>> preserving a single namespace. There can be modules
>>>> http://myonto.org/module1 and http://myonto.org/module2 that each
>>>> provide a distinct ontology while using the same namespace
>>>> http://myonto.org/ for all terms. Assuming ont: is the prefix for that
>>>> single slash-based namespace:
>>>>
>>>> #In ont:Term1 file:
>>>> ont:Term1 rdfs:isDefinedBy ont:module1 .
>>>>
>>>> #In ont:Term2 file:
>>>> ont:Term2 rdfs:isDefinedBy ont:module2 .
>>>>
>>>> It is also possible to redirect ont:Term1 to module1, and ont:Term2 to
>>>> module2, if the ontology owner prefers to serve the whole module
>>>> instead. Then there can be a global ontology document:
>>>>
>>>> ont: a owl:Ontology;
>>>>    owl:imports ont:module1, ont:module2 .
>>>>
>>>> This last option was an idea by my colleague Maxime Lefrançois who
>>>> implemented it in the Smart Energy Aware Systems ontology:
>>>> https://w3id.org/seas/
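
As a rough illustration of the modular layout in point 6, the following Python sketch generates the same kind of statements from a term-to-module table. The table, the function names and the use of full IRIs instead of the ont: prefix are assumptions made for the example. It is not how SEAS is actually built.

```python
NS = "http://myonto.org/"

# Hypothetical assignment of terms to modules within one namespace.
TERM_TO_MODULE = {"Term1": "module1", "Term2": "module2"}

def is_defined_by(local_name: str) -> str:
    """Turtle statement linking a term to the module that defines it."""
    module = TERM_TO_MODULE[local_name]
    return f"<{NS}{local_name}> rdfs:isDefinedBy <{NS}{module}> ."

def global_ontology() -> str:
    """Turtle for a global ontology document importing every module."""
    modules = sorted(set(TERM_TO_MODULE.values()))
    imports = ", ".join(f"<{NS}{m}>" for m in modules)
    return f"<{NS}> a owl:Ontology ;\n    owl:imports {imports} ."

print(is_defined_by("Term1"))
print(global_ontology())
```

The same table could also drive the redirects from each term IRI to its module, if the owner prefers to serve whole modules.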
>>>>
>>>>
>>>> Given the many advantages I see, with only tiny drawbacks, I can't
>>>> understand how not recommending slash-based namespaces for vocabs could
>>>> be a tenable position.
>>>>
>>>>
>>>> Best,
>>>> --AZ
>>>>
>>>> Le 06/10/2022 à 16:10, Pat McBennett a écrit :
>>>>> So (I think!) I know all the pros and cons of using either a trailing
>>>>> slash or a trailing hash for vocab namespace IRIs. Basically it boils
>>>>> down to hashes meaning you'll always get info on all the terms in a
>>>>> vocabulary, even if you only want info for one specific term, whereas
>>>>> using a slash means you can always get just the info for any specific,
>>>>> individual term you request.
>>>>>
>>>>> Note: using slashes provides the ability to get the best of both
>>>>> worlds - i.e., small responses when explicitly asking for info on just
>>>>> one term, but if you want info for all the terms in one HTTP response,
>>>>> then just serve up that complete vocab response when the base namespace
>>>>> IRI itself is dereferenced.
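
The "best of both worlds" behaviour described above could look roughly like this in Python. It is only a sketch: the vocab.example namespace, the two terms and the in-memory dictionary standing in for a real server are all invented for the example.

```python
NAMESPACE = "https://vocab.example/"

# Hypothetical vocabulary: local term name -> description.
VOCAB = {
    "Person": "A human being.",
    "Agent": "Something that acts.",
}

def describe(iri: str):
    """Return the whole vocab for the namespace IRI, one term for a term
    IRI, or None when the IRI is not known."""
    if iri == NAMESPACE:
        return dict(VOCAB)  # complete vocab in a single response
    if iri.startswith(NAMESPACE):
        local = iri[len(NAMESPACE):]
        if local in VOCAB:
            return {local: VOCAB[local]}  # just the requested term
    return None

print(describe(NAMESPACE + "Person"))  # {'Person': 'A human being.'}
print(len(describe(NAMESPACE)))        # 2
```

With a hash namespace there is no equivalent of the second branch, since the server never sees the fragment and cannot tell which term was asked for.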
>>>>>
>>>>> Here's a nice simple illustration of the basic difference:
>>>>> - Slash: QUDT's 'CurrencyUnit' term (i.e., click on
>>>>> 'https://qudt.org/schema/qudt/CurrencyUnit') and you get a nice clean,
>>>>> concise, and precise set of info on just the one term you asked for -
>>>>> lovely!
>>>>>
>>>>> - Hash: DPV's 'JointDataControllers' (i.e., click on
>>>>> 'https://w3id.org/dpv#JointDataControllers') and you get bombarded with
>>>>> a huge document, with a daunting Table of Contents on the left, and
>>>>> info on hundreds of other terms that I didn't ask for, and so had no
>>>>> interest in whatsoever (don't get me wrong - this is fantastically
>>>>> detailed and potentially very useful information, but it's simply not
>>>>> what I asked for!).
>>>>>
>>>>> So based on the greater flexibility and future-proofing of using slash
>>>>> (i.e., it offers the best of both worlds, whereas hash is forever
>>>>> limited), I've become firmly of the opinion that slashes are just
>>>>> 'better' than hashes, and in fact are simply 'more correct' (i.e., all
>>>>> IRIs should be uniquely dereferenceable).
>>>>>
>>>>> I also think the distinction is critically important when creating
>>>>> vocabularies intended for widespread and long-lasting use (such as the
>>>>> DPV vocab above). For throw-away or pet projects, sure, it doesn't
>>>>> really matter (yet even then, I still think slashes are the 'more
>>>>> correct' option).
>>>>>
>>>>> I know that the convention from the W3C has tended to be to use hashes,
>>>>> but I think in hindsight that was a mistake, and that the advice from
>>>>> the Semantic Web community as a whole should now be to adopt slashes
>>>>> consistently for all new vocabularies. (And it's not like using slash
>>>>> has no precedent - major 'authoritative' vocabs like QUDT, Schema.org,
>>>>> gist, SOSA, SSN, and even the venerable FOAF, all use slash.)
>>>>>
>>>>> I'd love to hear this group's thoughts. (For reference, I did ask the
>>>>> gist community if they recorded their discussions around their decision
>>>>> (in 2019) to formally switch gist from hash to slash (here
>>>>> <https://github.com/semanticarts/gist/issues/725>), but it seems they
>>>>> weren't recorded, and I've also raised the issue with the DPV group
>>>>> directly (here <https://github.com/w3c/dpv/issues/53>).)
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Pat.
>>>>>
>>>>> *Pat McBennett*, Technical Architect
>>>>>
>>>>> Contact  | patm@inrupt.com
>>>>>
>>>>> Connect | WebID <http://pmcb55.inrupt.net/profile/card#me>, GitHub
>>>>> <https://github.com/pmcb55>
>>>>>
>>>>> Explore  | www.inrupt.com
>>>>>
>>>>
>>>> -- 
>>>> Antoine Zimmermann
>>>> École des Mines de Saint-Étienne
>>>> 158 cours Fauriel
>>>> CS 62362
>>>> 42023 Saint-Étienne Cedex 2
>>>> France
>>>> Tél:+33(0)4 77 49 97 02
>>>> http://www.emse.fr/~zimmermann/
>>>>
>>>
>>
> 

Received on Wednesday, 16 November 2022 11:08:37 UTC