- From: Martynas Jusevičius <martynas@atomgraph.com>
- Date: Tue, 15 Nov 2022 16:32:30 +0100
- To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Cc: semantic-web@w3.org, patm@inrupt.com
W3C wiki suggests that both approaches are OK:
https://www.w3.org/wiki/HashVsSlash

On Tue, Nov 15, 2022 at 4:22 PM Antoine Zimmermann
<antoine.zimmermann@emse.fr> wrote:
>
> Dear semwebbers,
>
>
> Sorry to follow up on this conversation a little late but I'd like to
> add a few things, hopefully worth more than 2 cents.
>
> Overall, +1 to recommend, as a best practice, using a slash-based
> namespace for vocabs.
>
> A few comments regarding:
>   1. number of HTTP lookups
>   2. simplicity
>   3. "ontology terms don't mean much outside the context of the whole
> ontology"
>   4. Using hashes for other things
>   5. httpRange14
>   6. modularisation
>
> Regarding 1., real life experiments would have to be made because there
> are good reasons to think that, from a network perspective as a whole,
> slash IRIs are not an issue at all. In most cases, applications know
> what term to look for in data, and already know the ontologies that
> correspond to those terms. Only very rarely would an application crawl
> the Web from data documents to term documents to ontology documents.
> This would be very inefficient in most cases, and telling the world to
> use slashes rather than hashes (or single-term documentation rather than
> full ontology documentation) would only marginally affect this
> inefficiency, if at all (IMHO). With hash-based namespaces, there is
> also a potential for inefficient use of network, e.g. when caching is
> not possible for some reason.
>
> Regarding 2., yes, hash-based namespaces are simpler to set up and
> publish. But they are difficult to work with in the long term. In
> professional projects, there are tons of things that are cumbersome if
> applied to simple personal tasks, such as setting up a version control
> system for every piece of code or document one is authoring; or applying
> full-fledged collaborative development methodology for your hobby short
> novel writing.
> The burden of setting up a server with proper URI
> redirection is minuscule if you think of ontology development as a type
> of professional software project. It seems to me that justifying
> hash-based namespaces based on their simplicity is aiming at the lowest
> possible quality requirement.
>
> Regarding 3., why would this be a problem for ontologies and not for
> other kinds of linked data or knowledge graphs? In object oriented
> software development, a single class does not mean anything in
> isolation, yet most often each class is defined in a separate file. If
> you access these files on the Web (say, via GitHub), you don't have the
> context outside the class, and your class cannot even function without
> the other classes it relates to. Why wouldn't it be a problem too if it
> is a problem for ontologies? But in reality, it is not a problem because
> you can always download the whole package, the same as you can download
> the whole ontology from its ontology IRI. It should be rather easy to be
> directed to the whole ontology file when necessary, and yet allow one to
> simply get a documentation of a single term.
>
> Regarding 4., with slash-based namespaces, hashes can be used for other
> useful things. E.g., there could be a fragment of a term specification
> that provides usage examples like http://onto.org/Term#example (this is
> done in schema.org). There could be a section about history (e.g. when
> the term was added to the vocab, version info about the term itself
> http://onto.org/Term#history) or metadata (who created the term
> http://onto.org/Term#meta, related GitHub issues and discussions, etc.).
>
> Regarding 5., if a term like http://myonto.org/Person denotes a class of
> people, then it certainly isn't an information resource. However, if GET
> http://myonto.org/Person responds with a 200 OK, then, by httpRange14
> resolution, the IRI must denote an information resource.
> A solution is
> to redirect to another IRI, say http://myonto.org/doc/Person, but this
> means yet another HTTP lookup. Instead, one could use
> http://myonto.org/Person# as an identifier for the term, and
> http://myonto.org/Person as an identifier for the RDF document that
> defines the term. Then it's using the best of both worlds: a slash-based
> namespace with a hash IRI.
>
> Regarding 6., with slash IRIs, the ontology can be modularised while
> preserving a single namespace. There can be modules
> http://myonto.org/module1 and http://myonto.org/module2 that each
> provide a distinct ontology using the same namespace http://myonto.org/
> for all terms, and, assuming a single slash-based namespace ont:
>
> #In ont:Term1 file:
> ont:Term1  rdfs:isDefinedBy  ont:module1 .
>
> #In ont:Term2 file:
> ont:Term2  rdfs:isDefinedBy  ont:module2 .
>
> It is also possible to redirect ont:Term1 to module1, and ont:Term2 to
> module2, if the ontology owner prefers to serve the whole module
> instead. Then there can be a global ontology document:
>
> ont:  a  owl:Ontology;
>    owl:imports  ont:module1, ont:module2 .
>
> This last option was an idea by my colleague Maxime Lefrançois who
> implemented it in the Smart Energy Aware Systems ontology:
> https://w3id.org/seas/
>
>
> Given the many advantages I see, with tiny drawbacks, I can't understand
> how not recommending slash-based namespaces for vocabs could be a
> tenable position.
>
>
> Best,
> --AZ
>
> On 06/10/2022 at 16:10, Pat McBennett wrote:
> > So (I think!) I know all the pros and cons of using either a trailing
> > slash or a trailing hash for vocab namespace IRIs. Basically it boils down
> > to hashes meaning you'll always get info on all the terms in a vocabulary,
> > even if you only want info for one specific term, whereas using a slash
> > means I can always get just the info for any specific, individual term I
> > request.
> >
> > Note: using slashes provides the ability to get the best of both worlds -
> > i.e., small responses when explicitly asking for info on just one term, but
> > if you want info for all the terms in one HTTP response, then just serve up
> > that complete vocab response when the base namespace IRI itself is
> > dereferenced.
> >
> > Here's a nice simple illustration of the basic difference:
> > - Slash: QUDT's 'CurrencyUnit' term (i.e., click on
> > 'https://qudt.org/schema/qudt/CurrencyUnit') and you get a nice clean,
> > concise, and precise set of info on just the one term you asked for -
> > lovely!
> >
> > - Hash: DPV's 'JointDataControllers' (i.e., click on
> > 'https://w3id.org/dpv#JointDataControllers') and you get bombarded with a
> > huge document, with a daunting Table of Contents on the left, and info on
> > hundreds of other terms that I didn't ask for, and so had no interest in
> > whatsoever (don't get me wrong - this is fantastically detailed and
> > potentially very useful information, but it's simply not what I asked for!).
> >
> > So based on the greater flexibility and future-proofing of using slash
> > (i.e., it offers the best of both worlds, whereas hash is forever limited),
> > I've become firmly of the opinion that slashes are just 'better' than
> > hashes, and in fact are simply 'more correct' (i.e., all IRIs should be
> > uniquely dereferenceable).
> >
> > I also think the distinction is critically important when creating
> > vocabularies intended for widespread and long-lasting use (such as the DPV
> > vocab above). For throw-away or pet projects, sure, it doesn't really
> > matter (yet even then, I still think slashes are the 'more correct' option).
> >
> > I know that the convention from the W3C has tended to be to use hashes, but
> > I think in hindsight that was a mistake, and that the advice from the
> > Semantic Web community as a whole should now be to adopt slashes
> > consistently for all new vocabularies.
> > (And it's not like using slash has
> > no precedent - major 'authoritative' vocabs like QUDT, Schema.org, gist,
> > SOSA, SSN, (even the venerable FOAF!) all use slash).
> >
> > I'd love to hear this group's thoughts. (For reference, I did ask the gist
> > community if they recorded their discussions around their decision (in
> > 2019) to formally switch gist from hash to slash (here
> > <https://github.com/semanticarts/gist/issues/725>), but it seems they
> > weren't recorded, and I've also raised the issue with the DPV group
> > directly too (here <https://github.com/w3c/dpv/issues/53>)).
> >
> > Cheers,
> >
> > Pat.
> >
> > *Pat McBennett*, Technical Architect
> >
> > Contact | patm@inrupt.com
> >
> > Connect | WebID <http://pmcb55.inrupt.net/profile/card#me>, GitHub
> > <https://github.com/pmcb55>
> >
> > Explore | www.inrupt.com
> >
>
> --
> Antoine Zimmermann
> École des Mines de Saint-Étienne
> 158 cours Fauriel
> CS 62362
> 42023 Saint-Étienne Cedex 2
> France
> Tél:+33(0)4 77 49 97 02
> http://www.emse.fr/~zimmermann/
>
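
For reference, the dereferencing difference discussed in this thread can be
sketched in a few lines of Python. This is only an illustration, not part of
the original messages: the helper name document_iri is hypothetical, and the
sketch relies solely on the fact that an HTTP client drops the fragment (the
part from '#' onwards) before sending a request. The IRIs are the examples
quoted above.

```python
from urllib.parse import urldefrag


def document_iri(term_iri: str) -> str:
    """Return the IRI an HTTP client actually requests for a term IRI.

    The fragment is never sent to the server, so it is stripped here.
    (document_iri is a hypothetical helper name used for illustration.)
    """
    return urldefrag(term_iri).url


# Example IRIs taken from the thread.
terms = [
    "https://qudt.org/schema/qudt/CurrencyUnit",   # slash-based term
    "https://w3id.org/dpv#JointDataControllers",   # hash-based term
    "http://onto.org/Term#example",                # hash used on a slash term
    "http://onto.org/Term#history",
]

for term in terms:
    # Hash terms collapse onto one shared document; slash terms keep
    # one requestable document per term.
    print(f"{term} -> GET {document_iri(term)}")
```

With a hash namespace every term resolves to the same document, so the server
cannot return a per-term response. With a slash namespace each term has its
own request target, and fragments remain free for sections such as #example
or #history, as suggested in point 4 above.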
Received on Tuesday, 15 November 2022 15:32:54 UTC