- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Tue, 15 Nov 2022 16:16:24 +0100
- To: semantic-web@w3.org
- Cc: patm@inrupt.com
Dear semwebbers,

Sorry to follow up on this conversation a little late, but I'd like to add a few things, hopefully worth more than 2 cents.

Overall, +1 to recommending, as a best practice, the use of a slash-based namespace for vocabs.

A few comments regarding:
 1. number of HTTP lookups
 2. simplicity
 3. "ontology terms don't mean much outside the context of the whole ontology"
 4. using hashes for other things
 5. httpRange-14
 6. modularisation

Regarding 1., real-life experiments would have to be made, because there are good reasons to think that, from the perspective of the network as a whole, slash IRIs are not an issue at all. In most cases, applications know what terms to look for in data, and already know the ontologies that correspond to those terms. Only very rarely would an application crawl the Web from data documents to term documents to ontology documents. This would be very inefficient in most cases, and telling the world to use slashes rather than hashes (or single-term documentation rather than full ontology documentation) would only marginally affect this inefficiency, if at all (IMHO). With hash-based namespaces, there is also a potential for inefficient use of the network: when caching is not possible for some reason, every term lookup downloads the whole ontology document again.

Regarding 2., yes, hash-based namespaces are simpler to set up and publish. But they are difficult to work with in the long term. In professional projects, there are tons of things that would be cumbersome if applied to simple personal tasks, such as setting up a version control system for every piece of code or document one is authoring, or applying a full-fledged collaborative development methodology to a hobby short novel. The burden of setting up a server with proper URI redirection is minuscule if you think of ontology development as a type of professional software project. It seems to me that justifying hash-based namespaces by their simplicity is aiming at the lowest possible quality requirement.
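
As a rough illustration of what point 1 is counting (not a measurement), the sketch below lists the distinct documents an agent would have to GET in order to look up a set of terms. The namespace http://onto.org/ns and the three term names are made up for the example, and it assumes perfect caching and no redirects, which is exactly what real-life experiments would have to check.

```python
from urllib.parse import urldefrag


def documents_to_fetch(term_iris):
    """Distinct document URLs to GET in order to look up every term.

    The fragment part of an IRI is never sent to the server, so all
    hash IRIs in one namespace resolve to the same document.
    """
    return {urldefrag(iri).url for iri in term_iris}


# Hypothetical vocabulary, published once with '#' and once with '/'.
TERMS = ("Person", "Agent", "knows")
hash_terms = ["http://onto.org/ns#" + t for t in TERMS]
slash_terms = ["http://onto.org/ns/" + t for t in TERMS]

if __name__ == "__main__":
    print(len(documents_to_fetch(hash_terms)), "document(s) for hash IRIs")
    print(len(documents_to_fetch(slash_terms)), "document(s) for slash IRIs")
```

The count alone says nothing about response sizes: the single hash document carries every term of the vocabulary, while each slash document can carry just one.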
Regarding 3., why would this be a problem for ontologies and not for other kinds of linked data or knowledge graphs? In object-oriented software development, a single class does not mean anything in isolation, yet most often each class is defined in a separate file. If you access these files on the Web (say, via GitHub), you don't have the context outside the class, and your class cannot even function without the other classes it relates to. If this is a problem for ontologies, why wouldn't it be a problem there too? But in reality, it is not a problem, because you can always download the whole package, just as you can download the whole ontology from its ontology IRI. It should be rather easy to be directed to the whole ontology file when necessary, and yet allow one to simply get the documentation of a single term.

Regarding 4., with slash-based namespaces, hashes can be used for other useful things. E.g., there could be a fragment of a term specification that provides usage examples, like http://onto.org/Term#example (this is done in schema.org). There could be a section about history at http://onto.org/Term#history (e.g. when the term was added to the vocab, version info about the term itself) or about metadata at http://onto.org/Term#meta (who created the term, related GitHub issues and discussions, etc.).

Regarding 5., if a term like http://myonto.org/Person denotes a class of people, then it certainly isn't an information resource. However, if GET http://myonto.org/Person responds with a 200 OK, then, by the httpRange-14 resolution, the IRI must denote an information resource. A solution is to redirect to another IRI, say http://myonto.org/doc/Person, but this means yet another HTTP lookup. Instead, one could use http://myonto.org/Person# as an identifier for the term, and http://myonto.org/Person as an identifier for the RDF document that defines the term. Then it's using the best of both worlds: a slash-based namespace with a hash IRI.
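
The trade-off in point 5 can be sketched as follows. This is a minimal simulation, not real HTTP: the two tables are hypothetical stand-ins for two ways the server behind http://myonto.org/ might be configured, and the choice of a 303 status for the redirect is an assumption, since the text above only says "redirect".

```python
from urllib.parse import urldefrag

# Hypothetical setup A: the term IRI redirects to a separate document.
# Each entry maps URL -> (status code, Location header or None).
REDIRECT_SETUP = {
    "http://myonto.org/Person": (303, "http://myonto.org/doc/Person"),
    "http://myonto.org/doc/Person": (200, None),
}

# Hypothetical setup B: the term is http://myonto.org/Person# and the
# document http://myonto.org/Person is served directly.
HASH_SETUP = {
    "http://myonto.org/Person": (200, None),
}


def dereference(iri, server, max_hops=5):
    """Return (final document URL, number of GET requests) for an IRI."""
    # Clients strip the fragment before sending the request.
    url, _fragment = urldefrag(iri)
    gets = 0
    for _ in range(max_hops):
        status, location = server[url]
        gets += 1
        if status == 200:
            return url, gets
        url = location  # follow the redirect
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    print(dereference("http://myonto.org/Person", REDIRECT_SETUP))
    print(dereference("http://myonto.org/Person#", HASH_SETUP))
```

In setup A the client needs two requests to reach the description of the term; in setup B it needs one, and the IRI that answers 200 OK is the document's, not the class's.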
Regarding 6., with slash IRIs, the ontology can be modularised while preserving a single namespace. There can be modules http://myonto.org/module1 and http://myonto.org/module2, each providing a distinct ontology, that use the same namespace http://myonto.org/ for all terms. Assuming a single slash-based namespace ont:, each term's file points to its module:

  #In ont:Term1 file:
  ont:Term1  rdfs:isDefinedBy  ont:module1 .

  #In ont:Term2 file:
  ont:Term2  rdfs:isDefinedBy  ont:module2 .

It is also possible to redirect ont:Term1 to module1, and ont:Term2 to module2, if the ontology owner prefers to serve the whole module instead.

Then there can be a global ontology document:

  ont:  a  owl:Ontology;
    owl:imports  ont:module1, ont:module2 .

This last option was an idea by my colleague Maxime Lefrançois, who implemented it in the Smart Energy Aware Systems ontology: https://w3id.org/seas/

Given the many advantages I see, with tiny drawbacks, I can't understand how not recommending slash-based namespaces for vocabs can be a tenable position.

Best,
--AZ

On 06/10/2022 at 16:10, Pat McBennett wrote:
> So (I think!) I know all the pros and cons of using either a trailing
> slash or a trailing hash for vocab namespace IRIs. Basically it boils down
> to hashes meaning you'll always get info on all the terms in a vocabulary,
> even if you only want info for one specific term, whereas using a slash
> means I can always get just the info for any specific, individual term I
> request.
>
> Note: using slashes provides the ability to get the best of both worlds -
> i.e., small responses when explicitly asking for info on just one term, but
> if you want info for all the terms in one HTTP response, then just serve up
> that complete vocab response when the base namespace IRI itself is
> dereferenced.
>
> Here's a nice simple illustration of the basic difference:
> - Slash: QUDT's 'CurrencyUnit' term (i.e., click on '
> https://qudt.org/schema/qudt/CurrencyUnit') and you get a nice clean,
> concise, and precise set of info on just the one term you asked for -
> lovely!
>
> - Hash: DPV's 'JointDataControllers' (i.e., click on '
> https://w3id.org/dpv#JointDataControllers') and you get bombarded with a
> huge document, with a daunting Table of Contents on the left, and info on
> hundreds of other terms that I didn't ask for, and so had no interest in
> whatsoever (don't get me wrong - this is fantastically detailed and
> potentially very useful information, but it's simply not what I asked for!).
>
> So based on the greater flexibility and future-proofing of using slash
> (i.e., it offers the best of both worlds, whereas hash is forever limited),
> I've become firmly of the opinion that slashes are just 'better' than
> hashes, and in fact are simply 'more correct' (i.e., all IRIs should be
> uniquely dereferenceable).
>
> I also think the distinction is critically important when creating
> vocabularies intended for widespread and long-lasting use (such as the DPV
> vocab above). For throw-away or pet projects, sure, it doesn't really
> matter (yet even then, I still think slashes are the 'more correct' option).
>
> I know that the convention from the W3C has tended to be to use hashes, but
> I think in hindsight that was a mistake, and that the advice from the
> Semantic Web community as a whole should now be to adopt slashes
> consistently for all new vocabularies. (And it's not like using slash has
> no precedent - major 'authoritative' vocabs like QUDT, Schema.org, gist,
> SOSA, SSN, (even the venerable FOAF!) all use slash).
>
> I'd love to hear this group's thoughts.
> (For reference, I did ask the gist
> community if they recorded their discussions around their decision (in
> 2019) to formally switch gist from hash to slash (here
> <https://github.com/semanticarts/gist/issues/725>), but it seems they
> weren't recorded, and I've also raised the issue with the DPV group
> directly too (here <https://github.com/w3c/dpv/issues/53>)).
>
> Cheers,
>
> Pat.
>
> *Pat McBennett*, Technical Architect
>
> Contact | patm@inrupt.com
>
> Connect | WebID <http://pmcb55.inrupt.net/profile/card#me>, GitHub
> <https://github.com/pmcb55>
>
> Explore | www.inrupt.com
>

--
Antoine Zimmermann
École des Mines de Saint-Étienne
158 cours Fauriel
CS 62362
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 49 97 02
http://www.emse.fr/~zimmermann/
Received on Tuesday, 15 November 2022 15:17:10 UTC