- From: David Booth <david@dbooth.org>
- Date: Tue, 15 Nov 2022 21:31:15 -0500
- To: semantic-web@w3.org
On 11/15/22 10:53, Antoine Zimmermann wrote:
> I could very well edit the [wiki] page that you link to and say that
> hash IRIs are not OK, then W3C wiki would suggest that *not* both are OK.

Please do, but inclusively and respectfully, and with the explanations
that you gave. That's what a wiki is for!

Thanks,
David Booth

>
> --AZ
>
> On 15/11/2022 at 16:32, Martynas Jusevičius wrote:
>> W3C wiki suggests that both approaches are OK:
>> https://www.w3.org/wiki/HashVsSlash
>>
>> On Tue, Nov 15, 2022 at 4:22 PM Antoine Zimmermann
>> <antoine.zimmermann@emse.fr> wrote:
>>>
>>> Dear semwebbers,
>>>
>>>
>>> Sorry to follow up on this conversation a little late but I'd like to
>>> add a few things, hopefully worth more than 2 cents.
>>>
>>> Overall, +1 to recommend, as a best practice, using a slash-based
>>> namespace for vocabs.
>>>
>>> A few comments regarding:
>>> 1. number of HTTP lookups
>>> 2. simplicity
>>> 3. "ontology terms don't mean much outside the context of the whole
>>> ontology"
>>> 4. Using hashes for other things
>>> 5. httpRange14
>>> 6. modularisation
>>>
>>> Regarding 1., real life experiments would have to be made because there
>>> are good reasons to think that, from a network perspective as a whole,
>>> slash IRIs are not an issue at all. In most cases, applications know
>>> what term to look for in data, and already know the ontologies that
>>> correspond to those terms. Only very rarely would an application crawl
>>> the Web from data documents to term documents to ontology documents.
>>> This would be very inefficient in most cases, and telling the world to
>>> use slashes rather than hashes (or single-term documentation rather than
>>> full ontology documentation) would only marginally affect this
>>> inefficiency, if at all (IMHO). With hash-based namespaces, there is
>>> also a potential for inefficient use of the network, e.g. when caching
>>> is not possible for some reason.
>>>
>>> Regarding 2., yes, hash-based namespaces are simpler to set up and
>>> publish. But they are difficult to work with in the long term. In
>>> professional projects, there are tons of things that are cumbersome if
>>> applied to simple personal tasks, such as setting up a version control
>>> system for every piece of code or document one is authoring; or applying
>>> full-fledged collaborative development methodology for your hobby short
>>> novel writing. The burden of setting up a server with proper URI
>>> redirection is minuscule if you think of ontology development as a type
>>> of professional software project. It seems to me that justifying
>>> hash-based namespaces based on their simplicity is aiming at the lowest
>>> possible quality requirement.
>>>
>>> Regarding 3., why would this be a problem for ontologies and not for
>>> other kinds of linked data or knowledge graphs? In object-oriented
>>> software development, a single class does not mean anything in
>>> isolation, yet most often each class is defined in a separate file. If
>>> you access these files on the Web (say, via GitHub), you don't have the
>>> context outside the class, and your class cannot even function without
>>> the other classes it relates to. Why wouldn't it be a problem too if it
>>> is a problem for ontologies? But in reality, it is not a problem because
>>> you can always download the whole package, the same as you can download
>>> the whole ontology from its ontology IRI. It should be rather easy to be
>>> directed to the whole ontology file when necessary, and yet allow one to
>>> simply get a documentation of a single term.
>>>
>>> Regarding 4., with slash-based namespaces, hashes can be used for other
>>> useful things. E.g., there could be a fragment of a term specification
>>> that provides usage examples like http://onto.org/Term#example (this is
>>> done in schema.org). There could be a section about history (e.g. when
>>> the term was added to the vocab, version info about the term itself
>>> http://onto.org/Term#history) or metadata (who created the term
>>> http://onto.org/Term#meta, related GitHub issues and discussions, etc.).
>>>
>>> Regarding 5., if a term like http://myonto.org/Person denotes a class of
>>> people, then it certainly isn't an information resource. However, if GET
>>> http://myonto.org/Person responds with a 200 OK, then, by httpRange14
>>> resolution, the IRI must denote an information resource. A solution is
>>> to redirect to another IRI, say http://myonto.org/doc/Person, but this
>>> means yet another HTTP lookup. Instead, one could use
>>> http://myonto.org/Person# as an identifier for the term, and
>>> http://myonto.org/Person as an identifier for the RDF document that
>>> defines the term. Then it's using the best of both worlds: a slash-based
>>> namespace with a hash IRI.
>>>
>>> Regarding 6., with slash IRIs, the ontology can be modularised while
>>> preserving a single namespace. There can be modules
>>> http://myonto.org/module1 and http://myonto.org/module2 that each
>>> provide a distinct ontology using the same namespace http://myonto.org/
>>> for all terms, and, assuming a single slash-based namespace ont:
>>>
>>> #In ont:Term1 file:
>>> ont:Term1 rdfs:isDefinedBy ont:module1 .
>>>
>>> #In ont:Term2 file:
>>> ont:Term2 rdfs:isDefinedBy ont:module2 .
>>>
>>> It is also possible to redirect ont:Term1 to module1, and ont:Term2 to
>>> module2, if the ontology owner prefers to serve the whole module
>>> instead. Then there can be a global ontology document:
>>>
>>> ont: a owl:Ontology;
>>>   owl:imports ont:module1, ont:module2 .
>>>
>>> This last option was an idea by my colleague Maxime Lefrançois who
>>> implemented it in the Smart Energy Aware Systems ontology:
>>> https://w3id.org/seas/
>>>
>>>
>>> Given the many advantages I see, with tiny drawbacks, I can't understand
>>> how not recommending slash-based namespaces for vocabs could be a
>>> tenable position.
>>>
>>>
>>> Best,
>>> --AZ
>>>
>>> On 06/10/2022 at 16:10, Pat McBennett wrote:
>>>> So (I think!) I know all the pros and cons of using either a trailing
>>>> slash or a trailing hash for vocab namespace IRIs. Basically it boils
>>>> down to hashes meaning you'll always get info on all the terms in a
>>>> vocabulary, even if you only want info for one specific term, whereas
>>>> using a slash means I can always get just the info for any specific,
>>>> individual term I request.
>>>>
>>>> Note: using slashes provides the ability to get the best of both
>>>> worlds - i.e., small responses when explicitly asking for info on just
>>>> one term, but if you want info for all the terms in one HTTP response,
>>>> then just serve up that complete vocab response when the base
>>>> namespace IRI itself is dereferenced.
>>>>
>>>> Here's a nice simple illustration of the basic difference:
>>>> - Slash: QUDT's 'CurrencyUnit' term (i.e., click on
>>>> 'https://qudt.org/schema/qudt/CurrencyUnit') and you get a nice clean,
>>>> concise, and precise set of info on just the one term you asked for -
>>>> lovely!
>>>>
>>>> - Hash: DPV's 'JointDataControllers' (i.e., click on
>>>> 'https://w3id.org/dpv#JointDataControllers') and you get bombarded
>>>> with a huge document, with a daunting Table of Contents on the left,
>>>> and info on hundreds of other terms that I didn't ask for, and so had
>>>> no interest in whatsoever (don't get me wrong - this is fantastically
>>>> detailed and potentially very useful information, but it's simply not
>>>> what I asked for!).
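
The difference Pat illustrates follows from how clients treat fragments: the part after "#" is never sent to the server, so every term in a hash namespace dereferences to the same document. A minimal sketch using only the Python standard library; the helper name `request_target` and the second hash IRI are made up for illustration, and nothing is fetched over the network:

```python
from typing import Tuple
from urllib.parse import urldefrag


def request_target(iri: str) -> Tuple[str, str]:
    """Split an IRI into (URL sent to the server, fragment kept by the client)."""
    url, fragment = urldefrag(iri)
    return url, fragment


# The two IRIs quoted in the message above.
slash_term = "https://qudt.org/schema/qudt/CurrencyUnit"
hash_term = "https://w3id.org/dpv#JointDataControllers"
# Hypothetical second term in the same hash namespace, for comparison only.
other_hash_term = "https://w3id.org/dpv#SomeOtherTerm"

for iri in (slash_term, hash_term, other_hash_term):
    url, fragment = request_target(iri)
    print(f"{iri}\n  requested: {url}\n  fragment:  {fragment!r}")
```

Both hash IRIs yield the same requested URL, which is why the server cannot return a per-term response for them; the slash IRI is requested as-is.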
>>>>
>>>> So based on the greater flexibility and future-proofing of using slash
>>>> (i.e., it offers the best of both worlds, whereas hash is forever
>>>> limited), I've become firmly of the opinion that slashes are just
>>>> 'better' than hashes, and in fact are simply 'more correct' (i.e., all
>>>> IRIs should be uniquely dereferenceable).
>>>>
>>>> I also think the distinction is critically important when creating
>>>> vocabularies intended for widespread and long-lasting use (such as the
>>>> DPV vocab above). For throw-away or pet projects, sure, it doesn't
>>>> really matter (yet even then, I still think slashes are the 'more
>>>> correct' option).
>>>>
>>>> I know that the convention from the W3C has tended to be to use
>>>> hashes, but I think in hindsight that was a mistake, and that the
>>>> advice from the Semantic Web community as a whole should now be to
>>>> adopt slashes consistently for all new vocabularies. (And it's not
>>>> like using slash has no precedent - major 'authoritative' vocabs like
>>>> QUDT, Schema.org, gist, SOSA, SSN, (even the venerable FOAF!) all use
>>>> slash).
>>>>
>>>> I'd love to hear this group's thoughts. (For reference, I did ask the
>>>> gist community if they recorded their discussions around their
>>>> decision (in 2019) to formally switch gist from hash to slash (here
>>>> <https://github.com/semanticarts/gist/issues/725>), but it seems they
>>>> weren't recorded, and I've also raised the issue with the DPV group
>>>> directly (here <https://github.com/w3c/dpv/issues/53>)).
>>>>
>>>> Cheers,
>>>>
>>>> Pat.
>>>>
>>>> *Pat McBennett*, Technical Architect
>>>>
>>>> Contact | patm@inrupt.com
>>>>
>>>> Connect | WebID <http://pmcb55.inrupt.net/profile/card#me>, GitHub
>>>> <https://github.com/pmcb55>
>>>>
>>>> Explore | www.inrupt.com
>>>>
>>>
>>> --
>>> Antoine Zimmermann
>>> École des Mines de Saint-Étienne
>>> 158 cours Fauriel
>>> CS 62362
>>> 42023 Saint-Étienne Cedex 2
>>> France
>>> Tél:+33(0)4 77 49 97 02
>>> http://www.emse.fr/~zimmermann/
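
Point 5 in the quoted message (httpRange-14 and the "Person#" pattern) can be sketched roughly as follows. This is an illustration under stated assumptions, not an implementation of any library: the function names are invented here, the status-code mapping covers only the cases the httpRange-14 resolution speaks to (2xx, 303, 4xx), and treating every other code as "unknown" is a simplification.

```python
from urllib.parse import urldefrag


def httprange14_inference(status: int) -> str:
    """What a client may infer about the resource an IRI denotes, given the GET status."""
    if 200 <= status < 300:
        # A 2xx response means the IRI denotes an information resource.
        return "information resource"
    if status == 303:
        # A 303 See Other means the IRI could denote any resource.
        return "any resource"
    # 4xx: nature unknown per the resolution; other codes are lumped in
    # here as an assumption of this sketch.
    return "unknown"


def document_for(term_iri: str) -> str:
    """Document IRI obtained by dropping the fragment from a term IRI."""
    return urldefrag(term_iri).url


# The pattern from the message: the term is identified by "Person#",
# the RDF document that defines it by "Person".
term = "http://myonto.org/Person#"
doc = document_for(term)

print(doc)
print(httprange14_inference(200))
print(httprange14_inference(303))
```

With this pattern a 200 OK on the document IRI says nothing about the term itself, because the term IRI (with the trailing "#") is a different identifier that is never requested directly, so no extra redirect is needed.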
Received on Wednesday, 16 November 2022 02:31:33 UTC