- From: Dave Reynolds <dave.e.reynolds@gmail.com>
- Date: Thu, 20 Oct 2011 10:34:02 +0100
- To: Leigh Dodds <leigh.dodds@talis.com>
- Cc: "public-lod@w3.org" <public-lod@w3.org>
Hi Leigh,

On Wed, 2011-10-19 at 17:59 +0100, Leigh Dodds wrote:

> So, can we turn things on their head a little. Instead of starting out
> from a position that we *must* have two different resources, can we
> instead highlight to people the *benefits* of having different
> identifiers? That makes it more of a best practice discussion and one
> based on trade-offs: e.g. this class of software won't be able to
> process your data correctly, or you'll be limited in how you can
> publish additional data or metadata in the future.

Nice approach.

Here's an attempt ...

Benefit 1: You can provide (meta)data separately about the IR and NIR

Sometimes the IR contains additional information (e.g. crafted BBC web pages) or was produced by a non-trivial transformation from the NIR. In those cases metadata such as license, copyright and provenance information differ between the IR and NIR. Hence you need two identifiers.

Counter argument: this is problematic anyway. If your IR can conneg to both an HTML and an RDF representation then by webarch they should be equivalent. So a handcrafted web page with different license terms is not a presentation of the NIR it is just some interesting semi-related web page :)

Benefit 2: Conceptual cleanliness and hedging your bets

In the field of human debate, as opposed to what machines do, we are now clear that "the map is not the territory" but we weren't always so clear and that led to confusion and erroneous arguments[1]. That learning may be transferable. Even if we can't spot the practical problems right now then differentiating between the galaxy itself and some piece of data about the galaxy could turn out to be important in practice.

If you have two resources and later on it turns out you only needed one, no big deal just declare their equivalence. If you have one resource where later on it turns out you needed two then you are stuffed.

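To make Benefits 1 and 2 a bit more concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the example.org URIs, the "license" and "describes" predicate names, and the idea of modelling a graph as a set of plain tuples rather than using a real RDF library. It shows two identifiers carrying different license metadata, and then what happens when you later decide to treat them as one resource.

```python
# Hypothetical identifiers and predicates, for illustration only.
DOC = "http://example.org/doc/milkyway"    # the IR: a document about the galaxy
THING = "http://example.org/id/milkyway"   # the NIR: the thing being described

PAGE_TERMS = "http://example.org/licenses/page-terms"
DATA_TERMS = "http://example.org/licenses/data-terms"

# A graph modelled as a set of (subject, predicate, object) tuples.
triples = {
    (DOC, "license", PAGE_TERMS),
    (DOC, "describes", THING),
    (THING, "license", DATA_TERMS),
}


def describe(subject, graph):
    """Return {predicate: set of objects} for one subject."""
    out = {}
    for s, p, o in graph:
        if s == subject:
            out.setdefault(p, set()).add(o)
    return out


def merge(keep, drop, graph):
    """Treat `drop` as equivalent to `keep` by rewriting every occurrence."""
    def swap(term):
        return keep if term == drop else term
    return {(swap(s), p, swap(o)) for s, p, o in graph}


# With two identifiers, each resource keeps its own license statement.
separate_doc = describe(DOC, triples)
separate_thing = describe(THING, triples)

# Going from two identifiers to one is a mechanical rewrite ...
merged = merge(DOC, THING, triples)
merged_doc = describe(DOC, merged)
# ... but the merged graph no longer records which license applied to what,
# so there is no function that recovers `triples` from `merged` alone.
```

After the merge, the single remaining resource has both license values attached and nothing in the data says which one belonged to the page and which to the thing. That is the asymmetry described above: collapsing two identifiers is easy, splitting one is not.
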
Cost 1: You have to decide if your resource is an IR or NIR and we can't always

If you are going to have a distinction like IR/NIR you'd better be able to explain it and work out which is which. We can't.

It's OK for real world objects which "clearly" can't go down the wire[2]. But anything conceptual can be argued both ways - skos:Concepts, skos:ConceptSchemes, qb:DataSets, rdf:Properties, eg:theColourRed.

Person A: you can get your ontology / skos description / glossary entry down the wire, that's all there is, so they are IRs.
Person B: abstract concepts can't go down the wire so they are NIRs.
Deadlock.

Cost 2: Network cost - an uncachable round trip every time I look up a data resource

Counter argument: just use #

Cost 3: Developer confusion/disbelief, inhibiting use

The clear cut cases like galaxies ([2] notwithstanding) are so silly that no one thinks this confusion could ever arise. For the less clear cases like skos:Concepts the discussion seems like dancing on the heads of pins. Followed by "if this distinction is so important why is there no way to tell that I have an NIR" - the http-range-14 solution only says that it could be an NIR.

The need to understand, implement and argue about this distinction without the benefits actually being apparent *right now* *to me* is a serious barrier to uptake.

Personally I find the costs more persuasive than the benefits but I've tried to present the arguments neutrally.

Dave

[1] IANAP and can't even spell Korzybski without Google's help :)

[2] You can take this line further. Arguably eg:theMilkyWay is never going to represent the galaxy itself, it is only ever a conceptualization of it and that conceptualization *can* be encoded in some language and sent down the wire. We are *never* really talking about territories, we are always talking about maps and post-it notes stuck on maps.

Received on Thursday, 20 October 2011 09:34:33 UTC