Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) ) from Dave Reynolds on 2011-10-20 (public-lod@w3.org from October 2011)

From: Dave Reynolds <dave.e.reynolds@gmail.com>
Date: Thu, 20 Oct 2011 10:34:02 +0100
To: Leigh Dodds <leigh.dodds@talis.com>
Cc: "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <1319103242.2488.52.camel@Obsidian3>

Hi Leigh,

On Wed, 2011-10-19 at 17:59 +0100, Leigh Dodds wrote:

> So, can we turn things on their head a little. Instead of starting out
> from a position that we *must* have two different resources, can we
> instead highlight to people the *benefits* of having different
> identifiers? That makes it more of a best practice discussion and one
> based on trade-offs: e.g. this class of software won't be able to
> process your data correctly, or you'll be limited in how you can
> publish additional data or metadata in the future.

Nice approach. Here's an attempt ...

Benefit 1: You can provide (meta)data separately about the IR and NIR

Sometimes the IR contains additional information (e.g. crafted BBC web
pages) or was produced by a non-trivial transformation from the NIR. In
those cases metadata such as license, copyright and provenance
information differ between the IR and NIR. Hence you need two
identifiers.

Counter argument: this is problematic anyway. If your IR can conneg to
both an HTML and an RDF representation then by webarch they should be
equivalent. So a handcrafted web page with different license terms is
not a presentation of the NIR, it is just some interesting semi-related
web page :)

Benefit 2: Conceptual cleanliness and hedging your bets

In the field of human debate, as opposed to what machines do, we are now
clear that "the map is not the territory" but we weren't always so clear
and that led to confusion and erroneous arguments[1]. That learning may
be transferable. Even if we can't spot the practical problems right now,
differentiating between the galaxy itself and some piece of data
about the galaxy could turn out to be important in practice.

If you have two resources and later on it turns out you only needed one,
no big deal, just declare their equivalence. If you have one resource
where later on it turns out you needed two then you are stuffed.
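
As a rough illustration of that asymmetry, here is a minimal Python
sketch. It assumes triples are plain tuples, and the identifiers
eg:galaxyDoc and eg:galaxy are made up for the example. The rename
stands in for "declaring equivalence" (for instance via an equivalence
statement that a consumer then applies); it is not a claim about how
any particular tool does it.

```python
# Hypothetical identifiers, for illustration only.
DOC = "eg:galaxyDoc"   # the data about the galaxy (IR)
THING = "eg:galaxy"    # the galaxy itself (NIR)

triples = {
    (DOC, "dct:license", "eg:cc-by"),
    (DOC, "foaf:primaryTopic", THING),
    (THING, "eg:diameterLightYears", "100000"),
}


def merge(triples, keep, drop):
    """Collapse `drop` into `keep`: a purely mechanical rename."""
    def rename(term):
        return keep if term == drop else term
    return {(rename(s), p, rename(o)) for s, p, o in triples}


# Two identifiers -> one: no judgement needed, every statement survives.
merged = merge(triples, keep=THING, drop=DOC)

# One identifier -> two has no such function: for each statement in
# `merged` someone must decide whether it was about the document or
# the galaxy, and the data no longer records that.
```

Going the other way means revisiting every statement by hand, which is
the "stuffed" case.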

Cost 1: You have to decide if your resource is an IR or NIR and we can't
always tell

If you are going to have a distinction like IR/NIR you'd better be able
to explain it and work out which is which. We can't. It's OK for real
world objects which "clearly" can't go down the wire[2]. But anything
conceptual can be argued both ways - skos:Concepts, skos:ConceptSchemes,
qb:DataSets, rdf:Properties, eg:theColourRed.
Person A: you can get your ontology / skos description / glossary entry
down the wire, that's all there is, so they are IRs.
Person B: abstract concepts can't go down the wire so they are NIRs.
Deadlock.

Cost 2: Network cost - an uncacheable round trip every time I look up a
data resource

Counter argument: just use #
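
To make the difference concrete, here is a small Python sketch that
counts requests against a toy lookup table rather than a real server.
All the example.org URLs and the table itself are assumptions made up
for the illustration. It relies only on the fact that a client strips
the fragment before sending the request.

```python
from urllib.parse import urldefrag

# Toy server: URL -> (status, Location header or None).
# Every URL here is hypothetical.
SERVER = {
    "http://example.org/id/galaxy": (303, "http://example.org/doc/galaxy"),
    "http://example.org/doc/galaxy": (200, None),
    "http://example.org/data/galaxy": (200, None),
}


def requests_needed(uri):
    """Count the simulated HTTP requests needed to reach a 200."""
    # The fragment stays on the client; only the base URL is requested.
    url, _fragment = urldefrag(uri)
    count = 0
    while True:
        count += 1
        status, location = SERVER[url]
        if status == 200:
            return count
        url = location  # follow the 303 See Other redirect


slash_cost = requests_needed("http://example.org/id/galaxy")
hash_cost = requests_needed("http://example.org/data/galaxy#it")
```

In this toy setup the 303-style URI costs one more request than the
hash URI for every lookup.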

Cost 3: Developer confusion/disbelief, inhibiting use

The clear cut cases like galaxies ([2] notwithstanding) are so silly
that no one thinks this confusion could ever arise. For the less clear
cases like skos:Concepts the discussion seems like dancing on the heads
of pins. Followed by "if this distinction is so important why is there
no way to tell that I have an NIR" - the http-range-14 solution only
says that it could be an NIR.
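
A minimal Python sketch of what a client can actually conclude from
the response status under the http-range-14 resolution, as I read it.
The function name and the returned strings are mine, chosen for the
example.

```python
def what_the_status_tells_you(status):
    """Map a GET response status to what http-range-14 lets you infer."""
    if 200 <= status < 300:
        return "information resource"
    if status == 303:
        # Not "is an NIR": the resource may still be an IR.
        return "could be any resource"
    if 400 <= status < 500:
        return "nature unknown"
    return "not covered by the resolution"


answers = {code: what_the_status_tells_you(code) for code in (200, 303, 404)}
```

No branch ever returns a definite "this is an NIR", which is the
developer complaint above.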

The need to understand, implement and argue about this distinction
without the benefits actually being apparent *right now* *to me* is a
serious barrier to uptake.

Personally I find the costs more persuasive than the benefits but I've
tried to present the arguments neutrally.

Dave

[1] IANAP and can't even spell Korzybski without Google's help :)

[2] You can take this line further. Arguably eg:theMilkyWay is never
going to represent the galaxy itself, it is only ever a
conceptualization of it and that conceptualization *can* be encoded in
some language and sent down the wire. We are *never* really talking
about territories, we are always talking about maps and Post-it notes
stuck on maps.

Received on Thursday, 20 October 2011 09:34:33 UTC