Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) ) from Dave Reynolds on 2011-10-20 (public-lod@w3.org from October 2011)

From: Dave Reynolds <dave.e.reynolds@gmail.com>
Date: Thu, 20 Oct 2011 22:31:03 +0100
To: Norman Gray <norman@astro.gla.ac.uk>
Cc: Leigh Dodds <leigh.dodds@talis.com>, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <1319146263.2488.165.camel@Obsidian3>

Hi Norman,

On Thu, 2011-10-20 at 12:13 +0100, Norman Gray wrote:

> On 2011 Oct 20, at 10:34, Dave Reynolds wrote:

> > Benefit 1: You can provide (meta)data separately about the IR and NIR
> > [...]
> > Counter argument: this is problematic anyway. If your IR can conneg to
> > both an HTML and an RDF representation then by webarch they should be
> > equivalent.
> 
> Where is this written (I can't find support for this in a quick search through <http://www.w3.org/TR/webarch/>)?

Good point. I've been in a number of discussions where equivalence (up
to some fuzzy notion of quality) has been assumed to be required.
However, you are right, I can't see any documentary evidence backing up
that assumption. Happy to relegate it to a red herring.

> > Benefit 2: Conceptual cleanliness and hedging your bets
> > 
> > [...]Even if we can't spot the practical problems right now
> > then differentiating between the galaxy itself and some piece of data
> > about the galaxy could turn out to be important in practice.
> 
> It is.  I want to say that 'line 123 in this catalogue [an existing RDBMS] and line 456 in that one both refer to the same galaxy, but they give different values for its surface brightness'.  There's no way I can articulate that unless I'm explicitly clear about the difference between a billion suns and a database row.

Sure, differentiating *those* two is crucial but http-range-14 doesn't
itself solve that [*] any more than inserting a # character would.

Perhaps benefit 2 could be reframed as being about forcing you to
confront the map/territory distinction so you end up doing better
modelling - whether or not you implement 303s.

> > Cost 1: You have to decide if your resource is an IR or NIR and we can't
> > always
> > 
> > If you are going to have a distinction like IR/NIR you'd better be able
> > to explain it and work out which is which. We can't. It's OK for real
> > world objects which "clearly" can't go down the wire[2]. But anything
> > conceptual can be argued both ways - skos:Concepts, skos:ConceptSchemes,
> > qb:DataSets, rdf:Properties, eg:theColourRed. 
> >
> > Person A: you can get your ontology / skos description / glossary entry
> > down the wire, that's all there is, so they are IRs. 
> 
> OK, I can see this point.
> 
> I think your Person A is being either difficult or dense, 

Or accurately understanding that an ontology is a conceptualization, a
model, it is not reality :)

> but supposing it genuinely is that hard to draw a distinction in some case, then it is probably correspondingly unlikely that there are importantly different things to say about the putative IR and NIR, so the distinction may not in fact matter.

Sure. The issue is around spending lots of time and energy in
discussions around that in cases where it doesn't change any outcomes.
Especially in domains where this is the sort of information that
dominates.

> > Cost 3: Developer confusion/disbelief, inhibiting use
> > 
> > The clear cut cases like galaxies ([2] notwithstanding) are so silly
> > that no one thinks this confusion could ever arise. For the less clear
> > cases like skos:Concepts the discussion seems like dancing on the heads
> > of pins. Followed by "if this distinction is so important why is there
> > no way to tell that I have an NIR" - the http-range-14 solution only
> > says that it could be an NIR. 
> > 
> > The need to understand, implement and argue about this distinction
> > without the benefits actually being apparent *right now* *to me* is a
> > serious barrier to uptake.
> 
> I think the above argument works here, too.  If a provider can't see the distinction, they're probably not going to say anything usefully distinct about the two resources.
> 
> Perhaps that should be the resolution: "Dear Developer, there's a right way to do this, and a less right way: the right way probably gives benefit to you and is better for your data's consumers, but if you do it the other way, the world won't end.  Love and kisses, public-lod."

The last part of that is right on - either can be made to work, don't
kill yourself over it.

I think the discussion Leigh was trying to start was "can we more
clearly articulate those benefits of the 'right way'". I was taking a
shot at that, maybe a very limited off-target one.

> Is the argument actually about data _consumers_ getting confused about the distinction?  Really?  I'd have thought that, once you've grokked RDF, you're in a good place to understand the distinction fairly naturally, and in any case you are by that stage looking at a screenful of RDF which is describing a URI whose internal structure and 30xs you no longer have to care about.  I bet I'm missing a use-case.

No, the issue is more for data publishers IMHO.

What's more I really don't think the issue is about not understanding
the distinction (at least in the clear cut cases). Most people I
talk to grok the distinction, the hard bit is understanding why 303
redirects are a sensible way of making it and caring about it enough to
put those in place.

Cheers,
Dave

[*] Maybe that's a bit too succinct, let me expand ...

There are lots of ways of handling that sort of situation. Three obvious
ones being:

(1) Describe the observations explicitly using something like ISO O&M or
the DataCube vocabulary:

   <http://catalogue1.com/observation123> a qb:Observation;
       eg:galaxy      <http://iau.org/id/galaxy/m31>;
       eg:brightness  6.5 ;
       eg:obsdate     '2011-10-10'^^xsd:date ;
       qb:dataSet     <http://catalogue1.com/catalogue/2011> .

   <http://catalogue2.com/observation456> a qb:Observation;
       eg:galaxy      <http://iau.org/id/galaxy/m31>;
       eg:brightness  6.8 ;
       eg:obsdate     '2011-09-01'^^xsd:date ;
       qb:dataSet     <http://catalogue2.com/catalogue/2011> .

(2) Each catalogue gives its own URI to its "understanding" of the
galaxy so it can assert things directly about it without conflict:

   <http://catalogue1.com/galaxy/m31>  eg:brightness 6.5;
      eg:correspondsTo    <http://iau.org/id/galaxy/m31> .

   <http://catalogue2.com/galaxy/m31>  eg:brightness 6.8;
      eg:correspondsTo    <http://iau.org/id/galaxy/m31> .

(3) Each catalogue publishes the data directly about the galaxy (ugh)
but since it does that at a different place we can use named graphs to
track the provenance:

   <http://catalogue1.com/catalogue/2011> {
      <http://iau.org/id/galaxy/m31> eg:brightness 6.5 .
      ... lots more data ...
   }

   <http://catalogue2.com/catalogue/2011> {
      <http://iau.org/id/galaxy/m31> eg:brightness 6.8 .
      ... lots more data ...
   }
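
To make the comparison concrete, here is a rough Python sketch that holds
the three modellings above as plain tuples and pulls out "who says what
about M31's brightness" from each. It is only an illustration under
assumptions: the helper names (via_link, via_graph) and the tuple layout
are hypothetical, the eg: properties are the made-up ones from the
snippets above, and no RDF library is involved.

```python
M31 = "http://iau.org/id/galaxy/m31"

# (1) Explicit observations: triples about observation resources.
observations = [
    ("http://catalogue1.com/observation123", "eg:galaxy", M31),
    ("http://catalogue1.com/observation123", "eg:brightness", 6.5),
    ("http://catalogue2.com/observation456", "eg:galaxy", M31),
    ("http://catalogue2.com/observation456", "eg:brightness", 6.8),
]

# (2) Per-catalogue galaxy URIs, each linked to the shared IAU URI.
local_uris = [
    ("http://catalogue1.com/galaxy/m31", "eg:brightness", 6.5),
    ("http://catalogue1.com/galaxy/m31", "eg:correspondsTo", M31),
    ("http://catalogue2.com/galaxy/m31", "eg:brightness", 6.8),
    ("http://catalogue2.com/galaxy/m31", "eg:correspondsTo", M31),
]

# (3) Named graphs: quads of (subject, predicate, object, graph).
named_graphs = [
    (M31, "eg:brightness", 6.5, "http://catalogue1.com/catalogue/2011"),
    (M31, "eg:brightness", 6.8, "http://catalogue2.com/catalogue/2011"),
]


def via_link(triples, link, galaxy):
    """Brightness keyed by each subject that points at `galaxy` via `link`."""
    subjects = {s for s, p, o in triples if p == link and o == galaxy}
    return {s: o for s, p, o in triples
            if p == "eg:brightness" and s in subjects}


def via_graph(quads, galaxy):
    """Brightness keyed by the named graph that asserts it about `galaxy`."""
    return {g: o for s, p, o, g in quads
            if s == galaxy and p == "eg:brightness"}


by_observation = via_link(observations, "eg:galaxy", M31)
by_local_uri = via_link(local_uris, "eg:correspondsTo", M31)
by_graph = via_graph(named_graphs, M31)

for name, result in [("observations", by_observation),
                     ("local URIs", by_local_uri),
                     ("named graphs", by_graph)]:
    print(name, sorted(result.values()))
```

In each case the disagreement between the catalogues is recoverable and
attributable to a source, and nothing in the sketch ever dereferences the
galaxy URI.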


In *none* of those cases does it make any difference whether when I
dereference <http://iau.org/id/galaxy/m31> in a browser I get a web page
saying "I denote the galaxy M31" or I get a 303 redirect to something
like <http://iau.org/doc/galaxy/m31> which in turn connegs to a web page
saying "The URI you started with denoted the galaxy M31, me I'm just a
web page, you can tell me by the way I walk".

Of course if you dereference <http://iau.org/id/galaxy/m31> asking for
RDF then you might hope to get back some information on M31 including
its brightness. In that case what should the IAU return? Maybe it will
have a committee decide what the current agreed brightness is. Maybe it
will provide rdfs:seeAlso links to the catalogues. Maybe it will include
data from both catalogues in-line using reification or O&M or anything.
The point is in none of those cases does it make any difference if I do a
303 or not, or indeed if I switch to using
<http://iau.org/galaxy/m31#thing> instead.
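
For reference, a small Python sketch of what a client can infer about a
URI from the way it dereferences, following the usual reading of the
http-range-14 resolution (2xx means information resource, 303 means it
could be any resource, anything else tells you nothing). The function
name and the result strings are hypothetical and only for illustration;
the status code is passed in rather than fetched, so nothing here touches
the network.

```python
from urllib.parse import urldefrag


def what_the_client_learns(uri, status):
    """Return (URI actually requested, what can be concluded about `uri`)."""
    requested, fragment = urldefrag(uri)
    if fragment:
        # The fragment is never sent to the server, so the status code
        # describes the base document, not the thing the hash URI names.
        return requested, "could be anything"
    if 200 <= status < 300:
        return requested, "information resource"
    if status == 303:
        return requested, "could be anything"
    return requested, "unknown"


print(what_the_client_learns("http://iau.org/id/galaxy/m31", 303))
print(what_the_client_learns("http://iau.org/galaxy/m31#thing", 200))
print(what_the_client_learns("http://iau.org/id/galaxy/m31", 200))
```

Note that neither the 303 case nor the hash case confirms that the URI
names a galaxy; both only leave the possibility open. What the galaxy's
brightness is still has to come from the RDF itself.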

Now if <http://iau.org/doc/galaxy/m31> in fact returns a poem about M31
and we want to assert copyright about the poem then we had better not
confuse it with <http://iau.org/id/galaxy/m31>. But that's benefit 1 and
personally I would advise the IAU to have a separate set of pages for
poems about galaxies and link to those from the galaxy pages.
Received on Thursday, 20 October 2011 21:31:36 UTC