Re: NIR SIDETRACK Re: Change Proposal for HttpRange-14 from Norman Gray on 2012-03-28 (public-lod@w3.org from March 2012)

From: Norman Gray <norman@astro.gla.ac.uk>
Date: Wed, 28 Mar 2012 18:59:05 +0100
To: Jonathan A Rees <rees@mumble.net>
Cc: Michael Brunnbauer <brunni@netestate.de>, Tim Berners-Lee <timbl@w3.org>, public-lod community <public-lod@w3.org>
Message-Id: <4F251EA9-5349-41F8-B6CA-53420845EE31@astro.gla.ac.uk>

Greetings.

[This is a late response, because I dithered about sending it, because this whole thing seems simple enough that I've got to be missing stuff]

On 2012 Mar 27, at 14:02, Jonathan A Rees wrote:

> On Tue, Mar 27, 2012 at 7:52 AM, Michael Brunnbauer <brunni@netestate.de> wrote:
>> 
>> Hello Tim,
>> 
>> On Mon, Mar 26, 2012 at 04:59:42PM -0400, Tim Berners-Lee wrote:
>>> 12) Still people say "well, to know whether I use 200 or 303 I need to know if this sucker is an IR or NIR" when instead they should be saying "Well, am I going to serve the content of this sucker or information about it?".
>> 
>> I think the question should be "does the response contain the content of it"
>> because I can serve both at once (<foaf:PersonalProfileDocument rdf:about="">).
> 
> Yes, this is the question - is the retrieved representation content (I
> used the word "instance" but it's not catching on), or description. It
> can be both.

Fine -- that seems the key question.  In some ideal world, everything on the web would come with RDF which explained what it was; but expecting that ever to happen would be mad.

The HR14 resolution gives one answer to this, by doing _two_ things.

Step 1. HR14 declares the existence of a subset of resources named 'IR'.  You can gloss this set as 'information resource', or 'document', note that the set is vague, or deny that the set is important, but that doesn't matter.

Step 2. HR14 gives a partial algorithm for deciding whether a URI X names a resource in IR:  If you get a 200 when you dereference X, the resource is conclusively in IR.  End of story.
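
As a rough sketch of Step 2 as described here: the function name and the True/None return convention are illustrative assumptions, not anything taken from the HR14 text. Because the algorithm is only partial, anything other than a 200 yields no conclusion rather than a "no".

```python
from typing import Optional


def in_ir_strict(status: int) -> Optional[bool]:
    """Partial test for 'X names a resource in IR' (hypothetical helper).

    True -> conclusively in IR
    None -> the rule is silent; nothing can be concluded from the status
    """
    if status == 200:
        return True  # a 200 settles it: the resource is in IR
    return None  # e.g. a 303 tells us nothing either way


print(in_ir_strict(200))
print(in_ir_strict(303))
```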

(you can all suck eggs, now, yes?)

Why does the set IR matter? (and pace Tim and various weary voices in this metathread, I think it does matter).  Because saying 'X names a resource in IR' tells you that the URI and the associated resource have a Particularly Simple Relationship -- the content of the HTTP retrieval is the 'content' of the resource (in some way which probably doesn't have to be precise, but which asserts that the resource is something, unlike a Macaw, that can come through a network).  In this way -- crucially -- it answers Tim's question (12) above: retrieving X with a 200 status obtains the content of the sucker.  So the concept of 'IR' does do some work because it gives the client information about the object.

Right?

BUT, we (obviously) also want to talk about things where there's a slightly more complicated relationship between the URI and some resource (eg a URI which names a bird).  In this case, the extra information (that the URI and the resource have a Particularly Simple Relationship) would be false.  The cost of a particularly simple Step 2 above is the (in retrospect variously costly) indirection of the 303-dance.

So the whole discussion seems to be about whether and how to relax Step 2.  Jeni Tennison's proposal says it should be relaxed in the presence of a 'describedby' link; David Booth's says it should be relaxed with a new definedby link, or a (self-)reference with rdfs:isDefinedBy.  My 'proposal' was that it could be relaxed even more minimally: the client may place the resource in IR (Step 2 above) only if doing so doesn't contradict any RDF in the content of the resource (because the RDF says that X names a person, say), however that RDF is conveyed (and of course these two proposals achieve that).
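
A minimal sketch of the relaxed rule, under stated assumptions: the client has already extracted the set of classes that the returned RDF asserts for X, and it keeps its own list of classes it regards as incompatible with IR. Both `NON_IR_TYPES` and the function name are hypothetical, chosen only for illustration.

```python
from typing import Optional, Set

# Assumption: classes this client treats as contradicting membership of IR.
NON_IR_TYPES = {"foaf:Person"}


def in_ir_relaxed(status: int, asserted_types: Set[str]) -> Optional[bool]:
    """Relaxed Step 2 (hypothetical helper).

    True -> the client may place the resource in IR
    None -> the protocol adds nothing; any RDF about X stands on its own
    """
    if status != 200:
        return None  # as before, only a 200 triggers the rule
    if asserted_types & NON_IR_TYPES:
        return None  # the content contradicts IR, so do not infer it
    return True


print(in_ir_relaxed(200, set()))
print(in_ir_relaxed(200, {"foaf:Person"}))
```

Note that the contradicting case returns "no conclusion" rather than "not in IR": in this sketch the rule simply declines to add protocol-derived information on top of what the RDF already says.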

After all this torrent of messages (and I have honestly tried to read a significant fraction of them, and associated documents), I'm still not seeing how this is problematic.  Perhaps I'm slow, or I've read the wrong fraction of messages.

  * Anything that was HR14-compliant will still be compliant with the relaxed Step 2. No change.

  * Any resource that wasn't in IR before, but whose URI nonetheless produced 200, was formally broken. It was telling lies.  With a relaxed Step 2, it now won't be broken any more.  Some applications (Tabulator?) will have to change to respect that, but they couldn't tell they were being lied to before, so they're merely exchanging one problem for a fixable one.

  * This is insensitive to the definition of 'information resource', and it doesn't matter if the content is multiple things.  If a resource 200-says that its URI names a Book, then you don't have to worry whether that's an 'information resource' or not, because you know it's a book; end of algorithm; do not go to the end of Step 2; do not add any extra information hacked/derived from protocol details.
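
The first two bullets can be checked mechanically with a toy comparison of the strict and relaxed rules. This is only a sketch: the case names, the `NON_IR_TYPES` set, and both function names are assumptions made up for the illustration.

```python
from typing import Optional, Set

NON_IR_TYPES = {"foaf:Person"}  # assumption, for illustration only


def strict(status: int, types: Set[str]) -> Optional[bool]:
    # Original Step 2: a 200 always means 'in IR'; the RDF is ignored.
    return True if status == 200 else None


def relaxed(status: int, types: Set[str]) -> Optional[bool]:
    # Relaxed Step 2: a 200 means 'in IR' unless the RDF contradicts it.
    return True if status == 200 and not (types & NON_IR_TYPES) else None


cases = {
    "plain document, 200": (200, set()),
    "self-described document, 200": (200, {"foaf:PersonalProfileDocument"}),
    "person behind a 303": (303, {"foaf:Person"}),
    "person served with 200": (200, {"foaf:Person"}),
}

# Only cases where the two rules disagree are affected by the relaxation.
changed = [name for name, args in cases.items() if strict(*args) != relaxed(*args)]
print(changed)
```

In this toy set, the only case whose outcome changes is the one that was "telling lies" under the strict rule; the HR14-compliant cases come out the same either way.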

That seems an inexpensive change which un-breaks a lot of things.

All the best (in some puzzlement),

Norman


-- 
Norman Gray  :  http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK
Received on Wednesday, 28 March 2012 17:59:39 UTC