NIR SIDETRACK Re: Change Proposal for HttpRange-14 from Tim Berners-Lee on 2012-03-26 (public-lod@w3.org from March 2012)

From: Tim Berners-Lee <timbl@w3.org>
Date: Mon, 26 Mar 2012 16:59:42 -0400
To: Norman Gray <norman@astro.gla.ac.uk>
Cc: Michael Brunnbauer <brunni@netestate.de>, public-lod community <public-lod@w3.org>
Message-Id: <91DA9A68-C960-4917-B1A5-A3C93D92E34E@w3.org>
On 2012-03-25, at 14:06, Norman Gray wrote:

> 
> Tim, greetings.
> 
> On 2012 Mar 25, at 17:35, Tim Berners-Lee wrote:
> 
>> (Not useful to talk about NIRs.  The web architecture does not. Nor does Jonathan's baseline, nor HTTP Range-14.  Never assume that what an IR is about is not itself an IR.)
> 
> Well, httpRange-14 sort of does talk about 'non-information resources', by necessary implication.  

Of course you can define the class but I said it isn't useful to talk about it.
That was an understatement.  It has wasted person-centuries of work.
Let me give a potted history for newcomers:

<pinch of salt>
1) The TAG wanted to settle whether, after a 200 response, the URI always referred to a document which you just got a representation of.
2) They - we - foolishly and cutely phrased the issue as "what is the range of the HTTP dereference function". Mistake.
   (This is the function mapping URI to HTTP entity, aka HTTP representation, aka content.  So its range would be representation -- but we meant what does the URI denote if you use it in, say, an RDF system -- more like the range of the denotation relation for HTTP hashless URIs.)
3) They figured the semantics of the HTTP deref function were the relationship between the name for a document and the contents of the document.
4) So in that case the domain of the function is name (URI in fact) and the range is representation, and the URI denotes a document (Information Resource in fact). Which is not a big deal.
5)  Nor is the exact definition of the class "document" a big deal.
6) People then for some reason thought, "oh, if I am running a server, then I must test everything I am serving to make sure it is an IR before I serve it" -- oh no -- how can I make that test?  We must have a decision algorithm! Mistake.
7) They should have asked "For each URI, what is the content of the document it names?"
8) Instead they argued for years about the edge cases of what exactly was and what wasn't an IR.
Is a book? Is a girl with a tattooed poem? A page which says it is a person? A fridge?
9) Instead they should have thought "Am I serving the contents of this, or am I serving data about this"?  If I am serving the contents then I will use its URI; otherwise I will use a different URI for the document.
9.5) People actually experimented -- served up girls with tattoos and pages opining they were not pages and everything. For years.
10) (Ignore DanBri who now suggests that you could argue forever about the difference between the content of something and a description of it. He only does it to annoy because he knows it teases.)
11) After a few years enough people such as Ian D said they wanted an alternative architecture, where you do a GET on the URI of a thing, and a document about the thing is returned, and the URI is not the URI of the document. That led to the adoption of the 303. Which is still a problem as it takes time.
12) Still people say "well, to know whether I use 200 or 303 I need to know if this sucker is an IR or NIR" when instead they should be saying "Well, am I going to serve the content of this sucker or information about it?". 
13) In fact lots of times, people serve information *about* something, not its contents, even though it has contents (it is an IR).
14) That's why I said "Never assume that what an IR is about is not itself an IR"
15) That's why I said "Not useful to talk about NIRs". In brief.  Or you can scan the email archive and see the long version.
</pinch of salt>
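
(A minimal sketch of the question in steps 9 and 12, added for illustration and not part of any spec. The names `Response` and `respond` and the description URI are assumptions. The server never tests "is this an IR?"; it only asks whether the body is the content of the thing named or information about it.)

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    status: int
    location: Optional[str] = None  # target of the redirect, if any


def respond(serving_content: bool, description_uri: Optional[str] = None) -> Response:
    """Pick a status for a GET on a hashless URI.

    serving_content: True if the body is the content of the thing the
    requested URI names; False if the body is information about it.
    description_uri: hypothetical separate URI for the describing document.
    """
    if serving_content:
        # The requested URI names the document whose content is returned.
        return Response(200)
    if description_uri is None:
        raise ValueError("a description needs a URI of its own")
    # The requested URI names the thing; the description lives elsewhere.
    return Response(303, description_uri)
```

For example, `respond(False, "http://example.org/cedric-description")` gives a 303 pointing at the description, while `respond(True)` gives a plain 200.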


> If the set of information resources (IR) is not the same as the set of all resources (R), then the set R\IR (which in any case exists) is non-null, and might as well be called the set of 'non-information-resources' as anything else.  But perhaps R\IR is a better notation. (I don't intend this to be hair-splitting)

What exactly do you mean by hair-splitting?

> Parenthetically, what _is_ IR?

You can't define a set we are going to use mathematically exactly in terms of the real world,
or people will always argue edge cases.  You can define a set of the things mathematically
in terms of each other, or you can try to define them in words like an encyclopaedia, 
but if you try to do both you get endless arguments, as people argue that the terms you 


Documents like Jonathan's carefully defined these functions in terms of each other,
and there have been millions of attempts to explain to different people on the lists
in terms they understand, but if you 

Parenthetically, what is a Resource? Actually, what is an architecture? 

>  Referring to Rees's editors draft [1], [issue-14-resolved] effectively says that iff a resource X is 200-retrieved, then it must _always_ be assigned to the set IR (the resolution seems to effectively define 'being 200-retrievable' as the definition of 'information resource', and this is consistent with [1] section 1.1 which says "One convention[...] was for a hashless URI to refer to the document-like entity ("information resource") served at that URI").

Yes.

> So my phrasing was intended to weaken [issue-14-resolved] to suggest that X being 200-retrievable puts X in IR, _only_ if the documentation about X (retrieved by conneg on X, say) does not put it in R\IR.

You see, you have helped demonstrate that talking about RNR is not helpful, 
as you have done it and you are asking the wrong question.  

The question is: is the response of the HTTP function the content of the thing, or a description of it?
Sure, if you know it is in RNR you know that the message can't be its content, as things in RNR
don't have content.  But the reverse is not true.
If you know it is in IR, you don't know whether the thing you got was content of it or a description
of it, as IRs can have both. 
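
(The asymmetry above, as a small sketch added for illustration. The function name and the three-valued result are assumptions: `False` means "cannot be content", `None` means "cannot tell".)

```python
from typing import Optional


def response_is_content(thing_is_ir: Optional[bool]) -> Optional[bool]:
    """What class membership alone says about a retrieved message.

    thing_is_ir: True if the thing is known to be an IR, False if it is
    known not to be one, None if nothing is known.
    """
    if thing_is_ir is False:
        # Things outside IR have no content, so the message is a description.
        return False
    # An IR can have both content and descriptions, so membership settles nothing.
    return None
```

Knowing the class only ever rules content out; it never rules it in.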



> How something is put into R\IR is a separate issue.  Perhaps there's a need for a class std:RnotIR, or perhaps this is up to the client, who may decide that discovering that 'X a foaf:Person' is enough to put it in R\IR for the client's purposes.
> 
> ----
> Example:
> 
> So, if X=http://example.org/cedric 200-returns
> 
>    <> foaf:name "Cedric".
> 
> then X is in IR, and oddly enough has a name (the domain of foaf:name isn't restricted to foaf:Person).  If it 200-returns
> 
>    <> foaf:name "Cedric"; 
>        a foaf:Person.
> 
> then the client should deem X to be in R\IR.


But supposing it says 

	<> dc:title "Moby Dick"; a foaf:Document ?

then you know nothing.

What if the page contains no RDF at all? Like about 10^11 pages. You know nothing.

=> Not useful to talk about NIRs.

Tim



> ----
> 
> This does mean that the RDF description document which has been retrieved from the URI X doesn't have a name at this point.  But if that matters to the owner of X (perhaps because they want to refer to how the description document is licensed), then this minority (?) situation can be managed by having retrieval of X produce
> 
>    <> a foaf:Person; 
>        eg:describedBy <http://example.org/cedric-description>.
>    <http://example.org/cedric-description> eg:licensed <cc-by>.
> 
> That places X in R\IR, and indicates a description document about which anything one wishes can be asserted.
> 
> All the best,
> 
> Norman
> 
> 
> [1] http://www.w3.org/2001/tag/doc/uddp-20120229/
> 
> -- 
> Norman Gray  :  http://nxg.me.uk
> SUPA School of Physics and Astronomy, University of Glasgow, UK
> 
>
Received on Monday, 26 March 2012 20:59:55 UTC