Re: Historical - Re: Proposed IETF/W3C task force: "Resource meaning" Review of new HTTPbis text for 303 See Other from noah_mendelsohn@us.ibm.com on 2009-08-25 (www-tag@w3.org from August 2009)

From: <noah_mendelsohn@us.ibm.com>
Date: Tue, 25 Aug 2009 11:03:03 -0400
To: Mike Bergman <mike@mkbergman.com>
Cc: "Roy T. Fielding" <fielding@gbiv.com>, Julian Reschke <julian.reschke@gmx.de>, Larry Masinter <masinter@adobe.com>, Mark Nottingham <mnot@mnot.net>, Pat Hayes <phayes@ihmc.us>, Tim Berners-Lee <timbl@w3.org>, W3C TAG <www-tag@w3.org>
Message-ID: <OFBC8DD07C.B92297AF-ON8525761D.0050113B-8525761D.0052AC16@lotus.com>
Mike Bergman writes:

> I don't know whether they would accept the commission (as I have 
> suggested before [1]), but I again suggest the TAG appoint Roy 
> Fielding and Pat Hayes to work jointly to present to the TAG a 
> resolution to these vexing terminology and semantics issues.

I think I speak for the TAG in saying that we welcome constructive 
proposals from anyone in the community, and that would include Roy and/or 
Pat, should they decide to offer any in this area.  From a process point 
of view, the TAG doesn't "appoint" anyone other than TAG members to do 
work like this, though we do sometimes accept offers from people who want 
to work with us.  I expect that Roy and Pat have seen your suggestion. 
They both know the TAG and its ways very well, and if they want to 
undertake work on this, they'll know how to coordinate appropriately.

> Further, if the TAG were to agree in advance to accept a 
> consensus recommendation from them, I think that goodwill and 
> intelligence will prevail. I, for one, would agree to the 
> recommendation.

Honestly, I don't think this suggestion properly reflects the role of the 
TAG.  The TAG adds value when it offers a considered review of the pros 
and cons of particular drafts or other proposals.  We can't do that by 
agreeing in advance, even to suggestions from people as well respected as 
Roy and Pat. 

Noah
W3C TAG co-chair

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Mike Bergman <mike@mkbergman.com>
08/24/2009 09:01 PM
 
        To:     noah_mendelsohn@us.ibm.com
        cc:     Tim Berners-Lee <timbl@w3.org>, "Roy T. Fielding" 
<fielding@gbiv.com>, Julian Reschke <julian.reschke@gmx.de>, Larry 
Masinter <masinter@adobe.com>, Mark Nottingham <mnot@mnot.net>, Pat Hayes 
<phayes@ihmc.us>, W3C TAG <www-tag@w3.org>
        Subject:        Re: Historical - Re: Proposed IETF/W3C task force: 
"Resource meaning" Review  of new HTTPbis text for 303 See Other


Hi All,

I don't know whether they would accept the commission (as I have 
suggested before [1]), but I again suggest the TAG appoint Roy 
Fielding and Pat Hayes to work jointly to present to the TAG a 
resolution to these vexing terminology and semantics issues.

Further, if the TAG were to agree in advance to accept a 
consensus recommendation from them, I think that goodwill and 
intelligence will prevail. I, for one, would agree to the 
recommendation.

Thanks, Mike

[1] 
http://www.mkbergman.com/426/the-shaky-semantics-of-the-semantic-web/

noah_mendelsohn@us.ibm.com wrote:
> Tim Berners-Lee wrote:
> 
>> I would like to see what the documents all look like if edited to 
>> use the words Document and Thing, and eliminate Resource. That's my 
>> best bet as to two english words which mean as close as we can get 
>> to what we want.
> 
> Yes on "thing"; as you've heard me say from time to time, I continue to
> have reservations about the word "document".  No doubt "document" seems
> less intimidating than IR, and is often suggestive of what we mean.
> Still, I think it's actually too narrow, or at least troublingly
> ambiguous.
> 
> Maybe I've hung out with the XML crowd too long, but one of the things
> that I tend to think of as characteristic of "documents", as opposed to
> "data", is that they tend to have ordered content.  The order of the
> paragraphs in this email document is significant.
> 
> Now, let's say that I have a resource (thing) that consists of an
> unordered set of stock quotes.  Each quote is a {company name, price}
> pair, but there is no inherent or preferred order for the quotes.  As a
> practical matter, any particular representation sent through HTTP will
> likely have the quotes in one order or another, but that order is an
> artifact of the representation technology, just like the angle brackets,
> whitespace or other delimiters for the quotes.  A representation with
> the order changed would be equally appropriate.
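
A minimal sketch of the point above, assuming a JSON encoding of the quotes. The company names, prices, and the helper name parse_quotes are hypothetical and appear nowhere in the thread. It only illustrates that two representations differing in order can carry the same unordered set.

```python
import json

def parse_quotes(representation: str) -> frozenset[tuple[str, float]]:
    """Parse a JSON list of {"company", "price"} objects into an unordered set."""
    return frozenset((q["company"], q["price"]) for q in json.loads(representation))

# Two hypothetical representations of the same bag of quotes, in different orders.
rep_a = '[{"company": "Acme", "price": 10.5}, {"company": "Globex", "price": 22.0}]'
rep_b = '[{"company": "Globex", "price": 22.0}, {"company": "Acme", "price": 10.5}]'

# The order is an artifact of the encoding; the sets compare equal.
same_state = parse_quotes(rep_a) == parse_quotes(rep_b)
```
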
> 
> Question: is it OK to return a 200 for this bag of quotes?  I hope so.
> Do we call an unordered bag of quotes a document?  Well, we can, but I
> think it's a stretch.
> 
> I played some role in suggesting the term "Information Resource" to the
> TAG in 2004.  I acknowledge and regret that few seem to be pleased with
> it, but let me at least remind those who don't know how it came about. I
> wanted to find a term that more clearly covered cases like the one above
> (and relational tables, trees, graphs, and other data-like
> abstractions).  It occurred to me that Claude Shannon, in his theory of
> information, seemed to deal with exactly the sorts of abstractions for
> which we wanted to allow 200; i.e., those that could be represented by a
> sequence of bits, of agreed encoding.  Can you apply Shannon's theory
> (which is really about error rates and reliability) to attempts to
> transmit the text of the Gettysburg Address?  Yes, presuming sender and
> receiver can agree on an encoding.  Can you apply Shannon's theory to my
> bag of stock quotes or to the information filling the (unordered!) rows
> and columns of a relational table?  Yes.  Can you apply it to attempts
> to somehow transmit me, the three dimensional living TAG member with the
> unruly hair?  No.  So, it's just the distinction we want.
> 
> If everyone decides that on balance "document" is the lesser of the
> evils, I suppose I could go along with it, but I don't think it's quite
> right.  If we use it, we should at least try to explain what's really
> covered and what's not.  I still think that IR, in the sense intended,
> is closer to what we really mean.  (If I have to return a 303 for a bag
> of stock quotes, I'm going to be annoyed.)
> 
> Noah
> 
> --------------------------------------
> Noah Mendelsohn 
> IBM Corporation
> One Rogers Street
> Cambridge, MA 02142
> 1-617-693-4036
> --------------------------------------
> 
> Tim Berners-Lee <timbl@w3.org>
> Sent by: www-tag-request@w3.org
> 08/01/2009 10:14 PM
> 
>         To:     Pat Hayes <phayes@ihmc.us>
>         cc:     "Roy T. Fielding" <fielding@gbiv.com>, Larry Masinter 
> <masinter@adobe.com>, Julian Reschke <julian.reschke@gmx.de>, Mark 
> Nottingham <mnot@mnot.net>, W3C TAG <www-tag@w3.org>, (bcc: Noah 
> Mendelsohn/Cambridge/IBM)
>         Subject:        Historical - Re: Proposed IETF/W3C task force: 
> "Resource meaning" Review of new HTTPbis text for 303 See Other
> 
> 
> 
> On 2009-07-20, at 16:27, Pat Hayes wrote:
> [...]
> 
> . But this thread started because HTTPbis explicitly disagrees with RFC
> 3986 on what a resource is. Surely these various documents should at
> least agree on their uses of the basic technical terminology.
> 
> I agree. 
> 
> Historically, URIs were used to point to things like web pages and files
> and movies, on the web, useful documents, or "online resources" in the
> sense of useful things out there. FTP, Gopher and HTTP sites served up
> various types of online resources.  People got used to
> http://example.com/ being a web page and http://example.com/#contact
> being an anchor within it.
> 
> The Online Information community, into whose domain the web stuff was
> put for standardization at the IETF, referred to these things like web
> pages as resources, and changed the original "D" for "Document" in "UDI"
> to "R".
> Some felt that resource was a more appropriate term, maybe because
> "document" wasn't wide enough to include things like movies.
> 
> Now the URI spec actually allowed URIs for completely different things, 
> such as telephone end points, and wisely the URI spec does not make any 
> arbitrary constraint on what a resource should be, especially a resource
> denoted by a URI in a new scheme to be invented.
> 
> Meanwhile, the HTTP spec was polished and elaborated basically as a
> document delivery system, plus other methods for updating documents,
> plus POST.  (POST started historically as a way of introducing a new web
> page by posting it to a list, just as in NNTP.  It then almost
> immediately got used as a catch-all extension method. I will ignore it
> in this overview.)
> 
> There was no real definition of what a resource or document was -- maybe
> because it seemed obvious. The HTTP spec did not even specify whether
> the URI denoted a person or a document about them, it just explained
> that the thing returned was a representation of the resource.
> 
> Roy's REST work then came along to formalize HTTP as REST and declared
> that a resource was a time-varying mapping between URI and
> representation.  That was good enough for HTTP. It didn't have enough
> for the AWWW, when it came along, to be able to describe how the web
> worked.
> 
> In fact, the AWWW document, to explain how to use the web properly, had
> to add in a bunch of stuff about the social expectations -- things like,
> yes, the mapping from URI to representation is a function of time, but
> not just any old one -- a random function is not typically very useful.
> There are expectations about how it can change with time.  Persistence,
> consistency, with various common patterns which allow the web to be a
> useful medium.  The AWWW decided to use the term "Information Resource"
> for a thing like a web page which contains information, and "Resource"
> for any old thing at all.
> 
> So HTTP and the REST work was done very much in this space of document
> delivery, editing and update.  There was no philosophical need to talk
> about what the URI denoted (the person, the web page about the person)
> until RDF came along, when there was an immediate need.
> 
> When RDF was first developed, it was motivated by the need for data
> about resources very much in the online information sense: data about
> documents, or 'metadata'.  In fact it was designed to be able to
> describe anything, but many early users of RDF referred to it as
> metadata technology.  RDF used the word "resource" rather awkwardly in
> fact as it turned out.  In the beginning, many of the things being
> described were documents, and so the online information meaning of
> resource made sense. But in fact in RDF the resource was allowed to be
> anything at all. A class, rdf:Resource, even used the term as the
> universal class of all things.  A little later, the Web Ontology
> Language decided to use Thing for that.
> 
> RDF came along in what I think was a neat way.  It used completely
> existing web protocol extension devices to introduce a new system which
> was fundamentally different from the old HTTP+HTML one.  The HTML web
> was a hypertext model, with pages and anchors. The RDF model was a
> knowledge representation one of arbitrary things.  It did this by using
> the fact that a new language can define whatever it likes as what a
> local identifier denotes.  A graphic language might use local
> identifiers to denote lines and points. HTML used local identifiers to
> identify hypertext anchors.  RDF used them to identify arbitrary
> concepts, people, whatever.
> 
> The web architecture gave all these languages a common way of building a
> global identifier for the thing denoted by a local identifier in a given
> document.  The semantics of the hash sign are defined web-wide to mean
> that "a#b" can be used to denote whatever is denoted by "b" in the
> document denoted by "a".
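
A small sketch of the "a#b" convention described above, using Python's standard urllib.parse.urldefrag. The helper name split_hash_uri is hypothetical. The code only separates the document URI from the local identifier; what "b" denotes is still defined by the language of the document retrieved from "a".

```python
from urllib.parse import urldefrag

def split_hash_uri(uri: str) -> tuple[str, str]:
    """Split "a#b" into the document URI "a" and the local identifier "b"."""
    document, local_id = urldefrag(uri)
    return document, local_id

# The example URI from the thread: the document is foo.rdf, the local
# identifier is "color". A hashless URI yields an empty local identifier.
doc, local = split_hash_uri("http://example.com/foo.rdf#color")
plain_doc, plain_local = split_hash_uri("http://example.com/")
```
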
> 
> Worked a treat.  At the beginning of the century, people played around
> and gave all kinds of things URIs like "http://example.com/foo.rdf#color".
> Some of us did lots of work and made all kinds of systems which
> exchanged and integrated data in this way.
> 
> Two snags occurred, as the years passed.  One was that a bunch of RDF
> users got the fact that it was good to use HTTP URIs, but didn't get the
> fact that you should put the foo.rdf online so that people can look up
> what #color means in it.  And as they didn't do that, they didn't
> actually bother with the "#" at all.  The second fly in the ointment was
> that some people wanting to use RDF for large systems found that they
> didn't want to use the "#". This was sometimes because the number of
> things defined in the same file was too low (like 1) or too large (like
> a million) and it was difficult to divide up the information into
> middle-sized chunks. Or they just didn't like the "#" because it looks
> weird. But for one reason or another people demanded the right to be
> able to use http://example.net/people/Pat to denote Pat rather than a
> web page about Pat.
> 
> This potentially led to huge failures in the whole RDF world, with
> systems already built which just used "http://example.net/people/Pat" to
> identify the document whether you like it or not.
> I among others pushed back against using non-hash URIs for arbitrary
> things but eventually gave in.
> 
> So in response to this, the HTTP protocol was, in fact, changed.
> 
> The spec wasn't changed.  The spec editors were not brought on board to
> the new model.  The spec was interpreted.  The TAG negotiated in a way a
> truce between the existing HTTP spec, RDF systems, and people who wanted
> to use HTTP URIs without "#" to identify people.  That truce was
> HTTPRange-14, which said that you don't a priori know that a hashless
> HTTP URI denoted a document, but if the server responded with a 200 then
> you did, and you had a representation of the document.  If you did a
> GET on one of these new URIs which identified things which were not
> documents (people, RDF properties, classes, etc.) then the server must
> not return 200; it can return 303 pointing to a document which explains
> more.
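
A rough sketch of the client-side reasoning in the HTTPRange-14 truce as described above. The function name and the returned strings are hypothetical, and this is one reading of the paragraph rather than text from any spec. It performs no network access and only classifies a URI and a status code that the caller has already obtained.

```python
from urllib.parse import urldefrag

def what_is_known(uri: str, status: int) -> str:
    """Say what a client may conclude about the thing a URI denotes."""
    _, fragment = urldefrag(uri)
    if fragment:
        # Hash URIs are outside the truce: the fragment is interpreted
        # by the language of the document that was retrieved.
        return "whatever the local identifier denotes in the document"
    if status == 200:
        # A 200 for a hashless URI: it denotes a document, and the
        # response body is a representation of that document.
        return "document"
    if status == 303:
        # A 303 says nothing about what the URI denotes; the Location
        # header points at a document which explains more.
        return "unknown thing; see the document named by Location"
    return "unknown"

pat = what_is_known("http://example.net/people/Pat", 303)
page = what_is_known("http://example.com/", 200)
```
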
> 
> So the HTTP protocol was, effectively, changed.  The HTTP protocol as
> extended now allows HTTP to be used not only for Documents but for
> arbitrary Things.  It extends the set of things which you can ask a web
> server about from documents to anything.  It isn't a very bad design,
> nor very beautiful.  Other designs would have worked, but that one was
> the only one which didn't have major problems for some community.  It
> could be extended, but basically it works. It would be very expensive to
> reverse it in terms of systems which have been deployed.
> 
> It is also very expensive to go on debating it as though it is an open 
> issue. It is reasonable to try to make the documents more consistent. 
> 
> Anyway, that is a simplified version of the history of all this as I saw
> it.
> 
> I would like to see what the documents all look like if edited to use
> the words Document and Thing, and eliminate Resource. That's my best bet
> as to two English words which mean as close as we can get to what we
> want.  Note however that the web is a new system, a design in which new
> concepts are created, so we can't expect English words to exist to
> capture exactly the concepts. So we take those nearby and abuse them as
> little as we can as far as we can tell at the time, and then write them
> in initial caps to recognize that that is what we have done.
> 
> Tim 
> 

-- 
__________________________________________

Michael K. Bergman
CEO  Structured Dynamics LLC
319.621.5225
skype:michaelkbergman
http://structureddynamics.com
http://mkbergman.com
http://www.linkedin.com/in/mkbergman
__________________________________________
Received on Tuesday, 25 August 2009 15:03:49 UTC