Re: Uniform access to descriptions from Tim Berners-Lee on 2008-04-12 (www-tag@w3.org from April 2008)

From: Tim Berners-Lee <timbl@w3.org>
Date: Sat, 12 Apr 2008 10:12:30 -0400
To: wangxiao@musc.edu
Cc: Pat Hayes <phayes@ihmc.us>, Michaeljohn Clement <mj@mjclement.com>, "www-tag@w3.org WG" <www-tag@w3.org>, noah_mendelsohn@us.ibm.com, Jonathan Rees <jar@creativecommons.org>, Phil Archer <parcher@icra.org>, "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>
Message-Id: <16B81298-D5C0-4F5D-93CE-B23C47F0BECC@w3.org>
On 2008-04-12, at 06:08, Xiaoshu Wang wrote:

>
> Darn and thanks, Pat. I wish my English is that good.
>
> Xiaoshu
>
> Pat Hayes wrote:
>> Reading this exchange (below), I think I might be able to make  
>> Xiaoshu's case for him. (Xiaoshu, if I have misrepresented you at  
>> all, please forgive (and correct) me. But I got to this point from  
>> your recent emails (on and off list), so even if I'm wrong, you have  
>> to bear some of the responsibility :-)
>>


Ok, thanks Pat.  Whether or not you were successful in representing  
what Xiaoshu meant, you have put the argument on the table.

>> The central point is that now that we have the technology and ideas  
>> of the semantic web available, we have a wider range of ways of  
>> representing, and a richer notion of what words like 'metadata'  
>> mean. If we are willing to take fuller advantage of this new  
>> richness, we make available new ways to do semantic things within  
>> the same overall design of the pre-semantic web.
>>
>> In particular,  awww:represents is a very narrow sense of  
>> 'represents'.

Well, it is a specific part of the architecture which is now well  
defined.  It is a technical term.

>> Perhaps we can allow a wider sense of representation here.

I'd prefer you to use a different term.   We have tried, with Pat's  
guidance, to use terms like 'denote' in ways that the philosophical  
community which came before the AWWW would be happy with.  But  
'representation' in the AWWW is used in a technical sense, as 'Packet'  
is in the Internet Protocol. It is part of a technical design, and we  
are not free to take it in a wider sense without doing a great  
disservice to the community.

>> The REST story was always that URIs /identify/ resources, and that  
>> the http response is a /representation/ of the resource. Nobody has  
>> ever been able to say what exactly counts as a 'resource'.

No one can ever, in English, say exactly what anything is, my friend.

However, for better or worse, RDF  uses the word Resource to mean  
basically thing, and the AWWW uses Information Resource to mean  
basically document.

You can understand them in two ways.  One is to read the English,  
realize that your use of 'thing' and 'document' might not quite match  
that of the writers, and go with the flow until you see how they are  
used.  The other is to take them as technical terms, and just read  
them in the context of the specs.

>> We already have accepted the idea that a given resource may have  
>> many awww:representations, to be resolved by content negotiation.
>>
>> Now, take that story exactly as expressed,  but let the word  
>> 'identify' mean simply /denote/ or /name/,

As I think it does.

>> and allow that the /resource/ can be something entirely unconnected  
>> to the Internet (such as, say, me), and allow 'representation' to  
>> include not /just/ the awww:representation relationship between a  
>> byte stream and something like an html web page, but more  
>> generally /any kind of representation of a thing/, so that an image  
>> of me can be a representation of me, and an RDF description can be  
>> another representation of me, and my home page can be yet another  
>> representation of me - remember, here the resource in question is  
>> /me/, not some information resource. So, what follows from this  
>> vision? Well, it means that your insistence that the RDF and a JPEG  
>> image must be different resources is misplaced. Not that it's false,  
>> but it misses the point. Their role here is not as resources, but  
>> as /representations/. And seen in this light, it seems quite  
>> natural that one might use conneg to decide which of them is most  
>> appropriate.
>>
>> Now, of course, this is not how 'representation' has traditionally  
>> been used in Webarch discussions. It is not awww:representation.  
>> But it is a perfectly good usage of the word 'representation': in  
>> fact, somewhat better than the traditional webarch sense, which is  
>> so special and peculiar as to almost be a distortion.

The same is true of an Internet Packet.  The traditional sense of a  
packet for me really involves physical three-dimensional wrapping,  
almost always brown paper, and often string.  The use of the term  
'packet' for some string of bits is a co-option.  The technical world  
is full of such co-options of words, and complaining that they don't  
have their original meaning is inappropriate.  There IS no English  
word which is perfect, because webarch didn't exist before; it was  
invented, like concepts in new software systems every minute of each  
day.  The people who choose words to be co-opted do so with the best  
of intentions, and with a success which will clearly vary depending on  
the audience.  Others can bemoan an unfortunate choice, but the reader  
is not, for a technical term, in a position to say "actually this  
means something else".  This is how we communicate these days.


>> It requires us to generalize the 'classical' webarch story to allow  
>> a broader sense of '/representation/' and a broader sense of  
>> '/resource/' and a broader sense of '/identify/'. And I think  
>> Xiaoshu's main point is, let us try doing that, indeed, and see  
>> what happens; and in fact, one gets a coherent, rational story  
>> about how Web architecture should work. It isn't the REST model any  
>> more: it generalizes it to include a much wider range of  
>> possibilities. (We might call it REST++.) It is a Web much more  
>> infused with semantics and descriptions than the current Web, one  
>> which uses its own formalisms (RDF) more architecturally than the  
>> current Web. In this vision, the semantic Web isn't simply an  
>> application layer built on top of the pre-semantic Web, but instead  
>> is something more like an architectural generalization of the  
>> pre-semantic Web, with semantic technology built into its very  
>> architecture all the way down.

We could have done the same thing with the Web on top of the  
internet.  We could have protested that it was unnatural to build  
something which is fundamentally pages on top of something  
fundamentally bitstreams.

The point would be:

"let us try doing that, indeed, and see what happens; and in fact, one  
gets a coherent, rational story about how Internet architecture should  
work. It isn't the inter-network model any more: it generalizes it to  
include a much wider range of possibilities. (We might call it IP++.)  
It is a Net much more infused with pages and links than the current  
Net, one which uses its own formalisms (HTTP) more architecturally  
than the current Net. In this vision, the Web isn't simply an  
application layer built on top of the pre-web Net, but instead is  
something more like an architectural generalization of the pre-web  
Net, with web built into its very architecture all the way down".

It is always a choice.  Just think: routing tables in RDF.  In fact,  
DNS in RDF and HTTP is now a very sensible solution, which would allow  
digital signature of DNS using XMLDsig etc. and a lot less reinvention.

Two strong arguments against.  1. We can move on more quickly if we do  
not re-invent the lower layers, as the simple invariants which we  
happily assume of the TCP layer in fact take huge amounts of careful  
thought, engineering and administration to achieve.  2. We do not  
arrogantly assume that we will be the only net users doing interesting  
things, so we want to interconnect with other net-using services like  
email and peer-to-peer protocols and so on.


>> So, here's a typical Web transaction. A URI U /identifies/ a  
>> resource R, and when U is given to http, the Web delivers a  
>> /representation/ S of R. Typical classical case: R is a website (or  
>> a webpage or a server or an http endpoint, or... but anyway, it's  
>> something Internettish), U+http is a route to R and S is an  
>> awww:representation of R, which is typically a byte-for-byte copy  
>> of a file which comprises the bulk of R.  Alternative case using  
>> the more general senses: R is me, U denotes R and S is an RDF graph  
>> describing R, using FOAF. Describing is one way of representing.  
>> Another alternative sense: R is me, U denotes R and S is a JPEG  
>> image of R. Picturing is another way of representing. Now, these  
>> representations aren't awww:representations of me, of course; but  
>> they couldn't /possibly/ be, since I'm not the /kind of thing that  
>> can possibly have/ an awww:representation. So if we want to run the  
>> classical story with things like me - non-information resources -  
>> as R, then we /must/ generalize the classical notion of  
>> 'representation'.
>>



>> What these alternative cases have in common, and where they both  
>> differ from the traditional one, is that the Web 'thing' that is  
>> located by U+http and which returns the representation S simply  
>> isn't mentioned. It's not part of the story at all: it's not the  
>> resource, S doesn't represent it, and it's not what the URI  
>> identifies/denotes. It's just part of the Web machinery, a  
>> computational thing whose task is to transmit S when requested to  
>> do so. It has a relationship to R, of course, but rather an  
>> indirect one: it is a thing that delivers representations of R,  
>> using http. We might call it a /storyteller/ for R. R might have a  
>> whole lot of storytellers, each capable of telling different kinds  
>> of story about R.  The classical case is where R is its own  
>> storyteller. This is different from the classical REST/webarch  
>> story, indeed: but then, as soon as we allow URIs to identify  
>> things that can't be accessed by transmission protocols, the  
>> classical story stopped working. We have to broaden our horizons.  
>> But notice that it follows the same basic description as the  
>> classical story, just using the terminology more broadly.

So the pictures and the web pages and the RDF documents are not first  
class objects, and do not have names.  It certainly is not the web.   
Sure, you could build it.  Semantics Transfer Protocol. It would be an  
interesting study.

I contend that it is actually not very useful to get back S without  
knowing what its relationship to R is.  Of course, if it is RDF about  
R it can say of its own accord.  If it is a JPEG we don't know whether  
it is R or is a JPEG encoding of R or a single frame taken from R or a  
picture of R one night in a bar.

Two designs suggest themselves.   In one, the relationship is  
negotiated.   The client sends a request including a header something  
like:

Accept-response: pictureOf, meaningOf, directionsToHouseOf, stuffAbout

and the server responds including a header something like

Response: pictureOf  ; env="bar"; time="00:26"
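
A minimal sketch of how that negotiated design might behave, in Python. The header names "Accept-response" and "Response" are only the illustrative ones used above, not real HTTP headers, and the helper functions and the table of available relationships are assumptions made up for this example.

```python
# Hypothetical sketch: "Accept-response" and "Response" are the illustrative
# header names from the message above, not headers defined by HTTP.

def parse_accept_response(value):
    """Split a comma-separated Accept-response value into relationship names."""
    return [item.strip() for item in value.split(",") if item.strip()]

def choose_relationship(accept_header, available):
    """Return the first client-preferred relationship the server offers, else None."""
    for relation in parse_accept_response(accept_header):
        if relation in available:
            return relation
    return None

def format_response_header(relation, params):
    """Build a Response header value from a relationship name and its parameters."""
    parts = [relation] + ['%s="%s"' % (key, val) for key, val in params.items()]
    return "; ".join(parts)

# What the client asks for, in order of preference.
accept = "pictureOf, meaningOf, directionsToHouseOf, stuffAbout"

# What this (made-up) server can say about R, with parameters per relationship.
available = {
    "pictureOf": {"env": "bar", "time": "00:26"},
    "stuffAbout": {},
}

chosen = choose_relationship(accept, available)
header = format_response_header(chosen, available[chosen])
print("Response: " + header)
```

The point of the sketch is only that the relationship between S and R travels alongside S, so the client is not left guessing what kind of picture it received.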

The other design is that the returned thing is always just a set of  
assertions by the server,  explaining the relationships involved.   If  
you like, you can attach anything but the cover note has the semantics  
of a message from the publisher to the reader.  It might say things  
like "The R you requested is a person, their name is Archibald, and we  
know of two photos, the first being a mugshot and the second a holiday  
snap"
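
As a rough illustration of such a cover note, here it is written as subject/predicate/object triples in Python. The "ex:" names and the helper function are hypothetical, invented for this sketch; they are not taken from any real vocabulary or library.

```python
# Hypothetical cover note for the resource R, as plain triples.
# All "ex:" terms are made up for illustration.
R = "ex:R"

cover_note = [
    (R, "rdf:type", "ex:Person"),
    (R, "ex:name", "Archibald"),
    ("ex:photo1", "ex:pictureOf", R),
    ("ex:photo1", "rdf:type", "ex:Mugshot"),
    ("ex:photo2", "ex:pictureOf", R),
    ("ex:photo2", "rdf:type", "ex:HolidaySnap"),
]

def subjects(triples, predicate, obj):
    """Return every subject s for which (s, predicate, obj) is asserted."""
    return [s for (s, p, o) in triples if p == predicate and o == obj]

# The reader can pick out whatever the publisher chose to assert.
photos = subjects(cover_note, "ex:pictureOf", R)
mugshot_set = set(subjects(cover_note, "rdf:type", "ex:Mugshot"))
mugshots = [photo for photo in photos if photo in mugshot_set]
print(photos, mugshots)
```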

The trouble is there is no way for the client to direct the search.
Suppose the client wants to get a mugshot of R.   This may or may  
not have a URI itself, Rm.
In either case, the client can ask as long as it likes but may always  
get back the information that Rm is a photo of R.   It asks for a  
JPEG, and gets back a picture of the relationship between Rm and R as  
circles and arrows.  Well, that is in the new world a representation  
of Rm, so I guess it has to be content.  Or maybe all photos are  
served in http: space, not stp: space.


>> In this view, then, content negotiation is a much wider topic than  
>> it has traditionally been. We are dealing with a much wider notion  
>> of what a 'resource' is, and a much wider notion of what a  
>> 'representation' is. Some resources have /all kinds/ of possible  
>> representations. So yes, we have to be prepared to go beyond  
>> 'accepted and expected usage'. Who would have thought otherwise?


Well, the interesting thing about IP is that it was built on top of the  
Ethernet system without going beyond Ethernet's 'accepted and expected  
usage' one single bit.   And the web was built on top of TCP/IP  
without going outside TCP/IP's 'accepted and expected usage' enough  
for us to actually modify TCP/IP at all.   So an agent capable of  
induction might well have thought otherwise.


          http:
Internet:      //www.w3.org/
Web:                        People/Berners-Lee/card
SemWeb:                                            #i

When you look at the URI you can see the archaeology, you can count  
the rings of the tree.  You can see how each layer leverages the  
previous layer.  #i denotes a person as described by a document  
People/Berners-Lee/card in a domain controlled by the owners of  
www.w3.org.  The semantic web in this way builds on a lot of existing  
social and technical architecture.
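
The layering can be seen mechanically by splitting the URI with Python's standard urllib.parse. This is only a sketch; the layer labels in the dictionary are the ones used in the diagram above, not anything the library itself defines.

```python
from urllib.parse import urlsplit

uri = "http://www.w3.org/People/Berners-Lee/card#i"
parts = urlsplit(uri)

# Map each syntactic piece of the URI to the layer that gives it meaning.
# The layer names are labels for this illustration only.
layers = {
    "scheme": parts.scheme + ":",             # protocol used to dereference
    "Internet": "//" + parts.netloc + "/",    # host, resolved through DNS
    "Web": parts.path.lstrip("/"),            # document served by that host
    "SemWeb": "#" + parts.fragment,           # thing described in the document
}

for name, value in layers.items():
    print("%-9s %s" % (name, value))

# Concatenating the pieces in order gives back the original URI.
rebuilt = "".join(layers.values())
```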

Feel free, Pat (Xiaoshu), to build such an stp: system.   Feel free to  
use it to inform the design of HTTP and maybe help us adjust HTTP.  
But do not feel free to misrepresent what technical terms in the web  
architecture mean -- you have to pick other terms.

>>
>>
>> Pat
>>
Received on Saturday, 12 April 2008 14:13:09 UTC