- From: Tim Berners-Lee <timbl@w3.org>
- Date: Mon, 1 Oct 2007 00:26:23 -0400
- To: Alan Ruttenberg <alanruttenberg@gmail.com>
- Cc: wangxiao@musc.edu, Misha Wolf <Misha.Wolf@reuters.com>, W3C-TAG <www-tag@w3.org>, semantic-web-ig list <semantic-web-ig.list@reuters.com>
On 2007-09-30, at 00:06, Alan Ruttenberg wrote:

> Richard, I am concerned, in this question, with content
> negotiation, although some of the same questions arise with
> redirection. Also, I am also not concerned with the http browser
> activities. My concern is the Semantic Web, and that the sort of
> answers and definitions which are offered for the traditional web
> do not seem to work, or at least are not understandable to me in
> the context of the semantic web.
>
> Xiaoshu,
>
> You wrote:
>> Content negotiation doesn't change the URI. The server returns
>> different representations for a particular request depending on
>> the MIME type and q score, but all the representation is under the
>> same URI.
>
> Let's examine the following situation:
>
> 3 URIs
> http://example.com/depict/alan
> http://example.com/depict/alan.jpg
> http://example.com/depict/alan.png
>
> http://example.com/depict/alan does content negotiation, and
> depending on whether the agent wants jpeg or png, redirects to one
> of the two other URIs. The bits for the jpg and the png are in a
> file on the server's file system.
>
> The question is what http://example.com/depict/alan is.

"http://example.com/depict/alan" is a URI identifying a "Generic
Resource". I wrote about this in
<http://www.w3.org/DesignIssues/Generic> IIRC (I am on a plane)
years ago.

The <http://example.com/depict/alan> resource is generic in that it
isn't specified to the level of what content-type is returned.
Genericness of a resource is not always about content-type; it can
also be with respect to version, or to the natural language used.

Most URIs on the web are generic in one or more directions. There
are not always individual URIs available for the specific resources,
but often there are. Generic resources are a valuable concept, as
most of the time we don't want to refer to just one specific version
in a specific format and a specific language.

> By my understand of your instruction, a web server should, in some
> circumstances, when asked for the resource identified by
> http://example.com/depict/alan sometimes return the bits from the jpg
> document, and sometimes return the bits from the png document.
> These two different sets of bits are both "identified" by the same
> URI, http://example.com/depict/alan, each of which should be
> considered a representation of http://example.com/depict/alan
>
> Similarly, when the web server is asked for the resource identified
> by http://example.com/depict/alan.jpg it should return the bits
> from the jpg document and we say that the resource returned is a
> representation of http://example.com/depict/alan.jpg

Well, we don't use those words that way.
A resource is not returned. A Representation is returned.
A representation is a structure of
a) the HTTP headers, which include one Content-type: image/jpeg, and
b) the bits of the picture.
(This is the term used in Roy Fielding's PhD thesis and in the TAG's
AWWW.)

The word 'representation' here is used in a technical sense, like
'packet' in IP, or 'internet message' in SMTP, or 'completed return'
in the IRS 1040 filing instructions.

> Now, I step back, and move into a language which I understand
> better. I consider the jpg and png files documents in the
> traditional and easier sense - they are a series of bits. They
> won't change. I consider a "copy" any other document that has
> exactly the same series of bits.

These are specific documents. Non-generic resources.
Any Representation sent back for them will always have the same bits.
OK.

Strictly, I wouldn't say the document *is* the bits.
The document is still a picture, a very specific one.
The Representation still needs the headers as well as the bits, as
you can't, in general, render the picture without knowing what
format it is in. The architecture is such, anyway, that you always
send a Representation.

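As a rough illustration of the point that a Representation is the headers together with the bits, here is a minimal Python sketch. The class name, the decoder table and the placeholder bytes are all made up for the example; nothing here comes from an actual HTTP library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """What HTTP sends back: metadata plus bits, never the resource itself."""
    content_type: str  # value of the Content-Type header
    body: bytes        # the bits of the picture


def decoder_for(rep):
    """Pick a (hypothetical) decoder name from the metadata, not the bits."""
    decoders = {"image/jpeg": "jpeg-decoder", "image/png": "png-decoder"}
    return decoders.get(rep.content_type)


bits = b"\x00\x01\x02"  # placeholder bytes, not a real image
as_jpeg = Representation("image/jpeg", bits)
as_png = Representation("image/png", bits)

# Identical bits, different metadata: two different Representations.
print(as_jpeg == as_png)
print(decoder_for(as_jpeg), decoder_for(as_png))
```

The two objects carry the same bytes but compare unequal, which is one way to see why the bits alone are not treated as the thing sent back.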
> I am thinking that I would like URIs to to identify this document.
> Naively, perhaps, I choose http://example.com/depict/alan.jpg, and
> http://example.com/depict/alan.png. If anyone asks me what I mean
> by resource in this case I will say: "By resource, I mean document,
> in the sense described".

Ok

> If they ask me what I mean by representation in this case, I will
> say: "I don't know. Ask Xiaoshu".

Hope that is clear now.

> From this mindset, I will at some time later encounter
> http://example.com/depict/alan. Upon accessing it, I get a series
> of bits.

And metadata. You get a Representation of the picture in the HTTP
sense.

> Upon examining the bits, I find that they are the same set of bits
> as the document. I say, oh, http://example.com/depict/alan
> identifies the same document as http://example.com/depict/alan.
> Conclusion http://example.com/depict/alan is an alias for
> http://example.com/depict/alan [.jpg]

Well, you have got matching representation bits (and content-type)
for each. This does not mean they are the same resource.
One, the generic one, may well be sent with a "Vary: Accept" header,
meaning that the result you will get for it can vary.
It may also have a "Content-Location: alan.jpg" header to let you
know a URI you can later use if you want to refer to the specific
resource.

So the HTTP representation in this case allows you to build a pretty
good picture of what is going on.

> Some time later I access http://example.com/depict/alan with a
> different agent. Upon accessing it, I get a series of bits. Upon
> examining the bits, I find that they are the same set of bits as
> the document. I say, oh, http://example.com/depict/alan identifies
> the same document as http://example.com/depict/alan.png. Conclusion
> http://example.com/depict/alan is an alias for
> http://example.com/depict/alan
>

Now you have a picture of the generic resource and two different
specific ones.

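The exchange above can be sketched in Python. This is only a toy model under stated assumptions: the in-memory table, the function names and the simplified Accept parsing are hypothetical, and a real server would also need to handle the case where nothing acceptable is available. It shows the generic URI answering with "Vary: Accept" and "Content-Location", while the specific URIs always answer with the same bits.

```python
# Hypothetical in-memory store: specific URI path -> (media type, bits).
SPECIFIC = {
    "/depict/alan.jpg": ("image/jpeg", b"<jpeg bits>"),
    "/depict/alan.png": ("image/png", b"<png bits>"),
}
GENERIC = "/depict/alan"


def parse_accept(header):
    """Return {media_range: q} from an Accept header (simplified parsing)."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs[parts[0]] = q
    return prefs


def respond(path, accept="*/*"):
    """Return (headers, body) for a GET on path."""
    if path in SPECIFIC:
        # Specific resource: same bits and same type every time.
        media_type, bits = SPECIFIC[path]
        return {"Content-Type": media_type}, bits
    if path != GENERIC:
        raise KeyError(path)

    prefs = parse_accept(accept)

    def score(item):
        media_type = item[1][0]
        fallback = prefs.get("image/*", prefs.get("*/*", 0.0))
        return prefs.get(media_type, fallback)

    # Generic resource: choose the variant the agent prefers most.
    best_path, (media_type, bits) = max(SPECIFIC.items(), key=score)
    headers = {
        "Content-Type": media_type,
        "Vary": "Accept",  # the result depends on the Accept header
        "Content-Location": best_path.rsplit("/", 1)[-1],
    }
    return headers, bits


generic_headers, generic_bits = respond(GENERIC, "image/png")
specific_headers, specific_bits = respond("/depict/alan.png")
print(generic_headers)
print(generic_bits == specific_bits)
```

The bits match, but the headers on the generic response tell the agent both that the answer may differ for another agent and which specific URI these particular bits belong to.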
> But wait, it is worse, I now have a URI that seems to break the
> rules, and identifies two documents
>

No, if you understand generic documents then you see that different
URIs are useful for the generic and specific versions, and all is
well.

In general, you are getting at ... is this a good thing, and what
should the Semantic Web do when referencing the document?

The answer is: almost always refer using the generic URI.
Most of the things you say about it, like licensing, or what it
depicts, are true of the image as a generic thing, independent of
what image formats the server might have available now or in the
future.

When you embed the image in a hypertext page, you use the generic
URI, so that browsers with different capabilities will still work.
This is standard practice for the images on the W3C site, for
example.

If you actually want to store the relationship between the various
generic and specific resources in RDF, then there is a little
ontology I made that you might find useful:
<http://www.w3.org/2006/gen/ont>. (There are also specific resources,
.n3 and .rdf, but please refer using the generic URI :-).
An example of its use is in <http://www.w3.org/2007/ont/meta>.

Tim
Received on Monday, 1 October 2007 14:45:40 UTC