- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Sat, 9 Aug 2008 10:12:37 +0100
- To: wangxiao@musc.edu
- Cc: Sebastien Lambla <seb@serialseb.com>, "T.V Raman" <raman@google.com>, john.kemp@nokia.com, www-tag@w3.org, kidehen@openlinksw.com, tthibodeau@openlinksw.com
Xiaoshu,

On 8 Aug 2008, at 13:20, Xiaoshu Wang wrote:
>> A resource is anything named by a URI.
>>
>> A representation is a <bitstream, MIME type> tuple.
> It sounds simple but not really. For instance, is "http://dfdf.inesc-id.pt/tr/doc/web-arch/img/fig2"
> a resource or a representation?

It's a URI, a string starting with "http://". You probably meant to ask what it *identifies*. Well, it evidently identifies a resource, because that's what URIs do. I don't know what that resource is supposed to be, so I cannot tell if it is a representation or not. Only the person who minted the URI knows, and has not chosen to tell us explicitly.

> A *representation*, in my opinion, is what is delivered to the client,

Too simple. A representation is a <bitstream, MIME type> tuple. Representations are sometimes delivered to clients, sometimes sent from the client to the server, sometimes stored in a cache and so on. It's the real bits going over the wire.

> a *resource* is whatever the provider intends it to be.

Yes.

> They are always different - not in the sense if they are *bitstream*
> or not.

They usually are, but not by definition. A URI can name anything. So I can name a particular <bitstream, MIME type> tuple with a URI, and hence the URI identifies a resource that is a representation. (I'm not saying that this is a particularly useful thing to do. In fact, don't do it.)

> A *representation* of a resource denoted by a URI doesn't have to be
> delivered electronically. This is the often wrongly conceived idea
> that http-URI must be bound to HTTP protocol. If I bind a URI to a
> postal service as its transportation protocol, you (in principle)
> can go to a postal office to request "http://dfdf.inesc-id.pt/tr/doc/web-arch/img/fig2"
> and a hardcopy print, which is NOT a bitstream, can be delivered
> to you. Do you consider that picture a resource or a representation?

The hardcopy picture is a rendering of the <bitstream, MIME type> tuple on paper.
The hardcopy picture is neither a resource (unless someone assigns a URI to it, but again, let's not go there), nor is it a representation (because it's not a <bitstream, MIME type> tuple, but a piece of paper with ink on it).

This is irrelevant to Web architecture, by the way. The Web is an abstract machine that, among other things, emits representations (<bitstream, MIME type> tuples) in response to GET requests. What we do with the bits afterwards -- render them on a screen, print them on paper, store them on a disk -- is outside the scope of Web architecture and I don't see why we should talk about this.

>> That's a simple and objective distinction. The interesting and
>> subjective question is how to best model an application using those
>> two modelling primitives.
>>
>> There are two schools of thought on this. One school maintains a
>> distinction between “documents” and “things described in the
>> documents” in their modelling; the other school says that this
>> distinction is unnecessary.
>>
>> The former modelling has been elevated to an axiom of Web
>> architecture by the httpRange-14 decision. It has many advantages
>> over the latter (cleaner handling of metadata, enables grouping of
>> many descriptions in a single document, ...), and it has some
>> disadvantages (more complex, ...). These have been discussed
>> endlessly and I have no interest in resurrecting that debate.
>>
>>> There are only two choices. (1) As T.V Raman said, don't make any
>>> distinction between them.
>>
>> The distinction is made in the specs; and it's made by a large and
>> significant part of the Web community (see REST). Raman might not
>> consider it important, and that worries me a bit, but it doesn't
>> diminish the importance of the distinction.
>>
>>> (2) As I have always proposed, to make an absolute distinction.
>>> That is: to think every URI denotes a *resource*
>>
>> No one disagrees with this.
>>
>>> and what is dereferenced from the URI is the *representation* of
>>> that resource.
>>
>> Not quite. A representation is what you get back when you do a GET
>> on the resource, or what you send when you do a POST/PUT.
> I am not sure how many people will agree on the latter part. I
> cannot.

People talk much more about GET than about POST and PUT, but I'm pretty sure that I have correctly captured the spirit of the HTTP spec, of Roy's Chapter 5, and of AWWW when I say that we can change a resource's state by submitting a representation using POST or PUT. If you disagree, I'm pretty sure that we are simply using language differently, and you should probably use another term instead of “representation” for what you have in mind. (“state of the resource”?)

Repeat after me: Representations are <bitstream, MIME type> tuples.

>> Dereferencing is the process of “reaching through the network” in
>> order to perform one of the supported operations on a resource.
>>
>>> To think whether something is *in* a document or not is just a
>>> form of self-contradiction because the goal is to make the web
>>> *self-descriptive*. Hence, a document (or resource) is both in
>>> and not in itself.
>>
>> What do you mean when you say “something is in a document”? I can
>> understand the phrase “something is described in a document”.
>> Obviously a document can describe itself. I don't see the
>> contradiction.
> I intended it the same way you described that "something described
> inside a document". I am trying to understand what you mean that
> "303 redirects are about creating URIs for “things described inside
> documents”. Do you mean if something talks about itself, it should
> 303 redirect? Or something else?

I mean something else. I have this notion in my head that the Web is a collection of documents, and a web document is not the same as the things the web document talks about.
Hence it's better not to use the same URI for a web document and the things the web document talks about (except where it talks about itself).

I can't state it any simpler than this. I consider it self-evident. If you don't agree, I give up trying to communicate this idea and we just have to accept that we live in different realities ;-)

>> [snip]
>>>> some_resource
>>>>     |
>>>>     +--303--> description_of_some_resource
>>>>                   |
>>>>                   +--Content-Location-->
>>>> description_of_some_resource.{html|rdf}
>>>>
>>>> That's the clean and proper way of combining the 303 approach
>>>> with content negotiation!
>>> It is *clean* only when the distinction of *resource* vs.
>>> *representation/description* is unambiguous, which hardly is.
>>
>> Those who use the approach described here simply make a modelling
>> distinction between documents and the things described in the
>> documents. That distinction *is* unambiguous. (But it is
>> subjective.) The described approach is “clean” in the sense of HTTP
>> interactions. And it is “clean” in that it enables the modelling
>> style described above.
> But the web is about facilitating ad hoc communication. If you have
> an unambiguous but subjective distinction and I have mine? Is it
> going to be unambiguous or not when we intend to communicate with
> each other?

I can unambiguously communicate my subjective choice, and you can recognize the difference in our choices and work around it. No technical solution will protect you from subjectivity on the Web.

>>> In either case, i.e., the (1) and (2) solution mentioned above,
>>> 303 is unnecessary. Sure, it does no harm. But it does slow down
>>> the web and our goal should be to make the web more efficient but
>>> less.
>>
>> That's why I keep insisting that you should use hash URIs, which do
>> not exhibit this downside, instead of the 303 approach. The
>> solutions mentioned above are for those who have decided to use
>> 303s anyway, despite the well-known downsides.
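
The 303-plus-content-negotiation flow in the quoted diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paths mirror the names in the diagram, while `handle_get`, `VARIANTS`, the placeholder bodies, and the very naive Accept handling (no q-values) are hypothetical and not taken from any real server.

```python
from typing import Dict, NamedTuple, Optional


class Representation(NamedTuple):
    """A <bitstream, MIME type> tuple."""
    bitstream: bytes
    mime_type: str


class Response(NamedTuple):
    status: int
    headers: Dict[str, str]
    representation: Optional[Representation]


# Hypothetical variants of the description document, keyed by MIME type.
VARIANTS = {
    "text/html": ("/description_of_some_resource.html", b"<html>...</html>"),
    "application/rdf+xml": ("/description_of_some_resource.rdf", b"<rdf:RDF/>"),
}


def handle_get(path: str, accept: str = "text/html") -> Response:
    """Model the server side of the diagram for a single GET request."""
    if path == "/some_resource":
        # The thing itself: no representation, just a pointer to its description.
        return Response(303, {"Location": "/description_of_some_resource"}, None)
    if path == "/description_of_some_resource":
        # Naive negotiation: first requested type we can serve, else HTML.
        requested = [part.split(";")[0].strip() for part in accept.split(",")]
        mime_type = next((m for m in requested if m in VARIANTS), "text/html")
        location, bitstream = VARIANTS[mime_type]
        headers = {
            "Content-Type": mime_type,
            "Content-Location": location,  # names the variant actually sent
            "Vary": "Accept",
        }
        return Response(200, headers, Representation(bitstream, mime_type))
    return Response(404, {}, None)
```

A client asking for RDF would call `handle_get("/some_resource", "application/rdf+xml")`, follow the `Location` header with the same Accept value, and receive a 200 whose `Content-Location` points at the `.rdf` variant. The extra round trip in that sequence is the cost of the 303 approach discussed here.
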
> First, there is reality issue - such as dublin core etc., already
> uses slash.

If the additional redirect becomes a serious performance problem, then the 303 users will be slowly losing linkshare to hash-based alternatives. Let the market fairy sort it out.

> Second, there are clear use cases where #hash URI is not
> appropriate. Consider the document size, if a domain vocabulary,
> such as that of SNOMD, will be using #URI.

This is not an issue. You can do <document1#it>, <document2#it> and so on. You can chunk your documents any way you like. Granted, it's less flexible than 303 redirects.

> Third, there is still the issue of the nature of resource because
> what a hash URI denotes when there are multiple representation/
> variants, isn't clearly defined. If a (generic) resource say http://example.com/gr
> has two representations, an RDF and an HTML one, that can be get
> via conneg. Let's give each of them a URI http://example.com/gr.rdf
> and http://example.com/gr.html. What will be http://example.com/gr#a
> denoting?

It will denote whatever the variants say it denotes. It's possible to coordinate the variants to make them communicate the same idea. But you are right, this is a somewhat messy area and the different specs involved don't answer all questions. I think however that we have a quite clear understanding of what the desirable answers would be, that is, what the specs *should* say to make things work out.

> And what is the relationship between http://example.com/gr.rdf#a and http://example.com/gr.html#a
> ?

There is none, unless explicitly stated.

> This is a muddled area, which I hope TAG can find time to give some
> recommendations.

+1.

Best,
Richard

>
>
> Regards,
>
> Xiaoshu
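
The hash URI points in the message above can be illustrated with Python's standard `urllib.parse.urldefrag`. This is a minimal sketch, not a statement of what any spec mandates beyond the fact that HTTP clients do not send the fragment to the server; the helper name `document_to_fetch` and the `HASH_URIS` list are hypothetical, and the URIs are the example.com ones from the quoted question.

```python
from urllib.parse import urldefrag

# The three hash URIs from the quoted question.
HASH_URIS = [
    "http://example.com/gr#a",
    "http://example.com/gr.rdf#a",
    "http://example.com/gr.html#a",
]


def document_to_fetch(uri: str) -> str:
    """Return the URI a client would actually request for a hash URI.

    The fragment is stripped on the client side, so a single GET on the
    document suffices and no 303 round trip is involved.
    """
    document, _fragment = urldefrag(uri)
    return document


for uri in HASH_URIS:
    print(uri, "is looked up by fetching", document_to_fetch(uri))
```

The three strings are three distinct identifiers that resolve through three distinct documents, which matches the point that nothing relates them unless some document states the relationship explicitly. What `http://example.com/gr#a` denotes then depends on what the negotiated variant of `http://example.com/gr` says about it.
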
Received on Saturday, 9 August 2008 09:13:15 UTC