- From: Booth, David (HP Software - Boston) <dbooth@hp.com>
- Date: Sat, 25 Aug 2007 22:44:04 -0400
- To: "Leo Sauermann" <leo.sauermann@dfki.de>, "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>
- Cc: "Richard Cyganiak" <richard@cyganiak.de>, "Susie Stephens" <susie.stephens@gmail.com>, "Ivan Herman" <ivan@w3.org>, <www-tag@w3.org>
Leo & Richard,

I looked at your latest version of http://www.dfki.uni-kl.de/~sauermann/2006/11/cooluris/ and again, kudos for doing this. It is a very helpful document, and very readable. I do have a few suggestions though. Most are relatively minor, but one is quite important (the media type limitation of hash URIs).

1. I notice that you sometimes use the word "resource" to specifically mean "non-information resource":

[[ There should be no confusion between identifiers for documents (URLs) and resource identifiers. ]]

and sometimes you use it more broadly:

[[ Another question is where to draw the line between traditional web documents and other, non-document resources. ]]

I think it would be best to use the word "resource" only in the broad sense (consistent with the WebArch use of the term), and use a more specific term when you wish to refer specifically to "non-information resources". The presentation does a nice job of motivating the difference between information resources and non-information resources, so I personally would suggest just using "non-information resource" when that is what you mean, even if it is rather cumbersome.

2. One piece of advice that I only see implicitly in Figure 2, but I think is worth stating more visibly: In setting up content negotiation to serve or redirect to RDF or HTML, the human-oriented version (HTML) should be the default. This way naive dereferencing will yield something human oriented. (It is much more reasonable to require Semantic Web apps to set the Accept headers properly to receive RDF than to require naive human users to set them properly to receive HTML.) Furthermore, the human-oriented result should indicate both ways that the RDF version can be obtained: by setting the Accept headers properly; or via the RDF-specific URL, which should be provided.

3. Regarding this:

[[ According to W3C guidelines, we may have a web document if all its essential characteristics can be conveyed in a message.
This is not a very precise definition. Our recommendation is to err on the side of caution: Whenever an object of interest is not clearly and obviously a document, then it's better to use two distinct URIs, one for the resource and another one for the document describing it. ]]

I suggest *not* quoting the current WebArch definition of "information resource", because it is so badly flawed that I think it does more harm than good to repeat it. I think the exposition prior to this point has already done quite a good job of explaining the difference between information resource and non-information resource, so I think it would be better to just change these sentences to something like:

[[ Since the current definition of "information resource" is not entirely clear, our recommendation is to err on the side of caution: Whenever an object of interest is not clearly and obviously a Web page, then it's better to use two distinct URIs, one for the resource and another one for the document describing it. ]]

(I changed the word "document" to "Web page" to be more inclusive of dynamically generated content, but you might find still better ways to express this.)

4. I note that using content negotiation with 303-redirect in the manner you describe means that there is no single URI that (abstractly) denotes the information resource as a whole that describes the resource in question. (In other words, another way to do this would be to 303-redirect to a generic information resource URI, and then use content negotiation when that URI is accessed.) I don't see a big harm in this, and it does eliminate an extra dereferencing step.

5. These statements are not quite correct:

[[ This means a URI that includes a hash cannot be retrieved directly, and therefore cannot identify a web document. We can use them to identify other, non-document resources, without creating ambiguity. ]]

The meaning of the fragment identifier is determined by the media type of the representation that is returned.
Depending on the media type, the URI with the fragment identifier *might* be able to identify other, non-document resources. But for HTML, for example, the fragment identifier is used to identify a portion of the returned document. This has significant consequences for the use of hash URIs. I suggest rewording this to something like:

[[ The part before the hash symbol ("#") is called the racine. Thus, in some cases the URI with the fragment identifier can be used to identify a non-document resource, and clients can dereference the racine to find an associated web document that describes the non-document resource. However, there is a catch: the meaning of the fragment identifier depends on the media type that is returned when the racine is dereferenced. If RDF is returned, then the fragment identifier can identify an arbitrary non-document resource.[ref: http://www.ietf.org/rfc/rfc3870.txt] But if HTML is returned (for example), then the fragment identifier designates an element within the document,[ref: http://www.ietf.org/rfc/rfc2854.txt] and thus the URI with the fragment identifier cannot be used to identify an arbitrary non-document resource. Consequently there is a trade-off in using hash URIs to identify non-document resources, because the hash URI limits the future media types that you can serve. If you only ever wish to serve RDF, then hash URIs will work fine. But if you think you may someday wish to serve HTML in addition or instead, then you should use 303 URIs. ]]

6. In your example of 303 URIs, http://www.acme.com/id/alice 303-redirects either to http://www.acme.com/data/alice (for RDF) or http://www.acme.com/people/alice (for HTML).
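For illustration only, the 303 behavior in this example, combined with the HTML default suggested in comment 2, might be sketched roughly as below. This is a Python sketch under stated assumptions, not something taken from the document: the function names `parse_accept` and `redirect_target` are hypothetical, and wildcard ranges such as `*/*` are deliberately ignored so that anything short of an explicit RDF preference falls through to HTML.

```python
# Hypothetical sketch: choose the 303 redirect target for a URI such as
# http://www.acme.com/id/alice, defaulting to the human-oriented HTML page.

RDF_TYPE = "application/rdf+xml"
HTML_TYPES = ("text/html", "application/xhtml+xml")


def parse_accept(header):
    """Return a list of (media_type, q) pairs from an Accept header value."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((media_type, q))
    return prefs


def redirect_target(path, accept_header=""):
    """Return (status, location) for a request to /id/<name>."""
    name = path.rsplit("/", 1)[-1]
    prefs = dict(parse_accept(accept_header or ""))
    rdf_q = prefs.get(RDF_TYPE, 0.0)
    html_q = max(prefs.get(t, 0.0) for t in HTML_TYPES)
    # Assumption: RDF is chosen only when it is explicitly listed and
    # strictly preferred over HTML; every other case gets the HTML page.
    if rdf_q > 0 and rdf_q > html_q:
        return 303, "http://www.acme.com/data/" + name
    return 303, "http://www.acme.com/people/" + name
```

With this sketch, a request carrying no Accept header (or only `*/*`) is redirected to the HTML document, while a Semantic Web client sending `Accept: application/rdf+xml` is redirected to the RDF document.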
Since the RDF and HTML versions are in some sense intended to provide different representations of the same information (either for machine or human consumption), I would think it would be administratively more natural to instead use something like: http://www.acme.com/description/alice.rdf and http://www.acme.com/description/alice.html for the two document URIs. Obviously the choice is up to the administrator, but I thought I would mention it, because the example looks a little odd to my eyes in this way.

7. Section 4.3 ("Choosing between 303 and hash") needs to mention the media type limitation of hash URIs explained in comment #5 above.

Again, thank you for this very valuable contribution to the community.

David Booth, Ph.D.
HP Software
+1 617 629 8881 office | dbooth@hp.com
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do not represent the official views of HP unless explicitly stated otherwise.
Received on Sunday, 26 August 2007 02:48:19 UTC