- From: Tim Berners-Lee <timbl@w3.org>
- Date: Sat, 23 Feb 2008 22:25:37 -0700
- To: public-sweo-ig@w3.org
- Cc: tag@w3.org
I am commenting on the Working Draft of 17 December 2007.
http://www.w3.org/TR/2007/WD-cooluris-20071217/

The comments vary in weight, but I keep them in document order. My 30 comments are marked by **. I hope they make sense.

I think this is an important document. It contains a large amount of very valuable material, a few places which are confusing, and a small number of places which are, I believe, actively misleading. There are also one or two places where I disagree about the recommendation it makes. On the whole, though, the document is important and I hope energy is found to incorporate these comments.

timbl
______________________________________________

1. Introduction

** Suggest add a reference to the N3 Primer as an introduction to RDF which some find easier to get into.

2. Last para before 2.1.

** Delete the sentence "In short, to locate a Web document—hence the term URL (Uniform Resource Locator)". (Don't go there. Don't try to distinguish between names and locations. See for example http://www.w3.org/DesignIssues/NameMyth.html)

** I would note that "URL simply identifies whatever we see when we type it into a browser" is simplistic, as users are aware of the difference between links and permalinks in a blog for example.

2.1

** After
"HTTP/1.1 200 OK
Content-Type: text/html
Content-Language: en"
add
"Content-Location: alice-en.html"

** Start a new paragraph at "...English <p> Content ..."

** Add text to show people how conneg works and how the client understands the content-type specific URI.

** Technical bug: The example uses 302 Found to redirect according to the Accept: headers. This is *not* advisable IMHO. It uses an extra round trip to no advantage. Conneg should be done directly. I suggest replacing this example with one without the 302 "twist".

3. URI for Real-world objects

** In general, remove the term "non-information resource" from the entire document. Replace it with "thing". It is wrong.
It is used misleadingly to mean "A thing, which is not necessarily an information resource".

** It would, I think, in a document like this be best to stick with "web document" instead of "information resource" too, but that is just for readability. It is already done in places.

** Delete "We call all these real-world objects or (according to WWW-Arch) non-information resources." (It is a bad term, as explained above, and the AWWW does not use it at all).

3 .. Box "1 Be on the web".

** Important architectural philosophical point. Replace
"Machines should get RDF data and humans should get a readable representation, such as HTML."
with
"Machines, and humans through user agents, should get data in RDF (and related standards). In some cases, it may be useful to provide a view of the data in HTML for users with conventional web browsers without data functionality"

** Add text after the box.
"This document describes ways of serving both raw data and hypertext views of data. Remember that the most important duty of the provider of data is to provide the data as soon as possible, and raw. [ref to the blog "Give Us the Data Raw, and Give it to Us Now" http://blog.okfn.org/2007/11/07/give-us-the-data-raw-and-give-it-to-us-now/ ]. Other sites and other applications can often produce hypertext and graphical views of the data. Data such as calendar events, RSS events, bank statements, etc are much more powerfully displayed using multiple client-side views. That said, the ability to dereference a URI in an existing browser and get meaningful results is valuable, and so provision of HTML, if it can be done without undue cost, is valuable. This document describes various ways of doing this."

Diagram before 3.1

** The relationships are a bit vague. I think the relationships expressed by the arrows in the diagram are both "description". The two describing documents have different content-types.
Maybe change the arrows to read "description", and add "read by semantic web applications" under the RDF and "Read by web browsers" under the HTML.

3.1 Distinguishing between web documents and real-world objects

** This section has major flaws in its argumentation.

It says "Above we assumed that there is a distinction between web documents (information resources) and real-world, non-document objects (non-information resources). The question is where to draw the line between them."

That is, with respect, NOT the question. That question is one which has proved unproductive. It is not fruitful to try to define from scratch "Information resource".

The question is to distinguish between something and a document about something. That distinction has been introduced already in the document and explained well. Now we have to explain that 200 means "Here is the content of the document you requested" and 303 means "Here is the URI of a document about the thing you requested". When that has been explained, then the class of things which get a 200 will be clear by people understanding the protocol.

Later, it says "The problem now is that web documents are also part of our perceived world, hence they are real-world objects in their own right." But this is NOT a problem, once you have thrown out "non-information resources" and replaced it with "things".

((For example, mobydick#this may denote a book, and mobydick may denote a library catalog card about the book. Both the book and the card are documents, one is about the other. That is the relationship which is important.))

I propose removing section 3.1.

4.1. Hash URIs

** Change "and therefore cannot identify a Web document" to "and therefore does not necessarily identify a Web document"

The diagram just before 4.2

** Remove "303 redirect". I hope that was a typo (copy/pasteo).

** Please add the Content-Location: headers to this diagram.

4.2. 303 URIs

** Change "to a different (information) resource which can be represented as a document and can give you the information that you want." to "to a document which has information *about* the thing you asked about."

** Major technical question about the implementation of 303. I know that dbpedia does it the way described, but there are a lot of good reasons to do it by a 303 to a generic URI for the document, which then itself does a conneg to RDF and HTML.

- It is no more round trips than the dbpedia way
- It gives the client a URI to bookmark which is generic. This is important:
  - It allows the user with an RDF-capable client to bookmark the document, and mail it to another user (or another device) which then dereferences it and gets the HTML view. This use of generic resources is important.
- It provides the server with the ability to add representation in new languages in the future.
- It is standard conneg and so probably more supported on servers

Just because the client started with the URI of a thing, it doesn't mean that the document involved is not a first class document on the WWW. Best practices for this document apply. One of these is the use of Generic Resources. (See for example http://www.w3.org/DesignIssues/Generic.html and the new ontology )

4.3 Choosing ...

** I think a whole sentence at least could elaborate that if you use 303 for an ontology, like FOAF, then the network delay can be intolerable for any client looking up a set of terms, even though the client has already loaded everything there is to know.

** The text says: "To address scalability issue with the management of a large set of URIs in case of the 303 solution, the usage of a SPARQL endpoint or comparable services is advised". Why? There is no justification for this. The 303 to an encoded SPARQL endpoint is IMHO clumsy and a proxied normal URI would be better. In future, we may have ways of associating whole URI subtrees with a SPARQL server, but we don't yet.
Suggest remove the sentence or expand and explain it.

** The text says: "Note also, that both 303 and Hash can be combined, allowing to spread a large dataset into multiple parts and have an identifier for a non-document resource. An example for a combination of 303 and Hash is: http://www.example.com/bob#thisBob, the person with a combined URI." This is strange. Where is the 303 in this? This (bob#this) is an important way of generating URIs, and deserves a section (insert new 4.3) of its own, for when databases are exposed for example, or other virtual RDF linked data spaces generated from underlying systems.

4.3 ... Conclusion

** In first para, change "grow much" to "grow out of control" or "grow extremely".

** Change "303 URIs should be used for large sets of data that are, or may grow, beyond the point where it is practical to serve all related resources in a single document." to "URIs of the bob#this form can be used for large sets of data that are, or may grow, beyond the point where it is practical to serve all related resources in a single document.</p><p> 303 URIs may also be used for such data sets, making neater-looking URIs, but with an impact on run-time performance and server load."

** Delete the paragraph "If in doubt, it's better to use the more flexible 303 URI approach.".

4.5 Linking

** After the example box, change "This allows RDF-aware clients to find a human-readable version of the resource" to "This allows RDF-aware clients to find a human-readable resource". (The ?x! foaf:page is not at all guaranteed to be an HTML version of ?x! rdfs:isDefinedBy .)

** "authoritative". In what way is the document authoritative? When an ontology defines a term, then the rdfs:isDefinedBy really means the document gives definitive information from the owner of the term. With alice's company giving data about alice, it is not clear that this is authoritative. I would delete the rdfs:isDefinedBy unless changing the example.
I am not sure though whether the semantics of this are that closely defined.

** Add a paragraph: "The client also can deduce similar link information directly from the HTTP headers: that a thing is described by the document its URI redirects to with a 303; that the content-location resource is a content-specific version of the generic document, and so on. Ontologies for these relations are not discussed here" (Note the AWWSW group is looking at formalizing that more).

** In the para
<<This allows RDF-aware Web clients to discover the RDF information. The approach is recommended in the RDF/XML specification ([RDFXML], section 9). If the information on the Web page differs significantly from the RDF version, then we recommend using rel="meta" instead of rel="alternate".>>
rewrite:
<<This allows RDF-aware Web clients to discover the RDF information. The approach is recommended in the RDF/XML specification ([RDFXML], section 9). If the RDF data is *about* the web page, rather than an expression of the information in it, then we recommend using rel="meta" instead of rel="alternate".>>
(I think this distinction is important, and very much in line with the distinctions made throughout the document)

5. Examples from the web

Last line of section 5: Change "A better URI would be for example http://ontoworld.org/rdf/Karlsruhe ." to "A better URI would be for example http://ontoworld.org/data/Karlsruhe ." This is a cooler URI as it allows conneg to be introduced to allow the same data to be expressed in rdf/xml or n3 or RIF or whatever we think of next.

________________________________
ENDS
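
As a rough illustration of the 2.1 and 4.2 comments above (conneg done directly with a Content-Location header, and a 303 from the URI of a thing to a generic document URI), here is a minimal Python sketch. It is not taken from the draft or from any real server: the paths, file names and function names are hypothetical, the Accept parsing is simplified (no type wildcards such as text/*), and falling back to HTML when nothing matches is an assumption.

```python
# Hypothetical paths and file names, for illustration only.
THING_PATH = "/id/alice"   # URI of the thing (a person)
DOC_PATH = "/doc/alice"    # generic URI of a document about her
VARIANTS = {               # media type -> content-specific URI
    "application/rdf+xml": "/doc/alice.rdf",
    "text/html": "/doc/alice.html",
}
DEFAULT_TYPE = "text/html"  # assumed fallback when nothing matches


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, best first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((parts[0].lower(), q))
    # sorted() is stable, so equal q values keep the client's order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_type(accept):
    """Pick the best available media type for an Accept header."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue
        if media_type in VARIANTS:
            return media_type
        if media_type == "*/*":
            return DEFAULT_TYPE
    return DEFAULT_TYPE


def respond(path, accept="*/*"):
    """Return (status, headers) for a GET; bodies are omitted."""
    if path == THING_PATH:
        # The thing is not a document: point at a document about it.
        return 303, {"Location": DOC_PATH}
    if path == DOC_PATH:
        # Conneg done directly, in the same round trip, with no 302.
        media_type = choose_type(accept)
        return 200, {
            "Content-Type": media_type,
            "Content-Location": VARIANTS[media_type],
        }
    return 404, {}
```

With these assumptions, respond("/id/alice") gives a 303 to /doc/alice, which is the generic URI a client could bookmark or mail around, and respond("/doc/alice", "application/rdf+xml") gives a 200 whose Content-Location names the RDF-specific URI.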
Received on Sunday, 24 February 2008 05:25:50 UTC