Review of "Cool URIs for the Semantic Web"

Leo et al.,

Firstly, although there are a number of both substantive and editorial
comments below on your document [1], it is an excellent document. The
clear diagrams and the example section toward the end of the document
are particularly useful.

I believe that the document is broadly in line with the intention of the
TAG's resolution on httpRange-14 [2], though the mechanisms described,
which combine content negotiation and redirection, do go beyond the
advice given by the TAG resolution.

In one place, the document does state:

	 "For the redirect to the RDF, 302 Found, 303 See other and 
	307 Temporary Redirect are all fine;...". 

This is *not* a statement endorsed by the TAG. The TAG has started to
discuss the significance of other redirect codes, but so far has only
advised the use of 303 See Other.

You may also be aware that the TAG has its own draft finding on this
topic [3]. We hope that our work will complement your own in terms of
giving clear practical guidance arising from [2].

More detailed review comments below.


Stuart Williams
--
[1] http://www.dfki.uni-kl.de/~sauermann/2006/11/cooluris/
[2] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
[3] http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14
--
Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks
RG12 1HN Registered No: 690597 England
--

Review of HTML version of 
	"Cool URIs for the Semantic Web"
	http://www.dfki.uni-kl.de/~sauermann/2006/11/cooluris/

Stuart Williams for W3C TAG
June 2007

Agreement
=========
I think that the TAG would agree with the general advice given in the two
boxed statements in the document:

	"Be on the web"; 

	"Don't be ambiguous"

wrt "Be on the web": "Given only a URI, machines and people should be
able to retrieve a description about this URI from the web. ..."  This
is a little too loose, in that the description is not about the URI but
about the resource to which the URI refers. Also, the retrieval result
is not always a *description* of the referent. It can be a
webarch:Representation of the referent's current state. However, the
general principle applies, that you would expect to retrieve something
of relevance pertaining to the resource by a retrieval operation using
its URI.

Substantive:
============

2 URIs for Web Documents

4th para last sentence: "... a URL simply identifies whatever we see
when we type it into a browser."

Sadly not! On a 200 response the URL (Content-Location: <URL>)
'identifies' the webarch:Resource that provided the
webarch:Representation that is rendered by the browser. In general,
neither the rendering nor the webarch:Representation from which it was
rendered has an 'identifying' URL.

--

2.1 HTTP and Content Negotiation

4th para: "The server could answer... followed by the content of the
HTML document in English".

It is not clear that the document is intrinsically HTML; rather, what
is returned is an HTML rendering of the document content in English (as
opposed to a PDF rendering of the content in French). I suggest you be
more precise and say that what follows is "...an HTML rendering of the
document content in English", or words to that effect.

Could also reference the TAG finding on generic resources:
http://www.w3.org/2001/tag/doc/alternatives-discovery.html
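
To make the resource/rendering distinction concrete, here is a rough
Python sketch of my own (it is not taken from your document or from the
TAG finding). It models one generic resource with two renderings. The
variant table, the file names and the choose_variant helper are all
hypothetical, and q-values are simply ignored for brevity.

```python
# Hypothetical renderings of ONE resource (the document content).
# Each entry is a variant of that resource, not a separate document.
VARIANTS = [
    {"type": "text/html", "lang": "en", "location": "doc.en.html"},
    {"type": "application/pdf", "lang": "fr", "location": "doc.fr.pdf"},
]


def parse_header(value):
    """Split a comma-separated header into bare tokens, dropping parameters such as q-values."""
    return [item.split(";")[0].strip().lower() for item in value.split(",") if item.strip()]


def choose_variant(accept, accept_language):
    """Return the first variant acceptable on both media type and language, else None."""
    types = parse_header(accept)
    langs = parse_header(accept_language)
    for variant in VARIANTS:
        type_ok = variant["type"] in types or "*/*" in types
        lang_ok = variant["lang"] in langs or "*" in langs
        if type_ok and lang_ok:
            return variant
    return None
```

A request with "Accept: text/html" and "Accept-Language: en" selects the
English HTML rendering; the resource being asked for is the same in
every case.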

--

3 URIs for Real-World Objects

1st para: "On the semantic web, URIs identify not just web documents,
but also..." 

This statement is true of the traditional web as well. As written,
it is presented as being true of just the semantic web.

--

~8th para: 2nd sentence: "If we can't use document URLs as resource
identifiers...". 

Documents are resources and their URLs are resource identifiers for
those documents. 

--

4.1 303 URIs

1st para: "....to distinguish between non-document resources from
regular web documents."

We sort of trip up on the word "document" here because there are things
which are non-document resources that *are* webarch:InformationResources,
e.g. a web-controlled robot arm is not a document, however it may be
controlled through the exchange of webarch:Representations.

--

4.2 Hash URIs

~6th para: "...(otherwise a client could conclude that the hash URI
*represents* a part of the HTML document)." Replace "represents" with
"refers to".

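As a small aside on why "refers to" is the safer verb: the fragment of a
hash URI never reaches the server, so what is retrieved is a
representation of the resource named by the base URI, and the full hash
URI is left to refer to something else. The sketch below is my own
illustration using Python's standard urllib.parse.urldefrag; the
example.org URI and the split_hash_uri helper are hypothetical.

```python
from urllib.parse import urldefrag


def split_hash_uri(uri):
    """Return (uri_sent_to_server, fragment_kept_by_client)."""
    base, fragment = urldefrag(uri)
    return base, fragment


# Only `base` is dereferenced over HTTP; `fragment` stays with the client.
base, fragment = split_hash_uri("http://www.example.org/about#alice")
```
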
--

4.2 Hash URIs

~6th para: "For redirect to RDF,  302 Found, 303 See Other and 307
Temporary Redirect are fine..."

The TAG does not endorse this statement. The TAG's advice is to use
303 redirects. It is evident from the record of our meeting in May/June
2007 [a] that we have had some discussion of the significance of other
3xx redirections. However, at present the statement above is not a
position that the TAG endorses.
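
For concreteness, the 303 behaviour that the TAG has advised can be
sketched roughly as follows. This is my own minimal Python illustration,
not text from the resolution: the paths /id/alice and /doc/alice, the
DESCRIBED_BY table and the respond helper are hypothetical, and only the
status code and headers are modelled, not a real server.

```python
from http import HTTPStatus

# Hypothetical mapping from URIs of non-document things to the URIs of
# documents that describe them.
DESCRIBED_BY = {
    "/id/alice": "/doc/alice",
}


def respond(path):
    """Return (status, headers) for a GET on `path`."""
    if path in DESCRIBED_BY:
        # The thing itself cannot be sent over the wire: answer 303 See Other
        # and point at a description of it.
        return HTTPStatus.SEE_OTHER, {"Location": DESCRIBED_BY[path]}
    if path in DESCRIBED_BY.values():
        # The describing document answers 200 with a representation.
        return HTTPStatus.OK, {"Content-Type": "application/rdf+xml"}
    return HTTPStatus.NOT_FOUND, {}
```

Swapping SEE_OTHER for 302 or 307 here is exactly the variation that the
TAG has not, so far, taken a position on.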

[a] http://www.w3.org/2001/tag/2007/05/30-minutes#item04

--

5 Examples from the Web

Semantic MediaWiki

Whilst one cannot argue with whether or not this is how the Semantic
MediaWiki provides URIs for its RDF descriptions, URIs of the form:
http://ontoworld.org/index.php/Special:ExportRDF/Karlsruhe?xmlmime=rdf
look terrible: they expose underlying technology (PHP), and the query
hack seems to relate to media type. Why on earth not something as
simple as:

	http://ontoworld.org/wiki/<topic> -> http://ontoworld.org/rdf/<topic>

or follow the Soton pattern, which is really well thought out.

Personally I'd drop the Semantic MediaWiki example from the document
until they generate cleaner, less obtuse URIs, particularly in the light
of "There are efforts underway..."
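
The simpler /wiki/<topic> -> /rdf/<topic> pattern suggested above is
purely mechanical, as this rough Python sketch of my own shows. Note
that the /wiki/ and /rdf/ paths are my suggestion only, not URIs that
ontoworld.org is known to serve, and rdf_uri_for is a hypothetical
helper.

```python
from urllib.parse import urlsplit, urlunsplit


def rdf_uri_for(wiki_uri):
    """Map http://host/wiki/<topic> to http://host/rdf/<topic> (suggested pattern only)."""
    parts = urlsplit(wiki_uri)
    prefix = "/wiki/"
    if not parts.path.startswith(prefix):
        raise ValueError(f"not a wiki page URI: {wiki_uri}")
    topic = parts.path[len(prefix):]
    # Drop any query string or fragment: the media type is carried by the
    # path, not by a query parameter.
    return urlunsplit((parts.scheme, parts.netloc, "/rdf/" + topic, "", ""))
```

For example, rdf_uri_for("http://ontoworld.org/wiki/Karlsruhe") gives
"http://ontoworld.org/rdf/Karlsruhe".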

--

Editorial:
=========
1. Introduction: 

2nd para:
You state that information has to be expressed as statements about
resources and then proceed to give a number of examples, all of which are
phrased as questions rather than statements, e.g. "who are the members of
a company". I think these would be better turned around as statements of
fact, e.g. "Leo Sauermann is affiliated to DFKI".

--

3rd para: 2nd sentence: "Confusingly, URIs and URLs share the same
syntax..." Personally I don't find that confusing at all, and I am
in fact more confused by that remark itself. I suggest either dropping
the sentence entirely, or simply remarking that URIs and URLs share the
same generic syntax: all URLs are URIs, but not all URIs are
URLs.

--

4th para: http://www.acme.com is a real website. Tradition has been to
avoid the use of deployed URIs/DNS names in specs, tutorials and the
like, unless they are directly entailed in the spec/tutorial in some way.
Typically, although perhaps boringly, URIs based on "example.org" or
"example.com" are used. Those DNS names are reserved for the purpose of
providing examples; if pressed I suspect I could find chapter and
verse on what they are reserved for. Anyway, you probably shouldn't use
acme.com as the domain in your examples.
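
If it helps when revising the examples, a check along these lines would
flag any example URI whose host is not one of the reserved example
domains. This is a rough Python sketch of my own; the list reflects the
second-level names reserved by RFC 2606, and uses_reserved_domain is a
hypothetical helper.

```python
from urllib.parse import urlsplit

# Second-level domain names reserved for examples by RFC 2606.
RESERVED_DOMAINS = ("example.com", "example.net", "example.org")


def uses_reserved_domain(uri):
    """Return True if the URI's host is a reserved example domain or a subdomain of one."""
    host = (urlsplit(uri).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in RESERVED_DOMAINS)
```

With this, "http://www.acme.com/data" is flagged (the function returns
False) while "http://www.example.org/data" passes.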

--

2: URIs for Web Documents

(Picky comment) 3rd para: "Like everything on the traditional web, these
are <i>web documents</i>." This follows a pairwise listing of URIs and
the things (homepages) they are said to designate (or identify or
refer to...). In the quoted sentence the word "these" could bind
to either a URI or a homepage or a pair. I suggest you state more
clearly what it is that are "web documents", e.g.

"Like everything on the traditional web, each of the homepages mentioned
above is a <i>web document</i>."

--

I have a mild preference that you avoid the term "Web Document". My main
aversion to the word "document" is that it sometimes has a
webarch:Resource sense and sometimes a webarch:Representation sense, and
as a consequence it is not always clear which sense is in use; indeed
both may be intended in different parts of the same sentence. OTOH, for
an educational document perhaps it is ok to be a little looser, but given
the topic area I am not so sure that we can dispense with the precision.


Another rationale is that the webarch:Representations returned by some
webarch:Resources in response to an HTTP GET operation may be
programmatically generated. They are more like a UI to a program than
something that would be thought of as a document, and that is on the
traditional web.

Basically, I'm suggesting that you avoid words/terms that could be
interpreted as referring to either a webarch:Resource or a
webarch:Representation. Unfortunately, both "document" and "webpage"
suffer from this kind of problem.

--

2.1 HTTP and Content Negotiation

1st para: 1st sentence: Replace "Today's web clients and servers use the
HTTP protocol to request web documents..." with "... to request
(webarch:)Representations of web documents..."

2nd para: 1st sentence: Replace "When a user agent (e.g. a browser)
requests a URL..." with "When a user agent (e.g. a browser) makes an HTTP
request..." - the browser is not requesting a URL, it has that already.

--

4.1 303 URIs

1st para: "This practice has been embraced by the W3C Technical
Architecture Group in it's <i>httpRange-14 ruling</i>."

What was published was a statement that the TAG resolved to provide some
specific advice to the community. Please refer to this as the TAG's
<i>httpRange-14 resolution</i>.

--

4.2 Hash URIs

"data" in "http://www.acme.com/data" doesn't really appeal to the
intuition that the example URIs given are identifiers for real-world
things (a corporation and two particular people) as opposed to data
about them.

--

4.3 Choosing between 303 and Hash

"The hash URIs have the advantage of reducing the number of necessary
HTTP requests." Suggest replacing "HTTP requests" with "HTTP
round-trips, which in turn reduces access latency".

--

6.1 New URI schemes

1st para: "HTTP URIs were originally designed to identify web
documents..." 

Speaking well after the fact about the original intent of the design of
some artifact (HTTP URIs) generally creates a hostage to fortune if not
substantiated by a contemporary record. The statement may indeed be
true, but I would not be so bold as to make it without a means to
reference an expression of that original intent.

Received on Monday, 18 June 2007 16:10:09 UTC