comments on CHIPs from Dominique Hazaël-Massieux on 2004-08-23 (www-qa@w3.org from August 2004)

From: Dominique Hazaël-Massieux <dom@w3.org>
Date: Mon, 23 Aug 2004 14:44:24 +0200
To: ot@w3.org
Cc: www-qa@w3.org
Message-Id: <1093265064.4867.139.camel@stratustier>
As alluded to in another thread, here are a few comments on CHIPs [1]:

Globally, in the same vein as Ian's comment on terminology [2],
"reference" should be replaced by "identify" when relating a URI to
its resource.

"1. Understanding URIs" should start by explaining practically what a
URI is; I suggest referring to "Web addresses", and probably saying they
are often called "URLs". Also, the chapter is mostly specific to HTTP
URIs - right in scope with the document :) - and should state this more
clearly.

"A common mistake, responsible for many HTTP implementations problems,
is to think this is equivalent to a filename within a computer system."
Actually, I don't think this is a mistake; a filename within a computer
system is indeed an identifier for the resource (the file); the error
comes from believing that each HTTP URI maps to a file.

"Thanks to our warehouse metaphor, it is obvious that URIs..."
This reads awkwardly, I think; it looks like you're trying to lead
the reader to a deduction they were looking for (as in a mathematical
demonstration), whereas you've in fact simply chosen a metaphor that
conveys what you were trying to demonstrate; I suggest instead:
"With our warehouse metaphor, one can easily understand why URIs..."

"this means that the resource would miss some traffic... Traffic being
the final aim of any content provider ..."
No need to restrict your point to content providers; for instance, for a
Web service (lowercase s, e.g. a Web shop), broken links generate
dissatisfaction for the client, reduce the usability of the system,
etc.

"Guideline 3: Use independent URIs"
"URIs should be both stable and independent. By independent we mean that
a URI should always reference the same resource, regardless of the
context (time, location, user, user-agent, etc.)"

I think "stable" conveys the idea better than "independent", at least when
the latter is not qualified (i.e., "technology-independent" sounds good).

"Standard identification mechanisms for the World Wide Web" (in 3.2)
These are really standard identification mechanisms for the HTTP protocol.

"For the sake of semantics and caching (...)." (in 4.2): there is no
verb in this sentence.
I suggest stressing the caching aspect of the question earlier in the
checkpoint, and also mentioning that the 410 error page is a
perfect place to explain why the page was removed.
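
To illustrate the 410 suggestion, here is a minimal Python sketch of a
handler that answers 410 Gone with a short explanation instead of a bare
404. The REMOVED table, its path and message, and the respond function
are hypothetical names invented for this example; they are not taken
from CHIPs.

```python
from http import HTTPStatus

# Hypothetical table of removed resources and the reason each was removed.
REMOVED = {
    "/2003/old-report": "This report was withdrawn and superseded by /2004/report.",
}


def respond(path):
    """Return (status, headers, body) for a request path."""
    if path in REMOVED:
        # The resource is known to be gone for good: say so, and say why.
        status = HTTPStatus.GONE
        body = "410 Gone\n\n" + REMOVED[path] + "\n"
    else:
        status = HTTPStatus.NOT_FOUND
        body = "404 Not Found\n"
    headers = {"Content-Type": "text/plain; charset=utf-8"}
    return status, headers, body
```

HTTP/1.1 defines a 410 response as cacheable unless indicated otherwise,
which is one reason to bring up the caching aspect early.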

in 5.2, "it gives agents information about the actual (current) location
of the resource currently served (as opposed to the generic location
used to access the resource)."
I think there is a tricky question hidden here: how does
Content-Location relate to the Resource/Representation distinction? It
relates to a comment I made on WebArch, BTW [3].

in 5.3, "the integrity of the transported entity. and" -> there is a dot
where a comma should be
also, "However he md5 sum" -> "the md5 sum"

I'm not sure why Guideline 6 is under part 1 'URIs' rather than part 2
'serving content appropriately'

in 6.3, "This is a harmful lie for caching engines and should be
avoided." 
I suggest mentioning that setting the proper caching information can
help reduce the bandwidth and, in the case of dynamically generated
content, help reduce the CPU time needed to serve the content.
I have a technique for this one, BTW; cf
http://www.mnot.net/cache_docs/#IMP-SCRIPT
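
The technique boils down to having the script emit validators and
freshness information itself. A minimal Python sketch, assuming a
hypothetical render_page function and a known last-modification
timestamp for the underlying data (neither comes from CHIPs or from the
tutorial linked above):

```python
from email.utils import formatdate, parsedate_to_datetime


def render_page():
    """Hypothetical stand-in for the expensive, dynamically generated content."""
    return b"<html><body>generated content</body></html>"


def serve(last_modified_ts, if_modified_since=None, max_age=3600):
    """Return (status, headers, body) for a dynamically generated page.

    last_modified_ts: POSIX timestamp of the last change to the underlying data.
    if_modified_since: value of the request's If-Modified-Since header, if any.
    """
    headers = {
        "Last-Modified": formatdate(last_modified_ts, usegmt=True),
        "Cache-Control": "max-age=%d" % max_age,
    }
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since = None  # unparsable date: fall through to a full response
        if since is not None and int(last_modified_ts) <= since:
            # Nothing changed: skip both the regeneration and the body transfer.
            return 304, headers, b""
    return 200, headers, render_page()
```

A client or cache that replays the Last-Modified value it received in an
If-Modified-Since header gets a 304 with no body, so neither the page
generation nor the transfer is repeated.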

in 7.1, "often called "content-negotiation" erroneously" ; this is
awkward, since that's the term used in the HTTP specification itself; I
don't think it makes sense to use the term "format negotiation"; what
is/was the rationale?

"configure content-type negotiation": I suggest "Media Types
negotiation" 

"thus they are supposed to support every and any content type, which
they certainly do not": actually, what is wrong is not the claim that they
support every content type - which in the end is acceptable since they
can at least download any type of content - but that they
claim to support any content type at the highest level possible.
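
To make the point concrete: in an Accept header, a media range without a
q parameter has the default quality 1. A rough sketch of a parser showing
the difference between a bare */* and a qualified one (the parse_accept
name is an assumption of this example, and the parser ignores specificity
ordering and any parameter other than q):

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q), best first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0]
        if not media_range:
            continue
        q = 1.0  # a missing q parameter means the highest preference
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)
```

With this, "*/*" alone parses to a single entry at quality 1.0, whereas
"text/html, */*;q=0.1" ranks text/html first and leaves everything else
as a low-preference fallback.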

in 7.2, "If the resource is served using language-negotiation (actually,
even if it is not)," I suggest striking the "if" part, then.

GL 10, """
Example of a wrong practice:
CSS style sheets are sometimes served as plain text (text/plain media
type), causing the user-agents to ignore the style sheet and rendering
the document in an unexpected manner.
Example of a proper practice:
CSS style sheets should be served with the text/css media type.
"""
This reads awkwardly, esp. the 1st example, where it's unclear whether the
bad practice is on the server's side or the clients' - of course it's the
server's, but the wording doesn't make that easy to see.
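
As a sketch of the proper practice on the server side, the media type can
be derived from the file name rather than defaulting to text/plain. This
assumes Python's standard mimetypes module; content_type_for is a
hypothetical helper name:

```python
import mimetypes


def content_type_for(path):
    """Guess the Content-Type header value for a file path."""
    media_type, _encoding = mimetypes.guess_type(path)
    # Fall back to a generic binary type rather than mislabeling as text/plain.
    return media_type or "application/octet-stream"


print(content_type_for("style.css"))
```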

Dom

1. http://www.w3.org/TR/2003/NOTE-chips-20030128/
2. http://lists.w3.org/Archives/Public/www-qa/2004Aug/0018.html
3. http://lists.w3.org/Archives/Public/public-webarch-comments/2004JulSep/0034.html
-- 
Dominique Hazaël-Massieux - http://www.w3.org/People/Dom/
W3C/ERCIM
mailto:dom@w3.org
Received on Monday, 23 August 2004 12:44:26 UTC