Comments on Web Arch WD - 2004-07-05 from Karl Dubost on 2004-09-02 (public-webarch-comments@w3.org from September 2004)

From: Karl Dubost <karl@w3.org>
Date: Thu, 2 Sep 2004 19:20:50 -0400
To: public-webarch-comments@w3.org
Message-Id: <BA793FF3-FD36-11D8-9ACF-000A95718F82@w3.org>
Disclaimer: This is not an official review from the QA WG, just a
simple review to be taken into account as comments.

This review is based on:
Architecture of the World Wide Web, First Edition
W3C Working Draft 5 July 2004
http://www.w3.org/TR/2004/WD-webarch-20040705/

General comments:
Promoting: You should send/promote this document to every computing
school in the world and make it almost required reading for all new
participants to W3C. Working with the communication teams would be
good. A book version of it would be more than welcome.

At the start of the document, you begin with a story which seems
interesting. I'm not sure it's feasible, but we rapidly lose track of
it in your document. All the content is there, but with a kind of
discontinuity; it is not articulated enough. If you want to make your
story really enjoyable and educational, it has to run all along the
document, like a kind of journey through the architecture of the Web.

Editorial:

* In the paragraph defining principles, you have put the commas inside
the double quotes. I think you should put them outside.
	 "self-descriptive syntax," "visible semantics,"
* It is not well known and not often respected, but quoting in English
is done with “text” and not "text".
* You might want to give references to these two “laws”: Metcalfe's Law
and Amdahl's Law.





Issues:

* KD 001 (picky detail)
The graphic identified by the URI [1] gives an XML file whose type is
not determined. Is it XHTML 1.0, XHTML 1.1, etc.? There's no DTD. It's
delivered as application/xhtml+xml, but nothing says what kind of XHTML
file it is. I know what an XHTML 1.0 or an XHTML 1.1 document is.
It could be an XHTML Family document, but it's unclear whether it
respects the conformance criteria [2].

[1]	http://www.w3.org/TR/2004/WD-webarch-20040705/uri-res-rep.png
[2]	http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410/conformance.html#s_conform_document

* KD 002
Global identification. This is a very important point which is a lot
wider than identification in terms of URIs: it relies on the social
benefits of shared decisions (mostly by consensus). The Web can work
because there's one XHTML, not because there are two competing
solutions. Trying always to bring the competing solutions into one
single forum is better than having to fight outside over
implementations, taking the users hostage.

* KD 003
"""When a URI alias does become common currency, the URI owner should  
use protocol techniques such as server-side redirects to relate the two  
resources. The community benefits when the URI owner supports both the  
"official" URI and the alias."""
While I agree completely with that, it's unfortunately a waterfall
issue. Most server software that manages URIs doesn't give an easy way
to share and manage URIs. That's why CHIPs has been written. Maybe you
should give a link to that W3C Team Note to help developers implement
the management of URIs correctly and in a usable way.
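
To illustrate the kind of server-side redirect the quoted text asks
for, here is a minimal sketch in Python. The alias table, the paths and
the function name are all hypothetical; a real server would more likely
express this in its own configuration.

```python
# Hypothetical alias table: alias path -> "official" path.
ALIASES = {
    "/webarch": "/TR/webarch/",
    "/arch": "/TR/webarch/",
}

def resolve(path):
    """Return (status, headers) for a request path.

    An alias gets a 301 redirect to the official URI, so both URIs
    stay usable and clients can learn which one is official.
    Any other path is answered directly (200, no extra headers).
    """
    target = ALIASES.get(path)
    if target is not None:
        return 301, {"Location": target}
    return 200, {}
```

The point of keeping the mapping in one table is that the URI owner can
share and review it, which is the usability problem mentioned above.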

* KD 004
2.4 URI Overloading
Your example is not necessarily clear to everyone, in the sense that
you could have a page describing the movie and, in the same Web page, a
forum which talks about the movie. That's the case, for example, on the
PHP documentation site, which describes the features of the language
with a forum in it. I understand that a URI can be used in another
context to identify things, but it's not obvious for someone who's
reading your document and has always thought about a URI as something
that gives you a Web page. Maybe you have to refine the example or make
the context clearer.

* KD 005
2.5 URI ownership
"""One consequence of this approach is the Web's heavy reliance on the  
central DNS registry."""
That's short for something which is one of the major issues of the Web.
The whole Web relies on something which depends on a notion of rented
property.
	- You own a domain name only for a period of time.
	- You don't own a domain name forever.
	- A domain name has a cost which makes it inaccessible to many
people in the world.
	====> Consequences: URIs are not free!!!! and so not all people can
use them and guarantee the ownership.
In fact, there's no such thing as URI ownership, but rather "URI
renting" or "URI tenant" for URIs based on domain names.

* KD 006
2.8. Fragment Identifiers
A URI + a fragment identifier is a URI-reference by definition. Using
one generic term for two meanings might lead to misunderstanding,
exactly as in the case of "using two different URIs for the same
representation".
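
As a small illustration of the two parts being distinguished here, the
sketch below splits a reference into the URI before the "#" and the
fragment identifier, using urldefrag from the Python standard library.
The example.org address and the function name are assumptions made for
the example.

```python
from urllib.parse import urldefrag

def split_reference(reference):
    """Split a reference into (uri_without_fragment, fragment)."""
    result = urldefrag(reference)
    return result.url, result.fragment

# Hypothetical reference used only for illustration.
uri, fragment = split_reference("http://example.org/doc#section-2")
```

Naming the two results separately, as the code has to, is the same
distinction the document's terminology should make.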

* KD 007
2.9.2. Assertion that Two URIs Identify the Same Resource
What does it imply? What does it mean? What are the benefits? In which
usage scenario? These are questions I want answered when I'm reading
it, or at least I want a pointer to a resource that explains them with
the same level of clarity that this document usually has.

* KD 008
3. Interaction
"""Communication between agents over a network about resources involves  
URIs, messages, and data."""
This sentence, and in fact the paragraph, is a bit obscure. Maybe
something along the lines of:
	When two agents (pieces of software) communicate
	about a resource, they exchange a message which
	includes data and which is identified by a URI.
Metadata are data. It's just a kind of data, and data are always
metadata of something else. I'm not sure that making explicit the three
levels (data <- metadata <- metadata) you have given is useful; if you
do, explain them in a more detailed context.

* KD 009
3.2.1. Details of Retrieving a Representation
I have the feeling that locating the resource has been forgotten:
before doing an HTTP GET, you have to resolve the domain name.
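
A minimal sketch of where that step sits, in Python. The `resolver`
argument is a hypothetical stand-in for DNS (a real client would use
something like socket.getaddrinfo), and the URI and address in the
usage note are made up for the example.

```python
from urllib.parse import urlsplit

def retrieval_steps(uri, resolver):
    """List the steps needed before and during a GET on `uri`.

    `resolver` is a hypothetical callable mapping a host name to an
    address; it stands in for the DNS lookup.
    """
    parts = urlsplit(uri)
    host = parts.hostname
    port = parts.port or 80        # default HTTP port
    address = resolver(host)       # name resolution comes first
    path = parts.path or "/"
    return [
        f"resolve {host} -> {address}",
        f"connect to {address}:{port}",
        f"GET {path} HTTP/1.1 (Host: {host})",
    ]
```

Calling `retrieval_steps("http://example.org/oaxaca", lambda h:
"192.0.2.1")` lists the resolution step before the connection and the
GET, which is the ordering the section should mention.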

* KD 010
"""Note also that the choice and expressive power of a format can  
affect how precisely a representation provider communicates resource  
state. The use of natural language to communicate information may lead  
to ambiguity about what the associated resource is, which in turn can  
lead to URI overloading."""

What do you mean? I can't figure it out. Are you talking about the
choice of format when you display information, for example using an
image to represent text content? Or are you talking about communicating
URIs on printed media?

* KD 011
3.4. Inconsistencies between Representation Data and Metadata
"""On the other hand, there is no inconsistency in serving
HTML content with the media type "text/plain", for example,
as this combination is licensed by specification"""
You could even say that it could be done on purpose, for example when
someone wants to display the source of an HTML file. User agents must
not contradict the media type which has been sent.

* KD 012
3.6.1. URI Persistence
"""a URI should continue indefinitely to refer to that resource."""
That is not possible, because domain names are not defined and owned
for life. There are many social issues which are definitely harmful for
this part of the World Wide Web Architecture. Asking for URI
persistence without solving the domain name issue is like asking people
to go to university when they can't afford the price of it. See my
issue KD 005.
	Another problem with this motto: the "URI owner" can be a legal
entity or a person.
	If a legal entity (organization, company, etc.), what happens when
the legal entity disappears? What are the URIs which rely on domain
names supposed to become?
	If a person, and this person dies (natural death or not), what are
the URIs supposed to become?

"Indefinitely" is just impossible. It's a completely false assertion,
unless the system is organized differently.

* KD 013
4.2.3. Extensibility
""" Good practice:  Extensibility mechanisms
  A specification SHOULD provide mechanisms that allow any party to  
create extensions that do not interfere with conformance to the  
original specification."""

This Good Practice is too general. Extensibility MUST NOT be a SHOULD.
Extensibility is a very delicate topic which has to be considered
carefully by a group designing a format. It CAN be absolutely wise to
forbid extensions.
Choosing extensibility leads to benefits and drawbacks. On this topic,
see
	http://www.w3.org/TR/qaframe-spec/
	http://www.w3.org/TR/spec-variability/
	
	1. If extensions are considered beneficial, the specification MUST
provide a mechanism to do so.
	2. If such a mechanism is given, it MUST NOT interfere with the
conformance of the section

I would add a link from this section to the QA Framework Specification  
Guidelines and to the Variability in Specifications document.
	http://www.w3.org/TR/qaframe-spec/
	http://www.w3.org/TR/spec-variability/

* KD 014
4.5.7. Media Types for XML
""" In general, a representation provider SHOULD NOT assign Internet  
media types beginning with "text/" to XML representations."""
Hmmmm.... I'm not sure. I understand the reasoning, but for example if
you want to display the source code of an XHTML file with text/plain,
it's perfectly valid and a useful case.

* KD 015
4.5.7. Media Types for XML
"""Second, representations whose Internet media types begin with  
"text/" are required, unless the charset parameter is specified, to be  
considered to be encoded in US-ASCII."""
Is it defined somewhere? I ask because most non-English speakers will
have other kinds of encodings in their text-only files.
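
To make the consequence of the quoted rule concrete, here is a minimal
sketch in Python. It assumes the rule exactly as quoted ("text/"
without a charset parameter means US-ASCII); the function name is
hypothetical, and the parsing relies on the standard email.message
module.

```python
from email.message import Message

def effective_charset(content_type):
    """Charset implied by a Content-Type value under the quoted rule.

    Assumption: "text/*" without a charset parameter is treated as
    US-ASCII. For other types without a charset parameter this sketch
    returns None, since the quoted rule says nothing about them.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()   # None if no charset parameter
    if charset is not None:
        return charset
    if msg.get_content_maintype() == "text":
        return "us-ascii"
    return None
```

Under this reading, a text-only file saved in any other encoding and
served as text/plain with no charset parameter would be misread, which
is the case I am worried about.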

* KD 016
5.1. Orthogonal Specifications
"""the software developer community would benefit from being able to  
find all HTTP headers from the HTTP specification (including any  
associated extension registries and specification updates per IETF  
process). Perhaps as a result, this feature of the HTML specification  
is not widely deployed. """

Not true. Use case: I'm a technical writer explaining how to create an
HTML file, foo.html. I give a link to the HTML representation of
foo.html, which is therefore served as text/html. Now I want to explain
the source code, and I would like to use the object element to display
the source code of the same file. So I set the text/plain MIME type in
my object element.
However, because of the precedence rules of HTTP over HTML, the only
way to do it is to not specify the MIME type on the server side but
only in the meta of the HTML file, so that the file can be displayed
either as an HTML file or as a text file.

In fact, right now the only way to do it without problems is to
create
	foo.html and foo.html.txt
which is quite stupid.
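
The precedence problem can be sketched in a few lines of Python. This
is only an illustration under the assumption stated in the comment; the
function and parameter names are hypothetical.

```python
def effective_media_type(http_content_type, html_type_hint):
    """Pick the media type a user agent would act on.

    Assumption: the server-supplied HTTP Content-Type, when present,
    takes precedence over the type given in the HTML markup.
    """
    if http_content_type:
        return http_content_type
    return html_type_hint
```

With `effective_media_type("text/html", "text/plain")` the hint from
the object element is ignored, so foo.html cannot be shown as source;
only when the server sends no type does the text/plain hint apply.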

-- 
Karl Dubost - http://www.w3.org/People/karl/
W3C Conformance Manager
*** Be Strict To Be Cool ***
Received on Thursday, 2 September 2004 23:20:51 UTC