Some comments on WD-webarch-20020830

> 3.Protocols. A small and nonexclusive set of protocol specifications
> for interchanging information between agents, including HTTP

In what way is the protocol set "small"? I think it's better to
avoid such relative qualifiers. From where I sit, I have the
impression that there are a lot of protocols sitting on top of IP.

More importantly: I don't really like the split into 3 topics presented
here: identifiers, formats and protocols. I think there are only 2
things: identifiers and formats, with some formats intended for
presentation to humans and some not (they are always intended for
agents/machines, so no need to mention it).

OK, it's still 3 :-) but with a different split semantics. In fact,
what I'm saying is that I don't see any difference between the two
definitions provided here:
 | 2. Formats ...  for interchange between agents in the system.
 | 3. Protocols... for interchanging information between agents

I think the split presented in the document is only a reflection of
what we're seeing today. The fact that HTTP does not use XML
syntax for expressing its requests/responses is a historical
artefact, and not architecturally relevant, I think.

> 1.2. Audience of this document

This section is missing a paragraph on prerequisites, i.e. what the
audience should know about Web technologies before reading webarch.

Related to that: I don't think it's consistent to mix tutorial-like
sections (like the URI page in the intro to 2) and statements like
"sharing the same fragment identifier semantics". People who
understand what this means obviously do not need an explanation by
example of what a URI is.

> This document does not address architectural design goals covered by
> targeted W3C specifications:
> 
>   1. Internationalization; see W3C's Internationalization Activity.
>   2. Accessibility; see W3C's Web Accessibility Initiative.
>   3. Device independence; see W3C's Device Independence Activity.
 
I hope the Formats section is at least going to refer to I18N, WAI, etc.
I suggest wording like:
 | This document does not address +but refers+ to architectural design
 | goals covered by targeted W3C specifications:

> 1.4. Summary of principles
> 
> In the design of the Web, some design decisions, like the names the <p> and
> <li> elements in HTML, or the choice of the colon character in URIs, are
> somewhat arbitrary; if <par>, <elt>, or * had been chosen instead, the
> large-scale result would, most likely, have been the same. Other design
> choices are more critical; these are the architectural principles of the
> Web:

I suggest mentioning that all those principles (in the summary) are
only for Identifiers for now, maybe by providing empty sections for the
other two.

> 2. Absolute URI references are unambiguous:
>      Each absolute URI reference unambiguously identifies one resource.
> 3. Describe resources:
>      Owners of important resources (for example, Internet protocol
>      parameters) SHOULD make available representations that describe the
>      nature and purpose of those resources.

Would be good to have term links for things like representation,
absolute URI, etc.

>    * A Uniform Resource Identifier, or URI, is a character sequence starting
>      with a scheme name, followed by a number of scheme-specific fields.
>    * An absolute URI reference is a URI followed optionally by a fragment
>      identifier.

Is an absolute URI the opposite of a relative URI?
I think providing examples of what is not OK to do would help a lot.
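
To make the question concrete, here is a minimal Python sketch of the
distinction I have in mind (standard library only; the URIs and the
helper name is_absolute are made up for illustration, and "starts with
a scheme name" is my reading of the definition quoted above):

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(reference: str) -> bool:
    """Treat a reference as absolute when it starts with a scheme name."""
    return bool(urlsplit(reference).scheme)


# Hypothetical base URI of the document containing the references.
base = "http://www.example.com/a/b/c.html"

# Absolute URI reference: a URI optionally followed by a fragment identifier.
print(is_absolute("http://www.example.com/a/b/c.html#top"))

# Relative reference: no scheme, so it identifies nothing on its own.
print(is_absolute("../d.html#top"))

# It only becomes usable once resolved against the base URI.
print(urljoin(base, "../d.html#top"))
```

A counter-example of this kind (a reference with no scheme, used where
an absolute one is required) is what I would like to see in the text.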

> are serendipitous (e.g., global search services). See the TAG finding URIs,

I'm sure 90% of your non-native English-speaking audience will need to
look in a dictionary to find out what serendipitous means. Use a
simpler word.

> Issue: URIEquivalence-15: When are two URI variants considered equivalent?

I suggest moving all the issues into the issue list, or listing them
all here; a mix is bad (it sounds as if there are only 3 issues left).

> 2.2.2. Interactions with resources

> by metadata, usually based on [RFC2046].

Mention MIME by name, for clarity.

> and entirely governs the handling of fragment identifiers.

Entirely? Are you sure there are no other parameters/methods that are
used in handling a URI with a fragment (like the language or the
charset)?

> Representation retrieval is safe: Agents do not incur obligations by
> retrieving a representation.

This could use more detail on the meaning of "safe" and "obligation"
in this context.
 
> Editor's note: Need to say something about difference between assertions

Should be in issue list.

> Editor's note: Need to clarify what "equivalent" means in the previous
> sentence.

Suggest that the clarification be based on "same understanding",
e.g. in XAG.
 
> 2.3. Persistence

> Support persistence: Those who create and manage resources and their
> identifiers SHOULD design the identifiers in such a way as to ensure their
> persistence.

Is it OK to define a limited persistence?

Asked differently: should the Web be better than Life itself?
(OK, it's Web philosophy, not Web arch :)

Not everything lasts forever in real life; for instance, painting
exhibits run for a while and then disappear.

What's the difference, for a hard-core Dali fan, between "this web site
presents various paintings all year long (i.e. is persistent) and Dali
is shown only on May 17th" and "this page presents Dali's paintings all
the time and will only be up for one day (not persistent), on May 17th"?

> For more discussion about persistence, refer to [Cool].2

Mention what Cool is (TBL paper) inline.
 
> 2.4. URI Schemes
> Correct processing of URIs is often scheme-dependent, 

Besides basic string manipulation (e.g. sorting in a history list),
when is processing of a URI not scheme-dependent? "often" sounds a bit
weak.
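
As an illustration of why I think scheme knowledge is needed almost
everywhere, here is a small Python sketch of comparing two http URIs.
The helper names (http_key, same_http_resource) and the URIs are
hypothetical; the rules encoded (case-insensitive host, port 80
implied, empty path equal to "/") are the ones the HTTP specification
gives for http URI comparison, and only that scheme is modelled:

```python
from urllib.parse import urlsplit

# Assumption: only the http default port is modelled in this sketch.
DEFAULT_PORTS = {"http": 80}


def http_key(uri: str):
    """Build a comparison key using http-specific rules."""
    parts = urlsplit(uri)
    # An absent port means the scheme's default port.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    # urlsplit lowercases the scheme and hostname; an empty path means "/".
    return (parts.scheme, parts.hostname, port, parts.path or "/", parts.query)


def same_http_resource(a: str, b: str) -> bool:
    """Compare two http URIs by their scheme-aware keys."""
    return http_key(a) == http_key(b)


a = "http://WWW.Example.com:80/lj45sr"
b = "http://www.example.com/lj45sr"

print(a == b)                    # scheme-blind string comparison
print(same_http_resource(a, b))  # comparison that knows the http rules
```

The plain string comparison and the http-aware one disagree, which is
the kind of case I mean; a generic processor cannot know these rules
for a scheme it has never heard of.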

> that URI in any Internet protocols; there aren't any valid uses of it. You

"aren't any" -> are no

> Do not use unregistered URI schemes: Unregistered URI schemes MUST NOT be
> used on the public Internet.
> 
> The IANA registry [IANASchemes] lists URI schemes and the specifications
> that define them. For instance, the HTTP URI scheme is defined in section
> 3.2.2 of the HTTP specification [RFC2616]. 

As written, it sounds like it's OK for a spec to define its URI
scheme locally and be done with it. Replacing "For instance" with
"In addition" would be better.

> with a fragment identifier. The fragment identifier is interpreted only
> after the retrieval of a representation. Section 4.1 of [RFC2396] states

If the tutorial flavor is kept, I suggest adding an example for
this part, explaining how fragments are used on the client side only,
and not on the server (if that is really the case).
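
Something along these lines is what I mean, as a rough Python sketch
for the HTTP case only (the function name split_for_retrieval and the
URI are made up; other schemes may behave differently). The client
strips the fragment, sends only the rest in the request, and applies
the fragment to the representation it gets back:

```python
from urllib.parse import urldefrag, urlsplit


def split_for_retrieval(reference: str):
    """Return (request_target, fragment) as an HTTP client would use them."""
    uri, fragment = urldefrag(reference)
    parts = urlsplit(uri)
    # The request target carries the path and query, never the fragment.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target, fragment


target, fragment = split_for_retrieval(
    "http://www.example.com/doc.html?v=1#section2"
)
print("GET", target)            # what is sent to the server
print("then locate", fragment)  # handled locally, after retrieval
```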

> 2.6. Some generalities about absolute URI references

At some point in this section, a diagram would help a lot:
something that illustrates the various possible mappings:

URI1                           Representation1

URI2     =>  One resource  =>  Representation2

URIn                           RepresentationN(time)

>   2. It is not possible to inspect an absolute URI reference and determine
>      what resource it identifies. For example, in general, one cannot look
>      at http://www.example.com/lj45sr and know that it refers to "my old
>      car" or "the weather forecast for Oaxaca."

Is that a definitive statement or just the current state of the Web?

Suppose someone comes up with an ontology for the top-level "dir" of
http URIs, like /People, /Talks, /TR, etc., and this is published and
shared by a community, much like a Semantic Web. Would that be bad?
 
>   6. It is not possible to inspect an absolute URI reference and know the
>      media type of representation(s) of that resource. For example, do not
>      assume that an absolute URI reference that ends with the string ".html"
>      refers to a resource that has an HTML representation. Of course,
>      resource owners should not publish absolute URI references likely to
>      cause confusion.

Should there be a statement to the effect of not using .htm or .png
in http URIs (since it is useless)?
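
To show why I call the suffix useless, a short Python sketch: the
suffix only supports a guess, while the media type that counts is the
one the server declares. The URI and the Content-Type value below are
invented for the example, and media_type_from_header is a hypothetical
helper, not part of any spec:

```python
import mimetypes
from email.message import Message


def media_type_from_header(content_type: str) -> str:
    """Extract the media type from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_type()


uri = "http://www.example.com/report.html"

# What the ".html" suffix suggests (only a guess from a lookup table).
guessed, _encoding = mimetypes.guess_type(uri)

# What a server might actually declare for that same URI (made-up value).
declared = media_type_from_header("text/plain; charset=utf-8")

print("guessed from suffix:", guessed)
print("declared by server: ", declared)
```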

Received on Monday, 30 September 2002 04:47:59 UTC