Re: Architecture document comments from Ian B. Jacobs on 2003-04-23 (www-tag@w3.org from April 2003)

From: Ian B. Jacobs <ij@w3.org>
Date: 23 Apr 2003 18:55:47 -0400
To: Graham Klyne <GK@ninebynine.org>
Cc: www-tag@w3.org
Message-Id: <1051138547.1832.58.camel@seabright>

On Tue, 2003-04-08 at 10:45, Graham Klyne wrote:
> Reviewing:
>    http://www.w3.org/TR/2003/WD-webarch-20030326/
> 
> I see this taking some substantial shape and content, and look forward to 
> seeing it become a respected point of reference.

Thank you, Graham. We appreciate the comments; they help us 
move forward.

> Here are a few small points I noticed...
> 
> ...
> 
> Section 3, para 1:
> 
> I felt the reference to "semantic rules of URIs" could be confusing.  Maybe 
> here it would be sufficient to say simply: "... agreement to follow the 
> rules of URIs ..."
> 

Yes, that seems reasonable and simpler.

> ...
> 
> Section 3, para 2:
> 
> I think it's great that this document attempts a description of what 
> constitutes a "resource".  But I feel the phrase "all those things that 
> populate the web" could be read as being in conflict with the examples that 
> follow, and also the text later in section 3.3.  My concern hinges around 
> the word "populate", which can be taken to suggest something that is 
> actually part of the web, to the exclusion of things that are described by the 
> web.
> 
> My suggestion:  "... all those things that are represented on the Web: ..."

I think that the definition of "resource" in this
document is still under consideration. I will note your proposal.

> Section 3.1, para 3:
> 
> typo in OWL10 reference?  (double [[ and ]])

Ok.

> ...
> 
> Section 3.1, para 5 (following Good practice note):
> 
> I felt that this paragraph (...ensuring sufficient difference...) would 
> also justify a "Good practice" point.

The first half of the sentence is close to the preceding good practice
comment. The latter half (make URIs that identify different resources
look pretty different) might be worth a good practice note, but might
be just part of more general good practice:

  * Some glyphs look like each other ("O" and "0") and may cause
    confusion for people reading those glyphs.

  * URIs that look similar in unpredictable ways but 
    identify different resources may be confusing. 

    But consider:
         http://www.example.com/a/b/c/d/1.html
         http://www.example.com/a/b/c/d/2.html

    That seems reasonable to me.

  * Short URIs are generally more convenient.

  * Other usability considerations?
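
As a rough illustration of the look-alike glyph point, here is a minimal
Python sketch. The CONFUSABLE table and both function names are
hypothetical, and the table is deliberately tiny; it only shows the idea
of comparing URIs after folding similar glyphs together.

```python
# Hypothetical, deliberately tiny table of look-alike glyphs.
# Each glyph is folded to one representative of its look-alike group.
CONFUSABLE = str.maketrans({"0": "O", "1": "l", "I": "l"})


def skeleton(uri):
    """Fold look-alike glyphs so visually similar URIs compare equal."""
    return uri.translate(CONFUSABLE)


def easily_confused(a, b):
    """True if two distinct URIs differ only in look-alike glyphs."""
    return a != b and skeleton(a) == skeleton(b)


if __name__ == "__main__":
    pairs = [
        ("http://www.example.com/O.html", "http://www.example.com/0.html"),
        ("http://www.example.com/a/b/c/d/1.html",
         "http://www.example.com/a/b/c/d/2.html"),
    ]
    for a, b in pairs:
        print(a, b, easily_confused(a, b))
```

Under this sketch the 1.html and 2.html pair is not flagged, which matches
the view that such URIs are reasonable.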

> Section 3.3, para 1:
> 
> I was rather uncomfortable with the introductory phrase "In some URI 
> schemes...", in that it sets an expectation that fragment identifiers have 
> a direct dependency on the URI scheme used.  Later text makes it clearer 
> that any such dependency is indirect.  (I think I can imagine some software 
> scenarios in which "mailto:nobody@example.org#abc" *does* have a practical 
> meaning, given that the resource identified as "mailto:nobody@example.org" 
> is a mailbox, the default "view" of which is to initiate composition of a 
> message to be sent to it.)
> 
> So I suggest:  "It may be meaningful for a URI to end with a fragment 
> identifier ...".

How about: "Some URIs end with a fragment identifier..."

> Also, in para 2, I suggest:
> 
> "Note that while this composition is syntactically fully general, it is 
> less likely to be useful with some URI schemes.  The URI 
> mailto:nobody@example.org#abc does not have a well-understood meaning in 
> practice."

Ok.

> 
> Section 3.3, para 1:
> 
> This paragraph seems to focus on the idea that a fragment is part of the 
> thing identified by the fragment-free URI.  Three paragraphs later, it is 
> clearer that being "part of" is not a requirement.  I am thinking about a 
> possible form of words that makes this clearer.  Saying "view" is better, 
> but it still seems to suggest a physical sameness of resource and 
> fragment.  "projection"?
> 
> E.g.
> "... to yield an identifier for a projection of (e.g. a part of, or a view 
> of) a resource."

What about saying something like:
 
 1) Syntactically, some URIs end with frag ids.
 2) Though syntactically general, for some URI schemes, not useful.
 3) Often, the frag id is used to identify part of, or a view of,
    a resource. However, there is no requirement that a URI with 
    a frag id identify a part of, or view of a "greater" resource.
 4) Format and interpretation of frag ids depends on representation 
    type.
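
Points 1 and 2 above can be sketched in Python. This is only an
illustration, assuming the standard library's urllib.parse.urldefrag; the
helper name split_fragment and the "#section2" fragment are made up for
the example. The split is purely syntactic and works for any scheme,
whether or not the fragment has a useful meaning there.

```python
from urllib.parse import urldefrag


def split_fragment(uri):
    """Split a URI reference into (URI without fragment, fragment id)."""
    base, fragment = urldefrag(uri)
    return base, fragment


if __name__ == "__main__":
    examples = [
        # "#section2" is an invented fragment, for illustration only.
        "http://www.example.com/a/b/c/d/1.html#section2",
        # Syntactically fine, but with no well-understood meaning.
        "mailto:nobody@example.org#abc",
        # No fragment at all: the fragment part comes back empty.
        "http://www.example.com/a/b/c/d/1.html",
    ]
    for uri in examples:
        base, fragment = split_fragment(uri)
        print(f"{uri}: base={base!r} fragment={fragment!r}")
```

Nothing in the split says how to interpret the fragment; per point 4, that
depends on the representation type.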

> Section 3.4, paras 1,2:
> 
> There seems to be a conflict between the first two paragraphs.  The first 
> seems to say that dereferencing is based on the URI ("beginning with ... 
> the scheme of the URI");  the second says that the dereference mechanism is 
> dependent on the context.  I'm not sure what the actual intent of this text 
> is meant to be, so I can't really offer an alternative.

Given an HTTP URI, one could choose to use GET or HEAD (for example)
depending on the application context. In either case, the HTTP spec
governs how this is done. But the URI alone does not determine whether
GET or HEAD is chosen. The context may be a simple choice of the
application, a suggestion from a markup language, etc.
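
A minimal Python sketch of that point, assuming the standard library's
urllib.request.Request; the build_request helper and its want_body flag
are hypothetical stand-ins for "application context". The same HTTP URI
yields either method, and nothing is sent over the network here.

```python
from urllib.request import Request


def build_request(uri, want_body):
    """Build (but do not send) an HTTP request for the given URI.

    want_body is a hypothetical stand-in for application context: the
    URI is the same either way, only the chosen method differs.
    """
    method = "GET" if want_body else "HEAD"
    return Request(uri, method=method)


if __name__ == "__main__":
    uri = "http://www.example.com/a/b/c/d/1.html"
    for want_body in (True, False):
        req = build_request(uri, want_body)
        print(req.get_method(), req.full_url)
```

In both cases the HTTP specification governs how the request is carried
out; the URI alone did not pick the method.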

> In thinking about this, it seems to me that there are some legitimate 
> scenarios that are not clarified by this; e.g.
> 
> 
> (1) using HTTP-over-IPv4 vs HTTP-over-IPv6.  The choice here is determined 
> by both the context (what IP versions the client supports), indirectly the 
> URI (what IP versions are supported by the origin server or available proxies) 
> and the URI (does the URI contain an IPv6 or IPv4 literal).
> 
> 
> (2) I have heard it suggested that a new hypertext access protocol might be 
> deployed, and used for dereferencing http: URIs.  This is implicit in the 
> argument that identifiers rooted in http: URI space are OK to use as more 
> general identifiers.  E.g. see [1].
> 
> [1] http://lists.w3.org/Archives/Public/www-tag/2002Feb/0118.html
> and replies:
>      http://lists.w3.org/Archives/Public/www-tag/2002Feb/0179.html
>      http://lists.w3.org/Archives/Public/www-tag/2002Feb/0180.html
>      http://lists.w3.org/Archives/Public/www-tag/2002Feb/0182.html
> 
> If and when a bigger, better hypertext protocol is deployed, and if it is 
> used to dereference http: URIs, to what extent is the context and to what 
> extent is the URI determining the dereferencing mechanism?
> 
> 
> (3) Within a given environment (e.g. cellphones?), some different protocol 
> can be used to dereference http: URIs (which may or may not eventually 
> invoke a "native" http: transaction).
> 
> 
> I suspect the best that can be said here is that it is a combination of the 
> URI and context of use that determines the dereferencing mechanism employed.

I was not thinking of these scenarios, (3) in particular. I would like
to learn more about the use of a protocol other than HTTP for
dereferencing an HTTP URI (and similarly for other URI schemes).

> ...
> 
> Section 3.4.1:
> 
> The relationship between this section and 3.4 is not entirely clear.  I am 
> guessing that retrieving a representation is presented as a particular kind 
> of dereferencing?

Yes. That characterization may not survive, but that's the intent here.

> ...
> 
> Section 3.4.1, Principle:
> 
> This principle seems to be so important, so fundamental, that it seems 
> all-at-sea buried this deep in a relatively arcane discussion.  I think it 
> belongs right up at the start of section 3, as part of the lead-in to 
> identification and resources.

I need to think about a good place to put this.

> 
> ...
> 
> Section 3.4.3:
> 
> I liked this.

Woohoo!

> 
> ...
> 
> Section 3.4.4:
> 
> I did wonder in passing if this section was about identification or 
> interaction.
> 
> ...
> 
> Section 3.4.4, para 7:
> 
> In:
> [[
> Persistence is always a matter of policy and commitment on the part of 
> authorities assigning URIs rather than a constraint imposed by 
> technological means.
> ]]
> 
> the reference to "assigning URIs" seemed open to misinterpretation (e.g. as 
> an ISP handing out blocks of URI space for its customers' use?).  Given the 
> terminology introduced earlier, maybe say:
> 
> [[
> Persistence is always a matter of policy and commitment on the part of 
> authorities servicing URIs rather than a constraint imposed by 
> technological means.
> ]]

Yes.

> ...
> 
> Section 4.3.3:

> The question noted here seems to be answered in section 3.3.
> 
> ...
> 
> Section 4.4, para 2:
> 
> Typo?  Missing comma (,) between "concepts" and "content"?
> 
> ...
> 
> Section 5.2, Good practice note:
> 
> Typo:  "uniform address spac"
> 
> ...
> 
> Section 8:
> 
> It's not clear to me what it means for a reference to be normative vs 
> non-normative in this document.

I agree that should be clarified. Here's one rough interpretation: When
this becomes a W3C Recommendation, it may serve as the basis of policy
within W3C that other specs must conform to the normative statements of
the Arch Document (e.g., all of those MUST statements that are relevant
to technical specifications). Statements that concern authors, or users,
or producers of URIs, or other parties may not be usefully normative.

 - Ian

> -- 
> Ian Jacobs (ij@w3.org)   http://www.w3.org/People/Jacobs
> Tel:                     +1 718 260-9447
Received on Wednesday, 23 April 2003 18:55:52 UTC