Re: Input on Fielding's -02 draft from Roy T. Fielding on 2003-06-04 (uri@w3.org from June 2003)

From: Roy T. Fielding <fielding@apache.org>
Date: Wed, 4 Jun 2003 16:49:51 -0700
To: Tim Bray <tbray@textuality.com>
Cc: uri@w3.org
Message-Id: <3BFCCBF5-96E7-11D7-84EE-000393753936@apache.org>

Hi Tim,

Thanks once again for your careful reading of the spec.

On Sunday, June 1, 2003, at 05:33  PM, Tim Bray wrote:
> 1.2.2, 2nd para, last sentence.  There is a grammatical awkwardness 
> Using ...  is termed a "dereference" of...
> The gerund cries out for another; I suggest something like "To use .. 
> is to "dereference"  " or "Using ... is termed "dereferencing" or "the 
> term 'dereference' is used to describe the action using..."

I decided to go with "Use of ..."

> 2.1 last sentence
>
> Recommend inserting a phrase as follows: "...escaping any octets, and 
> only those octets, that are not in the unreserved..."

now "escaping only those octets ..."
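
For illustration only, a minimal sketch of "escaping only those octets"
that fall outside the unreserved set.  It assumes the unreserved set as
later published in RFC 3986 (letters, digits, "-", ".", "_", "~"); the
set in the -02 draft may differ, and the function name escape_octets is
hypothetical.

```python
import string

# Assumption: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
# (the RFC 3986 set; the -02 draft may define it differently).
UNRESERVED = (string.ascii_letters + string.digits + "-._~").encode("ascii")


def escape_octets(data: bytes) -> str:
    """Percent-escape every octet that is not unreserved, and no others."""
    return "".join(
        chr(octet) if octet in UNRESERVED else f"%{octet:02X}"
        for octet in data
    )


print(escape_octets("é/a~".encode("utf-8")))
```

Unreserved octets pass through untouched; everything else, including
"/" in this sketch, is escaped.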

> 2.2 2nd para, "URI" is used in the plural

I think somebody else already caught that one.

> 2.3, first word.  Why do we need to talk about "Data" characters... 
> how are they distinguished.  Just begin "Characters that are 
> allowed..."

Okay (they are characters not being used as delimiters).

> 2.3, 1st para after BNF block.  "Unreserved characters can be escaped 
> without changing the semantics of a URI".  This is at best highly 
> misleading in the case of URIs used as XML namespace names, whose only 
> semantic is identification and comparison, and where comparison is 
> typically done using strcmp(), and thus escaping an 'a' character will 
> indeed change its semantics.

That isn't a semantic.  If XML namespaces treats them as strings, then
the attributes are strings and not URI.  I cannot stop such specifications
from assuming false semantics for the sake of parsing efficiency, but
they cannot change the meaning of two equivalent URI.  It is not in their
power to do so.  That was the feedback I gave on the TAG list.
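
To make the distinction concrete, here is a rough sketch contrasting a
strcmp()-style comparison with a comparison made after unescaping
unreserved characters.  The unreserved set is assumed to be the one in
RFC 3986, and the helper name unescape_unreserved and the example.com
URIs are hypothetical; this is one possible normalization step, not a
complete equivalence algorithm.

```python
import re
import string

# Assumption: the RFC 3986 unreserved set; the -02 draft may differ.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def unescape_unreserved(uri: str) -> str:
    """Decode escapes that stand for unreserved characters; keep the rest."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0)
    return _ESCAPE.sub(replace, uri)


a = "http://example.com/%61bc"
b = "http://example.com/abc"

print(a == b)                                            # string comparison
print(unescape_unreserved(a) == unescape_unreserved(b))  # after unescaping
```

The two strings differ octet for octet, yet compare equal once the
escaped 'a' is decoded.  An escaped reserved character such as %2F is
left alone by this sketch.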

> 3. 2nd-last para, last sentence.  Also, see section 3.3.  Here it says 
> "non-hierarchical path will be treated as opaque data."  What this and 
> section 3.3 are really saying is that the 'segment' nonterminal is 
> opaque to generic URI parsers.  Given that, the result that a path 
> containing only one segment is also opaque falls out.  The draft never 
> really comes out and says this about the 'segment'; when I read this 
> in 3., I thought "well of course, why are they saying that", and I 
> think that if you did make this clear about segments this sentence 
> might become superfluous.

Historical justification: removal of the opaque-part ABNF terminal.
It is superfluous from a technical sense, but there are existing specs
dependent on this one that will be updated, and that text is intended
to ease that process.  I don't think that simply saying segment is
opaque would accomplish that purpose.

> Section 3.1 and elsewhere.  Draft refers to '#' as crosshatch and '@' 
> as 'commercial at sign'.  I'm an old text geek and I don't really know 
> these names.  Might it be a good idea, for maximum reach, to adopt (or 
> at least mention) the Unicode names?

Changed to "number-sign".  It's better than octothorpe.

> 3.3, 2nd last para.  Why do we need this text about observed usages of 
> ';' and ',' and so on in segments?  I think some more clarity about 
> the principle of segment opacity is in order.  But I question the 
> utility of this whole paragraph.

Because people keep asking me for examples where reserved characters
may have a specific meaning within a component, but one that is not
defined by the generic syntax, yet is nevertheless enabled by it.
I'll move it down one para to the end of the section and add something
to say it is an example.
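
A small sketch of that kind of usage: a generic parser splits the path
on "/" and treats each segment as opaque, while an application may
choose to give ";" a meaning inside a segment.  The path and the
"v=1.1" parameter are made up for illustration, and split_segments and
split_params are hypothetical helpers.

```python
def split_segments(path):
    """Generic view: segments are delimited by "/" and otherwise opaque."""
    return path.split("/")


def split_params(segment):
    """Application view (an assumption, not generic syntax):
    treat ";" as separating a name from its parameters."""
    name, *params = segment.split(";")
    return name, params


segments = split_segments("/docs/name;v=1.1/index")
print(segments)                   # the ";" stays inside its segment
print(split_params(segments[2]))  # only the application looks inside
```

The generic split never inspects ";"; the second helper is purely a
convention of whoever minted the URI.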

> 3.5 Here we have the notion of a "secondary resource" introduced.  I 
> think this is as good as any other way to talk about fragments, but 
> you're introducing new art, so why not define the term right here 
> officially and try to introduce it into the vocabulary, might help 
> make the lives of other specwriters easier.

I thought that's what I did.  How would you word such a definition?

> 3.5, 2nd para.  I find the first sentence really hard to understand, 
> and I'm not sure I believe what I think it's trying to say.  I suspect 
> the reason it exists is to set up for the 'Therefore, ' at the start 
> of the 2nd sentence.  Why bother, lose that and the Therefore and just 
> start out by saying that the interpretation of a fragment is 
> dependent, etc... that's just the way it is, whether we like it or not 
> we're stuck with it, so why just not say it?

I'll work on a better wording.

> 3.5, 3rd para, last sentence.  The word "defines" is wrong.  I suppose 
> if I say <foo id="x23"> I've kind of defined what #x23 means.  But if 
> someone uses an arbitrary XPath or byte-offset kind of fragment 
> identifier, the representation cannot usefully be said to have 
> "defined" its meaning.  Also, this is in direct conflict with the 
> later remarks that even though not transmitting fragment-IDs is 
> information-lossy, it preserves the right of users to point at 
> whatever they want to point at; which cannot be consistent with the 
> notion that the representation defines the meaning of the fragment.   
> I would say instead "... in a representation in whose context the 
> fragment identifier is meaningful".

okay
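
As a rough illustration of the point about fragments not being
transmitted, the sketch below separates the fragment from the rest of
the reference with Python's urllib.parse.urldefrag.  The example.com
URI is hypothetical; "x23" is the identifier from the comment above.

```python
from urllib.parse import urldefrag

reference = "http://example.com/doc#x23"

# Only the part before "#" would be used for retrieval; the fragment
# stays with the client and is interpreted against whatever
# representation comes back.
uri, fragment = urldefrag(reference)

print(uri)
print(fragment)
```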

> 5.1.1, 2nd para; might it be appropriate to mention xml:base here? 
> After all HTML's base element gets its own Appendix C.  In which 
> context I should point out that in XHTML it's "base" not BASE.  
> Actually, it might be better just to lose appendix C.

Yes, we'll probably lose it.

....Roy
Received on Wednesday, 4 June 2003 19:57:44 UTC