Re: 1343 messages later from Al Gilman on 2000-06-15 (xml-uri@w3.org from June 2000)

From: Al Gilman <asgilman@iamdigex.net>
Date: Thu, 15 Jun 2000 11:21:28 -0500
To: James Clark <jjc@jclark.com>, xml-uri@w3.org
Message-Id: <200006151507.LAA1493462@smtp1.mail.iamworld.net>

At 04:41 PM 2000-06-15 +0700, James Clark wrote:
>David Carlisle wrote:
>> 
>> > Reading this again, I'm realizing the genius of Microsoft's proposal,
>> > which lets Tim and I both feel right,
>> 
>> Isn't that just because it is so vague it can mean anything to anyone,
>
>That's my feeling too at the moment.  As far as I can tell, all it does
>is point out that some cases are easy, but it doesn't help with the hard
>cases.  In comparing two namespace names, we can distinguish four cases:

I know a hard binary identity test would feel better.  I believe that the
notion that it is required is a misconception.  One operational definition
of an appropriate, managed level of indefinition goes something like the
following:

Lower layers, particularly if they want to proceed in ignorance of BASEs or
descriptive resources associated with markup vocabularies, should not
operate on the basis of a single, two-sided equality test but should deal
separately with two one-sided checks: distinguishable and identical.

The semantics of these names are:

  distinguishable: known to be different, conclusively unequal
  identical: known not to be different, conclusively equal

You can't declare 'identical' if either namespace name is relative.

You can declare 'distinguishable' by tail-comparison regardless of the
relative property of either namespace name compared.

You can pass the attribute non-collision test if the Qnames of the
attributes are distinguishable.

You can throw an error if an attribute collision is such that the Qnames
are identical.

You should throw a warning if neither is true.  
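
[Editorial sketch, not part of the original message.] The rules above can
be read as a three-valued comparison. The Python below is one possible
reading under stated assumptions: "absolute" is taken to mean "has a URI
scheme", "tail-comparison" is taken to mean "neither name is a suffix of
the other", and names containing "." or ".." path segments are
conservatively left undecided. The names Verdict, compare_ns and
check_attribute_pair are hypothetical.

```python
from enum import Enum
from urllib.parse import urlsplit


class Verdict(Enum):
    DISTINGUISHABLE = "distinguishable"  # conclusively unequal
    IDENTICAL = "identical"              # conclusively equal
    UNDECIDED = "undecided"              # neither check succeeded


def is_absolute(name: str) -> bool:
    # Assumption: a namespace name is absolute when it carries a scheme.
    return bool(urlsplit(name).scheme)


def has_dot_segments(name: str) -> bool:
    # "." and ".." segments change on resolution, so tails are unreliable.
    return any(seg in (".", "..") for seg in urlsplit(name).path.split("/"))


def compare_ns(a: str, b: str) -> Verdict:
    # 'identical' is never declared when a name is relative.
    if a == b and is_absolute(a):
        return Verdict.IDENTICAL
    # Assumed tail-comparison: if neither name is a suffix of the other,
    # no choice of base can make the resolved names equal.
    shorter, longer = sorted((a, b), key=len)
    if (not longer.endswith(shorter)
            and not has_dot_segments(a)
            and not has_dot_segments(b)):
        return Verdict.DISTINGUISHABLE
    return Verdict.UNDECIDED


def check_attribute_pair(ns1: str, local1: str, ns2: str, local2: str) -> str:
    # Attributes with different local parts cannot collide at all.
    if local1 != local2:
        return "ok"
    verdict = compare_ns(ns1, ns2)
    if verdict is Verdict.DISTINGUISHABLE:
        return "ok"
    if verdict is Verdict.IDENTICAL:
        return "error"
    return "warning"
```

Under these assumptions, compare_ns("foo", "bar") is DISTINGUISHABLE,
compare_ns("foo", "foo") is UNDECIDED because both names are relative, and
two attributes with the same local part whose namespace names are both
"http://example.org/ns" yield "error".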

Parsers may limit their processing to literal comparison and thereby throw
a few more warnings.

The InfoSet always saves all the information (BASE separate from ns-attr).

Binding to processing is done in the upper layers which can set their own
criteria (literal, absolute, or recover) for matching vocabularies to their
flavor of processing.  Note that the upper layers never, in this process,
override or invalidate a 'distinguishable' or 'identical' determined below.

Al

>
>1. both names are absolute; this case is easy since both the literal and
>absolutize approaches give the same answer.
>
>2. both names are relative and both names occur in the same entity and
>thus have the same base URI; in this case the literal and absolutize
>approaches differ only on how "." and ".." path segments are treated;
>also the literal approach in this case doesn't give rise to any cases
>where namespace names are namespace equal but refer to different
>resources; I think the Microsoft proposal is proposing the literal
>approach in this case (but I'm not sure)
>
>3. one name is relative and one is absolute; here the literal and
>absolute approaches give completely different answers; I don't know what
>the Microsoft proposal is proposing here; the literal approach is
>not too bad because again in this case it doesn't give rise to any
>cases where namespace names are namespace equal but refer to different
>resources.
>
>4. both names are relative but the namespace declarations have different
>base URIs; here the literal and absolute approaches give completely
>different answers; this is the really controversial case because the
>literal approach here can treat as equal two namespace names that refer
>to different resources (which is anathema to the absolutizers)
>
>A proposal that doesn't say clearly what happens in case 4 doesn't get
>us anywhere. It has to be answered by the namespaces Rec, because it can
>arise within a single XML document when there are external entities.
>
>Possible answers for case 4 include:
>
>A.  They are considered equal if they are character-for-character
>identical after absolutization (the absolute approach)
>
>B.  They are considered equal if the namespace names are
>character-for-character identical regardless of the base URI (the
>literal approach)
>
>C.  They are considered not to be equal in this case
>
>What is it?
>
>James
>
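
[Editorial sketch, not part of the original message.] The difference
between the literal and absolutize approaches in cases 2 and 4 of the
quoted message can be illustrated with Python's urllib.parse.urljoin. The
base URIs, the namespace name "ns" and the function names are made up for
illustration; nothing here is taken from the Microsoft proposal or the
namespaces Rec.

```python
from urllib.parse import urljoin


def literal_equal(name1: str, name2: str) -> bool:
    # Answer B: character-for-character comparison, base URI ignored.
    return name1 == name2


def absolutized_equal(name1: str, base1: str, name2: str, base2: str) -> bool:
    # Answer A: resolve each name against its own base, then compare.
    return urljoin(base1, name1) == urljoin(base2, name2)


# Hypothetical base URIs: a document and an external entity it includes.
DOC = "http://example.org/a/doc.xml"
ENT = "http://example.org/b/entity.xml"

# Case 4: same relative name, different bases.
case4_literal = literal_equal("ns", "ns")
case4_absolute = absolutized_equal("ns", DOC, "ns", ENT)

# Case 2: same base, names differ only by a "." path segment.
case2_literal = literal_equal("./ns", "ns")
case2_absolute = absolutized_equal("./ns", DOC, "ns", DOC)
```

With these inputs the literal approach calls the case 4 names equal while
the absolutize approach does not, and the two approaches disagree the
other way round in case 2. Answer C would simply report "not equal" for
case 4.
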
Received on Thursday, 15 June 2000 11:04:57 UTC