Re: Grinding to a halt on Issue 27. from Roy T. Fielding on 2003-04-30 (www-tag@w3.org from April 2003)

From: Roy T. Fielding <fielding@apache.org>
Date: Tue, 29 Apr 2003 17:27:16 -0700
To: noah_mendelsohn@us.ibm.com
Cc: Dan Connolly <connolly@w3.org>, WWW-Tag <www-tag@w3.org>
Message-Id: <7F41C6DF-7AA2-11D7-982A-000393753936@apache.org>

> I wonder whether we need to distinguish the license we might give an
> application to do normalization, vs. any latitude in the mechanisms of
> XML and Namespaces themselves.  Consider this example:
>
>         <e p:a="1" q:a="2" xmlns:p="http://example.org/x"
>            xmlns:q="http://EXAMPLE.ORG/x" />
>
> Does this or does it not violate the Uniqueness of Attribute
> constraint of Namespaces 1.1 [1]?  I hope we have an unambiguous
> answer to that question.  Roy, are you implying that there should be
> latitude for some processors to accept the document and others not?
> I suggest that for Uniqueness of Attributes and similar purposes we
> need a single, interoperable answer.  The document is either OK or
> it's not.  My preferred answer would be "strcmp applies, the above
> document is OK".  In that sense, the namespaces 1.1 CR is OK as it
> stands, I think.

Well, that's several questions.  The definition in the spec says that
comparison is done as strings and they are identical if the strings are
identical.  As such, the nature of what is in those strings simply does
not matter and need not be specified at all.  However, I find the whole
concept to be unappealing to say the least.  What, may I ask, is the
purpose of the Uniqueness of Attribute constraint?  Is it to prevent

   a) syntactic collisions between attributes; or
   b) semantic collisions between attributes?

I would claim it exists to prevent BOTH types of collisions.  Therefore,
the specification is doing the protocol a disservice by not requiring
that the identifiers be different (rather than simply requiring that
they be different strings).  But that is a much longer discussion, one
that I am happy to stay out of.
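
To make the string-versus-identifier distinction concrete, here is a
minimal sketch in Python.  The helper names (same_string,
same_identifier) are made up for illustration, and the normalization is
an assumption: it lowercases only the scheme and authority, which is
enough for the quoted example but is not a complete URI equivalence
test (it ignores percent-encoding, default ports, and userinfo, for
instance).

```python
from urllib.parse import urlsplit, urlunsplit


def same_string(a, b):
    # What the Namespaces comparison rule asks for: character-for-character.
    return a == b


def same_identifier(a, b):
    # Hypothetical, deliberately partial equivalence: scheme and authority
    # are compared case-insensitively, everything else as-is.
    def norm(uri):
        p = urlsplit(uri)
        return urlunsplit(
            (p.scheme.lower(), p.netloc.lower(), p.path, p.query, p.fragment)
        )

    return norm(a) == norm(b)


P = "http://example.org/x"
Q = "http://EXAMPLE.ORG/x"

print(same_string(P, Q))      # False: different strings
print(same_identifier(P, Q))  # True: same identifier under this normalization
```

Under the string rule the two attributes in the example are distinct;
under even this modest notion of identifier equivalence they collide.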

My objection to that section is the statement that the identifiers
given are "different for the purposes of ...", which is simply false
because they are identifiers and not mere strings.  If it said that
the following strings are different, then at least it wouldn't be
abusing the semantics of URIs, even though it would still be failing
to ensure that the attributes are actually from different namespaces.

Note that it isn't necessary for applications to enforce every
requirement in the specification.  It is quite reasonable for
XML to say the attributes must be distinct identifiers and yet only
require processors to ensure that they are distinct strings.
The first is a requirement on generators and the second a
requirement for implementations.
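
One way to picture that split, as a rough sketch only: the function
names below (processor_accepts, generator_may_emit) and the
representation of attributes as (namespace name, local name) pairs are
assumptions for illustration, not anything defined by the Namespaces
spec.  The normalization is again the partial one that lowercases the
scheme and authority.

```python
from urllib.parse import urlsplit, urlunsplit


def _normalize(uri):
    # Illustrative only: case-fold scheme and authority, leave the rest.
    p = urlsplit(uri)
    return urlunsplit(
        (p.scheme.lower(), p.netloc.lower(), p.path, p.query, p.fragment)
    )


def processor_accepts(attrs):
    # Requirement on implementations: expanded names are distinct strings.
    return len(set(attrs)) == len(attrs)


def generator_may_emit(attrs):
    # Requirement on generators: expanded names are distinct identifiers.
    normalized = [(_normalize(ns), local) for ns, local in attrs]
    return len(set(normalized)) == len(normalized)


attrs = [("http://example.org/x", "a"), ("http://EXAMPLE.ORG/x", "a")]

print(processor_accepts(attrs))   # True: a processor need not reject it
print(generator_may_emit(attrs))  # False: a generator should not produce it
```

The processor check stays cheap and interoperable, while the stricter
obligation falls on whoever writes the document.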

Note also that the following is a normalizer that I have actually
used in practice:

    perl -pi -e 's/Apache\.Org/apache.org/g;' *.xml

and I don't care whether or not there is some theoretical screw case
in which some author used differences in case to trick XML into
accepting a document that should have been invalid in the first place.
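
For what it is worth, the effect of such a normalizer on the quoted
example can be sketched as follows.  This is an illustration under
assumptions: it swaps in EXAMPLE.ORG for the hostname so that it applies
to the example above, and it relies on the namespace-aware parser behind
Python's xml.etree.ElementTree rejecting attributes whose expanded names
are identical strings.

```python
import re
import xml.etree.ElementTree as ET

DOC = (
    '<e p:a="1" q:a="2" xmlns:p="http://example.org/x" '
    'xmlns:q="http://EXAMPLE.ORG/x" />'
)


def parses(text):
    # True if the namespace-aware parser accepts the document.
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Same idea as the perl one-liner, applied to the example's hostname.
normalized = re.sub(r"EXAMPLE\.ORG", "example.org", DOC)

print(parses(DOC))         # accepted: the namespace names differ as strings
print(parses(normalized))  # rejected: both attributes now expand to the same name
```

That is the screw case in miniature: the document only ever parsed
because of the difference in case.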

I don't want the specification to tell me that using two equivalent
URIs as xmlns attributes in order to force the parser to accept an
ambiguous use of attributes (for what purpose I can't imagine) is
more important than my right to normalize all equivalent references
to URIs regardless of where they are used.

....Roy

Received on Tuesday, 29 April 2003 20:31:15 UTC