- From: Roy T. Fielding <fielding@apache.org>
- Date: Tue, 29 Apr 2003 17:27:16 -0700
- To: noah_mendelsohn@us.ibm.com
- Cc: Dan Connolly <connolly@w3.org>, WWW-Tag <www-tag@w3.org>
> I wonder whether we need to distinguish the license we might give an
> application to do normalization, vs. any latitude in the mechanisms of
> XML and Namespaces themselves.  Consider this example:
>
>    <e p:a="1" q:a="2" xmlns:p="http://example.org/x"
>       xmlns:q="http://EXAMPLE.ORG/x" />
>
> Does this or does it not violate the Uniqueness of Attribute
> constraint of Namespaces 1.1 [1]?  I hope we have an unambiguous
> answer to that question.  Roy, are you implying that there should be
> latitude for some processors to accept the document and others not?
> I suggest that for Uniqueness of Attributes and similar purposes we
> need a single, interoperable answer.  The document is either OK or
> it's not.  My preferred answer would be "strcmp applies, the above
> document is OK".  In that sense, the namespaces 1.1 CR is OK as it
> stands, I think.

Well, that's several questions. The definition in the spec says that
comparison is done as strings and they are identical if the strings
are identical. As such, the nature of what is in those strings simply
does not matter and need not be specified at all.

However, I find the whole concept to be unappealing to say the least.
What, may I ask, is the purpose of the Uniqueness of Attribute
constraint? Is it to prevent

   a) syntactic collisions between attributes; or
   b) semantic collisions between attributes?

I would claim it exists to prevent BOTH types of collisions. Therefore,
the specification is doing the protocol a disservice by not requiring
that the identifiers be different (rather than simply requiring that
they be different strings). But that is a much longer discussion, and
one I am happy to stay out of.

My objection to that section is the statement that the identifiers
given are "different for the purposes of ...", which is simply false
because they are identifiers and not mere strings. If it said that the
following strings are different, then at least it wouldn't be abusing
the semantics of URIs, even though it would still be failing to ensure
that the attributes are actually from different namespaces.

Note that it isn't necessary for applications to enforce every
requirement in the specification. It is quite reasonable for XML to say
the attributes must be distinct identifiers and yet only require
processors to ensure that they are distinct strings. The first is a
requirement on generators and the second a requirement for
implementations.

Note also that the following is a normalizer that I have actually used
in practice:

   perl -pi -e 's/Apache\.Org/apache.org/g;' *.xml

and I don't care whether or not there is some theoretical screw case in
which some author used differences in case to trick XML into accepting
a document that should have been invalid in the first place. I don't
want the specification to tell me that using two equivalent URIs as
xmlns attributes in order to force the parser to accept an ambiguous
use of attributes (for what purpose I can't imagine) is more important
than my right to normalize all equivalent references to URIs regardless
of where they are used.

....Roy
Received on Tuesday, 29 April 2003 20:31:15 UTC
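
The distinction drawn in the message above, between comparing namespace names as strings and comparing them as identifiers, can be sketched in a few lines of Python. This is an illustrative sketch only and not anything taken from the thread: the function names `normalize_uri` and `attributes_unique` are hypothetical, and the normalization lowercases only the scheme and host (assuming the URI has no userinfo component), which is much less than a full URI equivalence check.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_uri(uri: str) -> str:
    """Lowercase the scheme and host only (assumption: no userinfo part).

    Path, query, and fragment are left untouched because they may be
    case-sensitive.
    """
    parts = urlsplit(uri)
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(),
         parts.path, parts.query, parts.fragment)
    )


def attributes_unique(attrs, bindings, key=lambda uri: uri) -> bool:
    """Check that no two prefixed attributes share (namespace name, local name).

    `key` maps each namespace name before comparison: the identity function
    gives plain string comparison, `normalize_uri` gives a looser
    identifier-style comparison.
    """
    expanded = []
    for qname in attrs:
        prefix, local = qname.split(":", 1)
        expanded.append((key(bindings[prefix]), local))
    return len(set(expanded)) == len(expanded)


# The element from the quoted example:
#   <e p:a="1" q:a="2" xmlns:p="http://example.org/x"
#      xmlns:q="http://EXAMPLE.ORG/x" />
bindings = {"p": "http://example.org/x", "q": "http://EXAMPLE.ORG/x"}
attrs = ["p:a", "q:a"]

string_result = attributes_unique(attrs, bindings)
normalized_result = attributes_unique(attrs, bindings, key=normalize_uri)
```

Under string comparison the two namespace names differ, so the example element passes the uniqueness check. Once the names are normalized, both attributes expand to the same name and the check fails, which is the ambiguity the message is concerned with.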