Comments on XML 1.0 5th edition

I know this is rather late in the day, but I haven't been following XML
specifications much recently.  I would like to draw your attention to a
couple of points about http://www.w3.org/TR/2008/PER-xml-20080205/ that you
might not have considered:

First, it still includes the definition:

> [Definition: A *Name* is a token beginning with a letter or one of a few
> punctuation characters, and continuing with letters, digits, hyphens,
> underscores, colons, or full stops, together known as name characters.]


which is clearly not appropriate given the new definition of name.

Second, and much more importantly, XML Namespaces 1.0 defines
NCNameStartChar in terms of the XML 1.0 Letter production, which is still
defined in the 5th edition as it was in the 4th Edition. This implies that,
upon publication of XML 1.0 5th Edition, conformance to XML Namespaces 1.0
will require the first character of names to follow the 4th edition rules
and the following characters to follow the 5th edition rules!  Since most
specs and parsers these days require documents to conform to both XML 1.0
and XML Namespaces 1.0, the net result in practice of the 5th edition will
be that names in documents cannot take advantage of the 5th edition's
expanded character repertoire. (Of course, using XML Namespaces 1.1 is not
an option, because that references XML 1.1.)
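
To make the mismatch concrete, here is a rough sketch in Python of the
hybrid rule as I read it. The NameStartChar and NameChar ranges are the ones
in the 5th edition; the function names are my own, and is_letter_4e is only
a partial stand-in for the 4th edition Letter production (it accepts ASCII
letters, rejects anything above #xD7A3, which is the highest code point
Letter allows, and refuses to answer for anything else rather than guess).
It is an illustration under those assumptions, not a conformance checker.

```python
# 5th edition NameStartChar ranges, as (low, high) code points.
NAME_START_5E = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]
# What NameChar adds on top of NameStartChar: "-", ".", digits, etc.
NAME_CHAR_EXTRA_5E = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7),
    (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(cp, ranges):
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name_start_5e(ch):
    return _in_ranges(ord(ch), NAME_START_5E)


def is_name_char_5e(ch):
    cp = ord(ch)
    return _in_ranges(cp, NAME_START_5E) or _in_ranges(cp, NAME_CHAR_EXTRA_5E)


def is_letter_4e(ch):
    """Partial stand-in for the 4th edition Letter production (assumption)."""
    if ch.isascii() and ch.isalpha():
        return True
    if ord(ch) > 0xD7A3:  # nothing above this code point is a Letter
        return False
    raise NotImplementedError("full Letter table not reproduced in this sketch")


def is_name_5e(s):
    """Name as defined by XML 1.0 5th edition alone."""
    return (bool(s) and is_name_start_5e(s[0])
            and all(is_name_char_5e(c) for c in s[1:]))


def is_ncname_hybrid(s):
    """NCName per Namespaces 1.0 read against the 5th edition:
    first character is Letter | '_', the rest are NameChar minus ':'."""
    if not s:
        return False
    first_ok = s[0] == "_" or is_letter_4e(s[0])
    rest_ok = all(c != ":" and is_name_char_5e(c) for c in s[1:])
    return first_ok and rest_ok


deseret = "\U00010400"  # a supplementary-plane letter, new in the 5th edition
print(is_name_5e(deseret + "a"))        # 5th edition alone
print(is_ncname_hybrid("a" + deseret))  # new character in second position
print(is_ncname_hybrid(deseret + "a"))  # same character in first position
```

The same character is acceptable in the second position of a name but not in
the first, which is exactly the inconsistency I am complaining about.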

This second point seems to me to be illustrative of a more fundamental
problem with the 5th edition.  Whilst in theory people writing specs that
reference XML 1.0 should have given careful consideration to whether to use
a dated or a non-dated reference, and should have consistently used one or
the other with a full appreciation of the potential consequences of this, in
practice I do not believe this has happened. Before you guys dreamed up the
5th edition, I don't think anybody would have anticipated the
possibility of a change to the fundamental philosophy behind the selection
of allowed name characters in XML without changing the version number.  The
result is that many specs that reference XML 1.0 aren't prepared for such a
change. When you look at XML 1.0 by itself, I think there's a good case that
the benefits of the 5th edition are greater than its costs, but when you
consider the impact on XML 1.0 together with the whole universe of specs
that are built on top of XML 1.0, I think the scales clearly tip the other
way.

My suggestion would be to do an XML 1.2 that changes XML 1.0 only by making
the proposed 5th edition change to names, do a Namespaces 1.2 that
references XML 1.2, and then deprecate XML 1.1 and XML Namespaces 1.1. I
know that XML 1.1 didn't get much uptake, but I think that is partly because
it also included many other changes, whose usefulness was not nearly as
clear.

James

Received on Friday, 17 October 2008 03:05:34 UTC