Re: XML Blueberry

From: John Cowan <jcowan@reutershealth.com>
Date: Fri, 22 Jun 2001 13:22:53 -0400
Message-ID: <3B337EED.7030305@reutershealth.com>
To: David Carlisle <davidc@nag.co.uk>
CC: "xml-dev@xml.org" <xml-dev@xml.org>, www-xml-blueberry-comments <www-xml-blueberry-comments@w3.org>

David Carlisle wrote:


> that's a rather thin argument for introducing a change that will likely
> make existing parsers all fail on a large proportion of new
> documents. (Because tools are likely to splash blueberry juice over new
> xml files, even if they don't really use the new features)


I, at least, would like to say that it's a Best Practice not to
generate a Blueberry mark unless you need it.  You can be sure
I will lobby to get such language into the eventual Blueberry rec.


> Personally I think it suspect that by far the most complicated
> production in the xml spec is the name constraints.


Well, it's lengthy; I don't know that it's complicated.
The exposition in XML 1.0 *is* unnecessarily complicated, and should
simply be replaced with name-start-character and name-character
productions.
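
As a rough sketch of what a two-production formulation amounts to,
a name check reduces to two range tables and one rule.  The ranges
below are an ASCII-only subset chosen for illustration, and the
function and table names are made up for this example; the real
tables would be much longer.

```python
# Illustrative, ASCII-only subset; NOT the full XML name tables.
# Table and function names are hypothetical.
NAME_START_RANGES = [
    (0x3A, 0x3A),  # ':'
    (0x41, 0x5A),  # 'A'..'Z'
    (0x5F, 0x5F),  # '_'
    (0x61, 0x7A),  # 'a'..'z'
]
# Characters allowed after the first position, in addition to the above.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),  # '-' and '.'
    (0x30, 0x39),  # '0'..'9'
]


def in_ranges(ch, ranges):
    """Return True if the code point of ch falls in any (lo, hi) range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_name_start_char(ch):
    return in_ranges(ch, NAME_START_RANGES)


def is_name_char(ch):
    # Every name-start character is also a name character.
    return is_name_start_char(ch) or in_ranges(ch, NAME_EXTRA_RANGES)


def is_name(s):
    """Name ::= name-start-character name-character*"""
    return (
        len(s) > 0
        and is_name_start_char(s[0])
        and all(is_name_char(c) for c in s[1:])
    )
```

Adding characters from a later Unicode version would then mean
extending the two tables, with no change to the rule itself.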

> Would so much be
> lost if (as for character data) all unicode code points above a certain 
> point were allowed, otherwise we'll have to go through this all again
> for 3.x and 4 and ..


Readability.  If people are allowed to use characters at U+30000 and up,
which will probably *never* be assigned for anything, then there is no
hope that such documents can be viewed with standard viewers or anything
of the sort.  Likewise, there will be no way to type such characters.

One idea I've had is to not allow archaic scripts into XML names, on the
grounds that there are no native Gothic speakers who need to tag their
text with Gothic tags.  (There are scholars who want to mark up Gothic
text with English or German or ... tags, but that's already covered.)
This would greatly reduce, though not to zero, the number of post-3.1
name characters to be introduced.
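
A minimal sketch of how such an exclusion might be applied when
drawing up the list of new name characters.  Only the Gothic block
(U+10330 to U+1034F) is listed, as the example used above; which
scripts would count as archaic is an open question, and the names
here are hypothetical.

```python
# Blocks treated as archaic for this sketch. Only Gothic is listed;
# a real list would have to be agreed on separately.
ARCHAIC_BLOCKS = {
    "Gothic": (0x10330, 0x1034F),
}


def is_archaic(ch):
    """Return True if ch lies in one of the excluded blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ARCHAIC_BLOCKS.values())


def filter_candidates(candidates):
    """Drop archaic-script characters from a list of candidate name characters."""
    return [ch for ch in candidates if not is_archaic(ch)]
```

The filter only affects what may appear in names; Gothic text in
character data is untouched.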

In particular, there would be only 20-odd new name characters in
Unicode 3.2, and those could very well be deferred.

-- 
There is / one art             || John Cowan <jcowan@reutershealth.com>
no more / no less              || http://www.reutershealth.com
to do / all things             || http://www.ccil.org/~cowan
with art- / lessness           \\ -- Piet Hein
Received on Friday, 22 June 2001 13:22:57 GMT
