- From: Eric van der Vlist <vdv@dyomedea.com>
- Date: Sat, 15 Dec 2001 18:03:21 +0100
- To: www-xml-blueberry-comments@w3.org
(copy of a post on xml-dev)

I don't feel like entering the arena to discuss the need for the modification proposed by the first XML 1.1 WG, since I don't feel qualified to speak about a problem which I have never personally felt.

I would rather note that it can be an opportunity to test the versioning of XML on a limited change, and that there are probably lots of things to learn from this first version change.

Let's first list all the impacts on applications using XML:

a) Some documents which are well formed per XML 1.0 may not be well formed per XML 1.1 (as far as I can tell). Per http://www.w3.org/TR/2001/WD-xml11-20011213/#sec2.13 :

"2.13 W3C Normalization Checking [NEW]
XML processors must/should/may check whether their input documents are in W3C normalized form, as defined by [Charmod]. XML processors must not transform the input to be in normalized form. It is a fatal error/error/not an error for the document not to be in normalized form."

and http://www.w3.org/TR/charmod/#sec-TextNormalization gives an example of a non-normalized yet XML 1.0 valid snippet ()

b) Some documents which are well formed per XML 1.1 may not be well formed per XML 1.0 (this is the already well discussed consequence of allowing more characters in names).

c) The "same" text within an element of an XML file may be different depending on whether the file is an XML 1.0 or an XML 1.1 document (since the EOL handling has been changed).

d) The "same" attribute value in an XML file may be different depending on whether the file is an XML 1.0 or an XML 1.1 document (since the attribute value normalization has been changed).

Note: the WG also says that "each entity, including the document entity, can be separately declared as XML 1.0 or XML 1.1." which *seems* to allow mixing, in the same document, elements and attributes with both the new and the old EOL and attribute value handling, and this seems like a weird thing to do.
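To make point (c) concrete, here is a minimal sketch of how the "same" bytes can yield different text under the two versions. It is only an illustration: the 1.1 rule set used below is the one from the XML 1.1 Recommendation as later published (which treats NEL, #x85, and #x2028 as line ends), and the 13 December 2001 draft may differ in detail. The function name normalize_eol is hypothetical and is not part of any parser API.

```python
import re

# XML 1.0 end-of-line handling: CR LF and lone CR become LF.
_EOL_10 = re.compile("\r\n|\r")

# Assumed XML 1.1 handling (per the later Recommendation, not necessarily
# this draft): CR LF, CR NEL, NEL (U+0085), LS (U+2028) and lone CR become LF.
_EOL_11 = re.compile("\r\n|\r\x85|\x85|\u2028|\r")


def normalize_eol(text, version="1.0"):
    """Hypothetical helper: apply the line-end rules of the given XML version."""
    pattern = _EOL_11 if version == "1.1" else _EOL_10
    return pattern.sub("\n", text)


sample = "line one\x85line two\r\nline three"
print(repr(normalize_eol(sample, "1.0")))  # NEL survives as a character
print(repr(normalize_eol(sample, "1.1")))  # NEL is reported as a line feed
```

An application comparing the two results would see a different string (and a different string length) for the same element content.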
Having listed these 4 differences, I'd like to assert that most XML 1.0 well formed documents are XML 1.1 well formed, and I would expect that for a while most XML 1.1 well formed documents will also be XML 1.0 well formed (people will probably use the new version number for their new documents even if they don't use extended names).

Let's now have a look at what the WD says about the versioning:

http://www.w3.org/TR/2001/WD-xml11-20011213/#sec2.8

"2.8 Prolog and Document Type Declaration
Change "1.0" everywhere to "1.1"
Add the following paragraph: XML 1.1 processors should accept XML 1.0 documents as well. If a document is well-formed or valid XML 1.0, it may be made well-formed or valid XML 1.1 respectively simply by changing the version number."

Per (a), I *think* that the above statement is not true.

I also think that it's far from being sufficient, and that using a version 1.1 serves two different purposes:

e) declare that the names may contain a bunch of new characters (note that it's a "may", not a "must").

f) specify that the parser must use the new EOL and attribute value handling methods.

For these two purposes, it seems to me that it would be very useful to let applications override the version definition found in an instance document (exactly like it is useful to be able to override a schema location) and let them say: process this 1.1 document as 1.0 (report errors if it's not 1.0 well formed and use the "old" handling), or: process this 1.0 document as 1.1 (report errors if it's not 1.1 well formed and use the "new" handling).

I have then looked at a random set of specifications which could be affected by the change.

John being an editor of both XML 1.1 and the XML infoset, it is no surprise that the infoset should not be impacted.

C14N contains a list of whitespace characters ("whitespace characters #x9, #xA, and #xD") which would need to be updated.
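Coming back to the override idea proposed above (process this 1.1 document as 1.0, or the reverse), the policy could be sketched roughly as follows. This is not an existing parser feature: declared_version and effective_version are hypothetical names, and treating a document without an XML declaration as 1.0 is an assumption of this sketch.

```python
import re

# Matches the version pseudo-attribute at the very start of a document,
# e.g. <?xml version="1.1"?> or <?xml version='1.0' encoding="UTF-8"?>
_DECL = re.compile(r'^<\?xml\s+version\s*=\s*(["\'])(1\.[0-9]+)\1')


def declared_version(document):
    """Return the version found in the XML declaration.

    Assumption: a document with no XML declaration is treated as "1.0".
    """
    match = _DECL.match(document)
    return match.group(2) if match else "1.0"


def effective_version(document, override=None):
    """Hypothetical policy hook: an application-supplied override wins
    over whatever the instance document declares."""
    return override if override is not None else declared_version(document)


doc = '<?xml version="1.1"?><root/>'
print(effective_version(doc))                  # follows the declaration
print(effective_version(doc, override="1.0"))  # application forces 1.0 rules
```

The version returned here would then select both the well-formedness rules to check and the EOL and attribute value handling to apply.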
XSLT also contains a list of whitespace characters, but XSLT would be more affected than that: the version can be specified in its XML output method, and the transformations to apply when the version of the XSLT stylesheet and/or of a source document differs from the version of the output document are not defined. What should an XSLT processor do when an element or attribute name which is XML 1.1 well formed but not XML 1.0 well formed is inserted in an output tree serialized by an XML 1.0 output method? Or when a text which is XML 1.0 well formed but not XML 1.1 well formed is inserted in an output tree processed by an XML 1.1 output method?

The XPath specification has been wise enough to reference the whitespace definition of XML rather than redefining it. However, if an XPath processor wanted to give a different result for the normalize-space() function depending on the version of XML which is used, it would surely be a problem, since reporting the XML version to applications is a feature which is being introduced in DOM Level 3 and is missing from both SAX 2.0 and DOM Level 2 (how can the XPath processor guess the version of the document, then?).

Finally, W3C XML Schema also defines the list of whitespace characters. Beyond the editorial change, changing the list would have strange effects similar to those mentioned for XSLT. The effect of a number of facets would be modified (facets working on XML 1.0 documents may not work on XML 1.1 documents and vice versa, enumerations could be affected, the length of strings would give different results, ...). The effect of derivations by list would also be affected (a list element in an XML 1.0 document might for instance be considered as two different list elements in XML 1.1). Even the content models would be affected (content models considered as complex by XML 1.1 would be considered as mixed by XML 1.0).
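The effect on derivations by list can be illustrated with a small sketch. It assumes, purely for illustration, that the 1.1 whitespace set gains NEL (#x85) on top of the four XML 1.0 whitespace characters; whether and how the final specification does this is not something this sketch claims. The names WS_10, WS_11_ASSUMED and list_items are hypothetical.

```python
# XML 1.0 whitespace: space, tab, line feed, carriage return.
WS_10 = "\x20\x09\x0a\x0d"

# Assumption for illustration only: XML 1.1 also treats NEL as whitespace.
WS_11_ASSUMED = WS_10 + "\x85"


def list_items(value, whitespace):
    """Split a list-typed value into items on the given whitespace set.

    A manual loop is used on purpose: Python's str.split() with no
    argument already treats U+0085 as whitespace and would hide the
    difference this sketch is meant to show.
    """
    items, current = [], []
    for ch in value:
        if ch in whitespace:
            if current:
                items.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        items.append("".join(current))
    return items


value = "red\x85green blue"
print(len(list_items(value, WS_10)))          # item count under 1.0 rules
print(len(list_items(value, WS_11_ASSUMED)))  # item count under assumed 1.1 rules
```

Under these assumptions the same value holds a different number of items depending on the version, so length and enumeration facets on the list would not agree either.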
I am sure that the lists (of affected specifications and of effects on the ones I have mentioned) are much longer, but I thought that it might be useful to give this partial list to illustrate what I meant!

Hope this helps.

Eric
--
See you in Paris for the Electronic Business Days 2002.
http://www.edifrance.org/ebd/index.htm
------------------------------------------------------------------------
Eric van der Vlist http://xmlfr.org http://dyomedea.com
http://xsltunit.org http://4xt.org http://examplotron.org
------------------------------------------------------------------------
Received on Saturday, 15 December 2001 12:03:27 UTC