- From: David Carlisle <davidc@nag.co.uk>
- Date: Mon, 8 Dec 2003 10:25:50 GMT
- To: mrys@microsoft.com
- Cc: public-qt-comments@w3.org
> For the data model: the WG, otherwise the data model spec would be
> different.

Not necessarily; some things just slip through by accident. That's the point of a public review, isn't it?

Ideally XQuery would adopt some version of xsl:strip-space into its prologue, and then the XSLT and XQuery commands would be specified as passing a specified flag to the data model building, which would cause white space text nodes to be dropped. Note that this only needs to apply to building a data model instance by parsing an XML file (the point of the section commented on in this thread). If the data model instance is coming from some other source (e.g. straight from a database or whatever), then its white space behaviour is out of scope for this spec, and I have no objection to that.

The text clearly cannot stand as it is. It is defined in terms of "insignificant white space", but this term is not defined in any spec that I have looked at (DM, XML rec, infoset). Although the XML spec says

  On the other hand, "significant" white space that should be preserved in the delivered version is common, for example in poetry and source code.

this is just an aside, and not part of any definition that can be referenced.

It is not acceptable to leave the interpretation of this definition open to the implementor, especially as this thread has shown there are wide differences in interpretation. I, for example, believe that inter-word spaces in English language sentences are significant, but apparently Michael Rys does not.

If for some reason the working groups do want to define "insignificant white space" and allow implementations freedom to silently drop such spaces (sacrificing interoperability for some unspecified gain), then any definition will break the spirit of the XML recommendation, which clearly states:

  An XML processor must always pass all characters in a document that are not markup through to the application. A validating XML processor must also inform the application which of these characters constitute white space appearing in element content.

(Which I believe was chosen as the XML processing model to avoid the problems, shown up by many years of SGML experience, with parsers trying to decide automatically which spaces to drop.)

As Michael Rys indicated, you may claim that you are following the letter of the specification if you claim that the parser is preserving the spaces (but not showing them to anybody or anything) and that they are being dropped while building the data model instance. However, this is clearly just a legalistic fudge that does not help the end user, and any browsing of xsl-list will quickly show that failure to achieve interoperability in this area does seriously inconvenience the end user.

However, if you really want to define this term, I believe that the only workable definition would be the one alluded to in the quotation above from the XML rec: white space appearing in element content, i.e. white space nodes appearing in elements _declared_ (in a DTD or, now, a schema) to take element (not mixed) content. Allowing processors to silently drop such spaces would still harm interoperability, but at least it is unlikely to produce results that are simply wrong, such as losing inter-word spaces in English.

David
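To make the proposed definition concrete, here is a minimal sketch in Python (not part of any spec or of the original message). It assumes that the DTD or schema information has already been reduced to a set of element names declared to take element-only content; the set ELEMENT_ONLY, the function name, and the sample document are all hypothetical and chosen only for illustration. White space text nodes are dropped only inside those elements, so the inter-word space in the mixed-content <p> element survives.

```python
from xml.dom import minidom

# Hypothetical stand-in for DTD/schema information: the names of elements
# declared to take element (not mixed) content.
ELEMENT_ONLY = {"doc", "section"}

XML_WHITESPACE = " \t\r\n"  # the four white space characters of XML 1.0


def strip_element_content_whitespace(node, element_only=ELEMENT_ONLY):
    """Drop whitespace-only text nodes, but only under element-only parents."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if (node.nodeType == node.ELEMENT_NODE
                    and node.tagName in element_only
                    and child.data.strip(XML_WHITESPACE) == ""):
                node.removeChild(child)
        else:
            # Recurse into elements; childless nodes simply have no children.
            strip_element_content_whitespace(child, element_only)
    return node


# Illustrative input: <p> is mixed content, so its single space must survive.
SAMPLE = (
    "<doc>\n"
    "  <section>\n"
    "    <p><b>inter</b> <i>word</i></p>\n"
    "  </section>\n"
    "</doc>"
)


def demo():
    dom = minidom.parseString(SAMPLE)
    strip_element_content_whitespace(dom)
    return dom.documentElement.toxml()


if __name__ == "__main__":
    print(demo())
```

A processor that instead dropped every whitespace-only text node would turn "inter word" into "interword", which is the kind of simply wrong result described above.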
Received on Monday, 8 December 2003 05:27:44 UTC