- From: Steve Fogoros <sfogoros@hsc.unt.edu>
- Date: Thu, 17 Sep 2009 15:32:56 -0500
- To: <veillard@redhat.com>
- Cc: <public-xml-testsuite@w3.org>
- Message-Id: <4AB2565F.C2A1.0037.0@hsc.unt.edu>
I so much want to agree, and I wish the recommendation to be concise on this.

I'm reading XML 1.0, Fifth Edition, Section 2.4. Here is a cut/paste of the first paragraph:

    Text consists of intermingled character data and markup. [Definition: Markup takes the form of start-tags, end-tags, empty-element tags, entity references, character references, comments, CDATA section delimiters, document type declarations, processing instructions, XML declarations, text declarations, and any white space that is at the top level of the document entity (that is, outside the document element and not inside any other markup).]

It says that '... any white space that is at the top level of the document entity (that is, outside the document element and ...' is markup and is allowed.

Production [1] defines the document element as

    document ::= prolog element Misc*

I understand this to mean that 'any white space' outside the document element includes any white space before the prolog. How could this be interpreted any other way?

Steve Fogoros

>>> Daniel Veillard <veillard@redhat.com> 9/17/2009 2:43 PM >>>
On Thu, Sep 17, 2009 at 01:42:06PM -0500, Steve Fogoros wrote:
> On 27 June, 2008, I wrote to xml-editor@w3.org regarding XML
> Recommendation (V1.0, Editions 2-5) description of how leading white
> space is defined in well-formed documents. I contend that the
> recommendation allows leading white space; that is white space before
> the prolog.

  Where did you read that? What section, what paragraph?

> Yet, many implementations fail to consider an XML document
> with leading white space as well-formed, and claim productions [22],
> [23], and [1] completely describe their implementation [while also
> relying on the non-normative section F]. Section 2.4 clearly describes
> any white space outside the document entity as markup and is allowed.

  Production [1] defines what a document is, and leading white space before the XMLDecl is forbidden (an optional BOM is not really white space but an encoding indication). If you have no XMLDecl you can stack all the white space you want, as it is consumed by [27] Misc, but that's definitely *in* the prolog, not before.

  If you have spaces before your document, you will have to discard them outside of the parsing process. I totally agree with the expat devel on this. Any other interpretation clearly contradicts the spec.

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
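
The behaviour described above can be checked against an expat-based parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module (which is built on expat); the helper names is_well_formed and parse_lenient are hypothetical and used only for illustration. It shows white space before the XMLDecl being rejected, white space with no XMLDecl being accepted as Misc, and the leading white space being discarded outside the parser.

```python
import xml.etree.ElementTree as ET

# Productions referenced in the thread (XML 1.0):
#   [1]  document ::= prolog element Misc*
#   [22] prolog   ::= XMLDecl? Misc* (doctypedecl Misc*)?
#   [27] Misc     ::= Comment | PI | S


def is_well_formed(text):
    """Return True if the expat-based parser accepts text (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def parse_lenient(text):
    """Discard leading white space before parsing (hypothetical helper).

    This happens outside the parsing process; it is an application
    choice, not something the grammar itself permits.
    """
    return ET.fromstring(text.lstrip(" \t\r\n"))


with_decl = '<?xml version="1.0"?><doc/>'
space_then_decl = ' <?xml version="1.0"?><doc/>'
space_no_decl = ' \n<doc/>'

print(is_well_formed(with_decl))          # XMLDecl at the very start
print(is_well_formed(space_then_decl))    # white space before XMLDecl
print(is_well_formed(space_no_decl))      # no XMLDecl, white space matches Misc
print(parse_lenient(space_then_decl).tag)
```

Stripping is done here with str.lstrip on the four characters that production [3] S allows; an application reading raw bytes would need to do the equivalent before decoding.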
Received on Thursday, 17 September 2009 20:32:40 UTC