
XML Recommendation Inconsistencies Regarding Leading White Space in Well-Formed Documents

From: Steve Fogoros <sfogoros@hsc.unt.edu>
Date: Fri, 27 Jun 2008 16:31:04 -0500
Message-Id: <486515C8.C2A1.0037.0@hsc.unt.edu>
To: <xml-editor@w3.org>
There appears to be some difficulty in interpreting whether the
Recommendation prohibits leading white space before the XML declaration
or allows it in a well-formed document. A search of the Internet
indicates that leading white space is a frequent error at the
application level. In discussions on the expat mailing list, it is
claimed that expat, for example, follows the XML Recommendation as
specified in not allowing leading white space. Typically, productions
[22] prolog and [23] XMLDecl are cited as the formal specification that
prohibits leading white space.
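The parser behavior described above can be observed with a minimal
sketch, assuming Python's standard-library xml.etree.ElementTree module
(which is backed by expat). The helper name parses and the sample
documents are illustrative only and do not come from the expat
discussion.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return (True, None) if text parses, else (False, error message)."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as exc:
        return False, str(exc)


decl = "<?xml version='1.0'?>"

# Declaration at the very start of the document entity.
print(parses(decl + "<doc/>"))

# A single space before the declaration: expat rejects this input.
print(parses(" " + decl + "<doc/>"))

# Leading white space with no declaration at all: expat accepts this input.
print(parses(" <doc/>"))
```

This shows only what one parser does. It does not settle which reading
of the Recommendation is correct.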
On reviewing the latest XML Recommendation (Fifth Edition), I found
this not to be true. Section 2.4 (as far back as the Second Edition) is
very clear that any white space at the top level of the document entity
can exist in a well-formed XML document. I found other sections that
support this. If this email leads to further discussion, I will be
happy to enumerate them in detail.
I did find one reference, in Section F, Autodetection of Character
Encodings (Non-Normative), which states '... the XML encoding declaration
is restricted in position and content in order ...', but no such
restriction appears anywhere else in the Recommendation except in
Section F.1, Detection Without External Encoding Information, which
states, 'Because each XML entity not accompanied by external encoding
information and not in UTF-8 or UTF-16 encoding must begin with an XML
encoding declaration, in which the first characters must be '<?xml',
....'. As this is a non-normative exception case, I do not interpret it
as a restriction on the position and content of the normative case.
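To illustrate why Section F.1 depends on '<?xml' being the first
characters, here is a simplified sketch of signature-based detection. It
covers only a subset of the byte patterns tabulated in Section F.1 (the
UCS-4 cases are omitted), and the function name sniff_encoding and its
return labels are hypothetical, not taken from the Recommendation.

```python
def sniff_encoding(data: bytes) -> str:
    """Guess an encoding family from the first bytes of an XML entity."""
    signatures = [
        # Byte order marks.
        (b"\xef\xbb\xbf", "UTF-8 (BOM)"),
        (b"\xfe\xff", "UTF-16BE (BOM)"),
        (b"\xff\xfe", "UTF-16LE (BOM)"),
        # No byte order mark: the entity must start with '<?xm'.
        (b"\x00<\x00?", "UTF-16BE"),
        (b"<\x00?\x00", "UTF-16LE"),
        (b"<?xm", "ASCII-compatible"),
        (b"\x4c\x6f\xa7\x94", "EBCDIC"),
    ]
    for prefix, label in signatures:
        if data.startswith(prefix):
            return label
    return "no signature matched"


doc = "<?xml version='1.0'?><doc/>"
print(sniff_encoding(doc.encode("ascii")))
print(sniff_encoding(doc.encode("utf-16-be")))
# One leading space shifts every signature, so nothing matches.
print(sniff_encoding((" " + doc).encode("ascii")))
```

Under this scheme a single leading space defeats detection, which
appears to be the practical reason for the positional restriction that
Section F mentions.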
Depending on whether the Recommendation intends leading white space to
be prohibited or well-formed, I would like to contribute suggestions
that make this clearer.
Steve Fogoros
Manager of Academic Systems and Programming
Academic Information Services
University of North Texas Health Science Center

Received on Saturday, 28 June 2008 16:23:18 UTC
