Re: 'Leading White Space' Topic

I very much want to agree, and I wish the recommendation were concise on
this. I'm reading XML 1.0, Fifth Edition, Section 2.4. Here is a
cut/paste of the first paragraph:
 
Text consists of intermingled character data and markup. [Definition:
Markup takes the form of start-tags, end-tags, empty-element tags,
entity references, character references, comments, CDATA section
delimiters, document type declarations, processing instructions, XML
declarations, text declarations, and any white space that is at the top
level of the document entity (that is, outside the document element and
not inside any other markup).]

It says that '... any white space that is at the top level of the
document entity (that is, outside the document element and ...' is
markup and is allowed.
 
Production [1] defines the document as document ::= prolog
element Misc*
 
I understand this to mean that 'any white space' outside the document
element includes any white space before the prolog. How could this be
interpreted any other way?
 
Steve Fogoros

>>> Daniel Veillard <veillard@redhat.com> 9/17/2009 2:43 PM >>>
On Thu, Sep 17, 2009 at 01:42:06PM -0500, Steve Fogoros wrote:
> On 27 June, 2008, I wrote to xml-editor@w3.org regarding XML
> Recommendation (V1.0, Editions 2-5) description of how leading white
> space is defined in well-formed documents. I contend that the
> recommendation allows leading white space; that is, white space before
> the prolog.

Where did you read that? What section, what paragraph?

> Yet, many implementations fail to consider an XML document
> with leading white space as well-formed, and claim productions [22],
> [23], and [1] completely describe their implementation [while also
> relying on the non-normative section F]. Section 2.4 clearly describes
> any white space outside the document entity as markup and is allowed.

  Production [1] defines what a document is, and leading white space
before the XMLDecl is forbidden (an optional BOM is not really white
space but an encoding indication). If you have no XMLDecl you can stack
all the white space you want, as it is consumed by [27] Misc, but that's
definitely *in* the prolog, not before.
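
  A minimal sketch of this reading, assuming Python's standard
xml.parsers.expat binding as the parser. The helper name is_well_formed
and the sample inputs are hypothetical, chosen only for illustration;
the productions in the comments are quoted from XML 1.0.

```python
import xml.parsers.expat

# Productions from XML 1.0 that the argument turns on:
#   [1]  document ::= prolog element Misc*
#   [22] prolog   ::= XMLDecl? Misc* (doctypedecl Misc*)?
#   [27] Misc     ::= Comment | PI | S


def is_well_formed(data: bytes) -> bool:
    """Return True if expat parses `data` without a well-formedness error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False


# Hypothetical sample inputs, for illustration only.
samples = {
    "XMLDecl at the very start": b'<?xml version="1.0"?><doc/>',
    "space before XMLDecl": b' <?xml version="1.0"?><doc/>',
    "no XMLDecl, leading white space": b" \n\t<doc/>",
    "UTF-8 BOM before XMLDecl": b'\xef\xbb\xbf<?xml version="1.0"?><doc/>',
}

for label, data in samples.items():
    print(f"{label}: {is_well_formed(data)}")
```

  Under these assumptions only the second sample should be rejected:
white space is accepted as Misc when there is no XMLDecl, but not in
front of one.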

  If you have spaces before your document, you will have to discard them
outside of the parsing process. I totally agree with the expat devel
on this. Any other interpretation clearly contradicts the spec.
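
  One way to discard the spaces outside the parsing process, sketched
with Python's standard xml.parsers.expat binding. The name
root_name_after_stripping is hypothetical, and the sketch assumes an
ASCII-compatible encoding such as UTF-8; it would not be safe for
UTF-16 input, where white space is not a single byte.

```python
import xml.parsers.expat

# The four characters matched by production [3] S, as single bytes.
XML_WHITE_SPACE = b" \t\r\n"


def root_name_after_stripping(data: bytes) -> str:
    """Strip leading white space, parse, and return the root element name."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    # The stripping happens before the parser ever sees the bytes.
    parser.Parse(data.lstrip(XML_WHITE_SPACE), True)
    return names[0]


print(root_name_after_stripping(b'  \n<?xml version="1.0"?><doc/>'))
```

  The parser itself stays strict; the leniency lives entirely in the
caller, which is the division of labour described above.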

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit 
http://xmlsoft.org/ 
daniel@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/ 
http://veillard.com/ | virtualization library  http://libvirt.org/ 





Received on Thursday, 17 September 2009 20:32:40 UTC