Question about normalization checking in XML 1.1

[Forwarding to the xml-editor list. paul]


----- Forwarded message from Alexey Neyman <stilor@att.net> -----

Date: Fri, 19 Dec 2014 23:49:33 -0800
From: Alexey Neyman <stilor@att.net>
To: tbray@textuality.com, jeanpa@microsoft.com, cmsmcq@w3.org,
	elm@east.sun.com, cowan@ccil.org
Subject: Question about normalization checking in XML 1.1

Hi,

I am trying to understand which portions of a document conforming to
XML 1.1 are expected to be normalized. In the document entity, the
specification [1] prescribes that the text matching the following
productions should be normalized: CData, CharData, content, Name,
Nmtoken.
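
To make the term concrete, here is a minimal Python sketch of the kind of
check in question. It assumes that "normalized" can be approximated as
Unicode Normalization Form C, and it ignores the additional conditions the
specification attaches to fully normalized text. The helper name is_nfc is
hypothetical and does not come from the specification.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is unchanged by NFC normalization.

    Assumption: NFC is used as a stand-in for the specification's
    notion of normalized text; this is an approximation only.
    """
    return unicodedata.normalize("NFC", text) == text


# Precomposed "e with acute" (U+00E9) is already in NFC.
print(is_nfc("\u00e9"))

# "e" followed by a combining acute accent (U+0301) is not in NFC,
# because NFC would compose the pair into U+00E9.
print(is_nfc("e\u0301"))
```

A checker would apply such a test to each stretch of text matching one of
the listed productions, which is why it matters exactly which productions
are on the list.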

This seems somewhat inconsistent: as far as I understand, it would
mean that the attribute values for non-root elements should be
normalized (because they match the 'content' production for their
respective parent element), but the attributes for the root element
need not be normalized (because the root element does not have a parent
element).

Likewise, it seems to require that the whole content of the PIs inside
the root element is to be normalized (both the target and the
pseudo-attributes - because, again, the whole PI is a part of the
'content' production of the enclosing element) - but for PIs at the
top-level (i.e. those that are part of the 'Misc' production, or those
inside a document type declaration), only the PI target is expected to
be normalized (since the target matches the 'Name' production and the
rest of the PI content is expressed via the 'Char' production).

Am I missing anything? If not, could you please explain the rationale
for this apparent inconsistency?

Thanks,
Alexey.

[1] http://www.w3.org/TR/2006/REC-xml11-20060816/

----- End forwarded message -----

Received on Monday, 29 December 2014 16:06:24 UTC