- From: Paul Grosso <paul@paulgrosso.name>
- Date: Mon, 29 Dec 2014 10:42:19 -0600
- To: public-xml-core-wg@w3.org
In trying to research this, I added some references and comments below.

On 2014-12-23 19:41, John Cowan wrote:
> I received this email as an editor of XML 1.1. However, I think the
> response should come from the WG.
>
> ----- Forwarded message from Alexey Neyman <stilor@att.net> -----
>
> Date: Fri, 19 Dec 2014 23:49:33 -0800
> From: Alexey Neyman <stilor@att.net>
> To: tbray@textuality.com, jeanpa@microsoft.com, cmsmcq@w3.org,
>     elm@east.sun.com, cowan@ccil.org
> Subject: Question about normalization checking in XML 1.1
>
> Hi,
>
> I am trying to understand which portions of a document conforming to
> XML 1.1 are expected to be normalized. In the document entity, the

At http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-document
is the following production defining "document":

[1] document ::= ( prolog element Misc* ) - ( Char* RestrictedChar Char* )

> specification [1] prescribes that the text matching the following
> productions should be normalized: CData, CharData, content, Name,
> Nmtoken.

Normalization checking--including the above statement--is discussed at
http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-normalization-checking

The commentor's mention of "parent elements" below is pointing out
that--while most elements' attribute specifications' AttValue
(production [10]) end up being parsed as a result of parsing "content"
(via content [43] -> element [39] -> STag [40] -> Attribute [41] ->
AttValue [10])--the document element's attribute specifications cannot
be reached via "content" or any of the other productions mentioned in
this Normalization checking statement.

>
> This seems somewhat inconsistent: as far as I understand, it would
> mean that the attribute values for non-root elements should be
> normalized (because they match the 'content' production for their
> respective parent element), but the attributes for the root element
> may not be normalized (because the root element does not have a parent
> element).

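The reachability argument can be spelled out as a small check. This is
only an illustrative sketch: the derivation paths are written out by
hand from the XML 1.1 productions, the names CHECKED, PATHS, RESULTS and
covered() are made up for this example, and "covered" here assumes that
text counts as subject to normalization checking whenever any production
on its derivation path is one of the five listed ones. The two PI paths
anticipate the second half of the original question.

```python
# Productions named in the normalization checking statement.
CHECKED = {"CData", "CharData", "content", "Name", "Nmtoken"}


def covered(path):
    """Return True if any production on the derivation path is in CHECKED."""
    return any(production in CHECKED for production in path)


# Hand-written derivation paths (an assumption of this sketch, not parser output).
PATHS = {
    "AttValue of root element": (
        "document", "element", "STag", "Attribute", "AttValue"),
    "AttValue of nested element": (
        "document", "element", "content", "element",
        "STag", "Attribute", "AttValue"),
    "PI body inside root element": (
        "document", "element", "content", "PI"),
    "PI body at top level": (
        "document", "Misc", "PI"),
    "PI target at top level": (
        "document", "Misc", "PI", "PITarget", "Name"),
}

RESULTS = {label: covered(path) for label, path in PATHS.items()}

for label, is_covered in RESULTS.items():
    print(f"{label}: {'covered' if is_covered else 'not covered'}")
```

Under that reading, only the root element's attribute values and the
body of a top-level PI fall outside the listed productions.
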
At http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-content
is the following production for "content":

[43] content ::= CharData? ((element | Reference | CDSect | PI | Comment) CharData?)*

>
> Likewise, it seems to require that the whole content of the PIs inside
> the root element is to be normalized (both the target and the
> pseudo-attributes - because, again, the whole PI is a part of the
> 'content' production of the enclosing element) - but for PIs at the
> top-level (i.e. those that are part of the 'Misc' production, or those
> inside a document type declaration), only the PI target is expected to
> be normalized (since the target matches the 'Name' production and the
> rest of the PI content is expressed via 'Char' production.

At http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-Misc
is the following production for "Misc":

[27] Misc ::= Comment | PI | S

Again, while PI's within any element get reached as a result of the
"content" production, PI's within Misc are not reached via any of the
productions mentioned in the Normalization checking statement.

>
> Am I missing anything? If not, could you please explain the rationale
> for this apparent inconsistency?

Unless I am also missing something, it does look like an inconsistency
to me, and I suspect the inconsistency is accidental.

What do others think (1) is the case and (2) should be the case?

paul

>
> Thanks,
> Alexey.
>
> [1] http://www.w3.org/TR/2006/REC-xml11-20060816/
>
> ----- End forwarded message -----
>
Received on Monday, 29 December 2014 16:42:52 UTC