Re: Is there a tool which tells me if my XML is "fully normalized"?

From: Paul Grosso <paul@paulgrosso.name>
Date: Wed, 20 Feb 2013 11:12:21 -0600
Message-ID: <512503F5.5010902@paulgrosso.name>
To: "Costello, Roger L." <costello@mitre.org>
CC: "xml-editor@w3.org" <xml-editor@w3.org>

Roger,

The XML Core WG is discussing your email.

We are not aware of any "fully normalized checking" tool,
but if such a tool exists for "text files", it should
apply equally to XML documents.
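
For illustration only, here is a minimal sketch of what such a check might
look like in Python. It is not an existing tool and not a complete
implementation of "fully normalized". It assumes NFC is the normalization
form of interest, it treats only element text, tail text, and attribute
values as the constructs to inspect, and it approximates "begins with a
combining character" by a non-zero canonical combining class. The name
check_fully_normalized is hypothetical.

```python
import unicodedata
import xml.etree.ElementTree as ET


def starts_with_combining(text):
    # Approximation: a non-zero canonical combining class marks a combining character.
    return bool(text) and unicodedata.combining(text[0]) != 0


def check_fully_normalized(xml_source):
    """Return a list of problem descriptions; an empty list means this
    approximate check found nothing to report."""
    problems = []
    root = ET.fromstring(xml_source)
    for elem in root.iter():
        # Assumption: these three kinds of text are the constructs worth checking.
        pieces = [("content of <%s>" % elem.tag, elem.text),
                  ("text after </%s>" % elem.tag, elem.tail)]
        for name, value in elem.attrib.items():
            pieces.append(("attribute %s of <%s>" % (name, elem.tag), value))
        for label, text in pieces:
            if not text:
                continue
            if unicodedata.normalize("NFC", text) != text:
                problems.append(label + " is not in NFC")
            if starts_with_combining(text):
                problems.append(label + " begins with a combining character")
    return problems


# Example input taken from the quoted message: content that is only U+0338.
for problem in check_fully_normalized("<comment>&#x338;</comment>"):
    print(problem)
```

Because the check runs on parsed text rather than on the raw bytes, it looks
at each piece of content separately from the markup around it.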

Meanwhile the rest of your posting has raised some
interesting issues that we are investigating. Given
our meeting schedule, it may take several weeks before
we can give you much more of an answer.

paul

Paul Grosso for the XML Core WG


On 2013-02-16 16:56, Costello, Roger L. wrote:
> Hi Folks,
>
> 1. Is there a tool which evaluates an XML document and returns an indication of whether it is fully normalized or not?
>
> 2. This element:
>
> 	<comment>&#x338;</comment>
>
> is not fully normalized, right? (Since the content of the <comment> element begins with a combining character and "content" is defined to be a "relevant construct.") Note: hex 338 is the combining solidus overlay character.
>
> 3. Section 2.13 of the XML 1.1 specification says:
>
> 	XML applications that create XML 1.1 output from either XML 1.1 or 
> 	XML 1.0 input SHOULD ensure that the output is fully normalized
>
> What should an XML application output, given this non-fully-normalized input:
>
> 	<comment>&#x0338;</comment>
>
> How does an XML application "ensure that the output is fully normalized"?
>
> 4. If the combining solidus overlay character follows a greater-than character in element content:
>
> 	<comment> &gt;&#x0338; </comment>
>
> then normalizing XML applications will combine them to create the not-greater-than character:
>
> 	<comment> ≯ </comment>
>
> However, if the combining solidus overlay character follows a greater-than character that is part of a start-tag:
>
> 	<comment>&#x0338;</comment>
>
> then normalizing XML applications do not combine them:
>
>  	<comment>/</comment>
>
> There must be some W3C document which says, "The long solidus combining character shall not combine with the '>' in a start tag but it shall combine with the '>' if it is located elsewhere." 
>
> I have searched the W3C documents looking for a statement of this "rule" and have been unsuccessful in finding it. I am hoping that you will point me to the W3C document which states this rule?
>
> /Roger
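
The two cases in point 4 can be reproduced with a short Python sketch. It
does not answer which W3C document states the rule. It only shows, assuming
NFC and Python's standard unicodedata and ElementTree modules, what happens
when normalization is applied to the raw serialized text versus to the
parsed content. The helper names are hypothetical.

```python
import unicodedata
import xml.etree.ElementTree as ET

COMBINING_SOLIDUS = "\u0338"  # COMBINING LONG SOLIDUS OVERLAY
NOT_GREATER_THAN = "\u226f"   # NOT GREATER-THAN


def nfc(text):
    return unicodedata.normalize("NFC", text)


def is_well_formed(xml_source):
    # True if the parser accepts the string as a well-formed document.
    try:
        ET.fromstring(xml_source)
        return True
    except ET.ParseError:
        return False


def normalize_content(xml_source):
    # Parse first, then apply NFC to each text node separately,
    # so markup characters never take part in composition.
    root = ET.fromstring(xml_source)
    for elem in root.iter():
        if elem.text:
            elem.text = nfc(elem.text)
        if elem.tail:
            elem.tail = nfc(elem.tail)
    return root


# Case 1: ">" and U+0338 are both character data, so NFC composes them.
in_content = nfc(">" + COMBINING_SOLIDUS)

# Case 2: NFC applied blindly to the serialized document composes the
# start tag's ">" with U+0338, and the result no longer parses.
naive = nfc("<comment>" + COMBINING_SOLIDUS + "</comment>")

# Case 3: normalizing the parsed content leaves the lone U+0338 alone.
parsed = normalize_content("<comment>&#x338;</comment>")

print(in_content == NOT_GREATER_THAN)
print(is_well_formed(naive))
print(parsed.text == COMBINING_SOLIDUS)
```

In this sketch the different outcomes come from where normalization is
applied (after parsing, per text node), not from any special treatment of
the ">" character itself.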