Fwd: Is there a tool which tells me if my XML is "fully normalized"?

I'm not sure if we're supposed to be the experts here
or I18N or someone else, but I thought I'd forward
this to our list to see what others have to say.

Offhand, my thoughts are:

1. Don't know, but we just do theory, not tools.

2. True, that doesn't look normalized.

3. We don't discuss how applications do things, we just
tell them what they should do. (I realize that's a
bit facetious, but I'm not sure what the answer is here.)

4. First part, what applications do with the character content
in an XML document is not something the XML spec discusses.
Second part, you are confusing markup syntax with the
information set of the document, so your question/comment
here makes no sense.
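
For what it's worth on points 1 and 2, a rough check is easy to
sketch. The Python below is only an illustration, not a conformance
tool: the function names are made up for this example, it looks only
at the text directly inside each element (not attribute values, names,
comments or text after child elements), and it approximates "begins
with a composing character" by testing for a non-zero canonical
combining class, which is narrower than the definition the specs use.

```python
import unicodedata
import xml.etree.ElementTree as ET


def starts_with_combining(text):
    """True if text begins with a character whose canonical combining class is non-zero."""
    return bool(text) and unicodedata.combining(text[0]) != 0


def check_element_content(xml_source):
    """Return (tag, problem) pairs for element text that fails either check.

    Assumption: only elem.text is inspected; a real checker would also
    need to cover the other relevant constructs.
    """
    problems = []
    root = ET.fromstring(xml_source)
    for elem in root.iter():
        text = elem.text or ""
        if starts_with_combining(text):
            problems.append((elem.tag, "content begins with a combining character"))
        if unicodedata.normalize("NFC", text) != text:
            problems.append((elem.tag, "content is not in NFC"))
    return problems
```

Run against the element from question 2, this reports the leading
combining character and nothing else, since U+0338 on its own is
already in NFC.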

paul


-------- Original Message --------
Subject: 	Is there a tool which tells me if my XML is "fully normalized"?
Resent-Date: 	Sat, 16 Feb 2013 22:57:06 +0000
Resent-From: 	xml-editor@w3.org
Date: 	Sat, 16 Feb 2013 22:56:36 +0000
From: 	Costello, Roger L. <costello@mitre.org>
To: 	xml-editor@w3.org <xml-editor@w3.org>



Hi Folks,

1. Is there a tool which evaluates an XML document and returns an indication of whether it is fully normalized or not?

2. This element:

	<comment>&#x338;</comment>

is not fully normalized, right? (Since the content of the <comment> element begins with a combining character and "content" is defined to be a "relevant construct.") Note: hex 338 is the combining solidus overlay character.

3. Section 2.13 of the XML 1.1 specification says:

	XML applications that create XML 1.1 output from either XML 1.1 or 
	XML 1.0 input SHOULD ensure that the output is fully normalized

What should an XML application output, given this non-fully-normalized input:

	<comment>&#x0338;</comment>

How does an XML application "ensure that the output is fully normalized"?

4. If the combining solidus overlay character follows a greater-than character in element content:

	<comment> &gt;&#x0338; </comment>

then normalizing XML applications will combine them to create the not-greater-than character:

	<comment> ≯ </comment>

However, if the combining solidus overlay character follows a greater-than character that is part of a start-tag:

	<comment>&#x0338;</comment>

then normalizing XML applications do not combine them:

 	<comment>/</comment>

There must be some W3C document which says, "The long solidus combining character shall not combine with the '>' in a start tag but it shall combine with the '>' if it is located elsewhere." 

I have searched the W3C documents looking for a statement of this "rule" and have been unsuccessful in finding it. I am hoping that you will point me to the W3C document which states this rule?
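
The two cases in question 4 can be reproduced with a small sketch.
It assumes Python's standard xml.etree.ElementTree parser and
unicodedata module, and it drops the surrounding spaces from the
examples. It is only an illustration of what a parser hands to a
normalizer, not a statement of any W3C rule.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Case A: "&gt;" followed by U+0338 inside element content.
# After parsing, the character data is the two characters ">" and U+0338.
a = ET.fromstring("<comment>&gt;&#x0338;</comment>").text

# Case B: U+0338 directly after the start tag.
# The ">" that closes the start tag is markup, so the character data
# is U+0338 alone.
b = ET.fromstring("<comment>&#x0338;</comment>").text

# NFC composes ">" + U+0338 into U+226F (NOT GREATER-THAN),
# but has nothing to compose with in case B.
composed_a = unicodedata.normalize("NFC", a)
composed_b = unicodedata.normalize("NFC", b)
```

In this sketch the normalizer never sees the ">" of the start tag,
because the parser has already separated markup from character data
before normalization is applied to the content.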

/Roger

Received on Monday, 18 February 2013 14:51:28 UTC