Re: WG Last Call on C14N (and attribute value escaping)

On Thu, 6 Jul 2000, Joseph M. Reagle Jr. wrote:

> Namespace and Attribute Nodes- a space, the node's QName, an equals sign, an
> open double quote, the modified string value, and a close double quote. The
> string value of the node is modified by replacing all ampersands (&) with
> &amp;, /+ all open angle brackets (<) with &lt;, +/  all double quote
> characters with &quot;, and the whitespace characters #x9, #xA, and #xD,
> with character references. The character references are written in uppercase
> hexadecimal with no leading zeroes (for example, #xD is represented by the
> character reference &#xD;).

The &lt; replacement is absolutely necessary: a literal < may not appear in
an attribute value, and I think its omission is just an oversight.

> However, I'm unclear about not also including /+ all closing angle brackets
> (>) are replaced by &gt; +/ and about normalizing single-quote characters
> (') with "&apos;"... 

There is no need for either of these.  The gt entity is for use in PCDATA,
so that the forbidden "]]>" sequence can be written "]]&gt;", and the
apos entity is for use in attribute values delimited with apostrophes,
which Canonical XML does not permit.  So these substitutions should not
be made; they waste space and time.
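
As a rough sketch of the rule with the &lt; replacement included, and with
> and ' deliberately left alone, something like the following Python would
do. The function names escape_attr_value and serialize_attr are made up
for illustration; they come from neither the draft nor any library.

```python
def escape_attr_value(value: str) -> str:
    """Escape an attribute's string value for double-quoted output."""
    # The ampersand must be handled first; otherwise the ampersands
    # introduced by the later replacements would be escaped again.
    replacements = [
        ("&", "&amp;"),
        ("<", "&lt;"),
        ('"', "&quot;"),
        ("\t", "&#x9;"),   # uppercase hex, no leading zeroes
        ("\n", "&#xA;"),
        ("\r", "&#xD;"),
    ]
    for raw, escaped in replacements:
        value = value.replace(raw, escaped)
    # '>' and "'" are intentionally not touched: the attribute is always
    # delimited by double quotes, so neither needs escaping.
    return value


def serialize_attr(qname: str, value: str) -> str:
    """Space, QName, equals sign, then the escaped value in double quotes."""
    return f' {qname}="{escape_attr_value(value)}"'
```

Calling serialize_attr("title", 'a<b & "c" > \'d\'') gives
 title="a&lt;b &amp; &quot;c&quot; > 'd'" (with the leading space), so the
> and the apostrophes pass through unchanged.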

-- 
John Cowan                                   cowan@ccil.org
	"You need a change: try Canada"  "You need a change: try China"
		--fortune cookies opened by a couple that I know

Received on Thursday, 6 July 2000 16:50:53 UTC