Re: > in attribute values; decimal character reference

On Mon, Sep 3, 2012 at 9:56 PM, James Clark <jjc@jclark.com> wrote:

> The main reason that my drafts allowed > in attribute values was to
> increase the likelihood that the XML produced by non-MicroXML-aware XML
> tools would be well-formed MicroXML.
>
> Given an XML tree that only contains things that MicroXML allows, an XML
> serializer written in a natural way by a competent programmer is highly
> likely to always generate output that is well-formed MicroXML. The
> exceptions I can think of:
>
> a) > in attribute values; a serializer may very well have a different
> control path for serializing attribute values and character data in
> elements, because of the need to quote " and ' in attribute values; in this
> case, the attribute value control path is quite likely not to quote >
> (because it is unnecessary and would be extra code to do so)
>
> b) decimal character references; at the minimum a serializer needs to
> serialize a CR in an attribute value using a numerical character reference.
> It's as reasonable to use &#13; as &#xD; for this.
>
> c) XML declaration
>
> Any others?
>

A thin one is the likelihood of a serializer, perhaps written by a
programmer in the Middle East or Asia, generating UTF-16 or UTF-32 without
an XML declaration.

> > in attribute values also provides compatibility with Canonical XML.
>
> I am not sure I find any of these arguments compelling but I thought I
> should mention them.
>

These seem reasonable, and I wonder about the arguments for banishing ">".
If we do not support CDATA Sections there is nowhere ">" is unsafe, right?
So it would be purely an argument of symmetry with "<".
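
To make points (a) and (b) above concrete, here is a minimal sketch of the
two control paths a hand-written serializer might plausibly have. The
function names escape_text and escape_attr are hypothetical, not taken from
any particular library, and the choice of which characters to escape is an
assumption about what a "natural" implementation looks like.

```python
def escape_text(s):
    """Escape character data inside an element.

    '&' must be replaced first so later replacements are not re-escaped.
    '>' is escaped here so the sequence ']]>' can never appear in content.
    """
    return (s.replace("&", "&amp;")
             .replace("<", "&lt;")
             .replace(">", "&gt;"))


def escape_attr(s):
    """Escape a double-quoted attribute value.

    This path handles the quote delimiter and whitespace that would
    otherwise be lost to attribute-value normalization. It leaves '>'
    alone (point a) and emits decimal character references (point b).
    """
    return (s.replace("&", "&amp;")
             .replace("<", "&lt;")
             .replace('"', "&quot;")
             .replace("\t", "&#9;")
             .replace("\n", "&#10;")
             .replace("\r", "&#13;"))


def start_tag(name, attrs):
    """Serialize a start tag from a name and a dict of attributes."""
    parts = [name] + ['%s="%s"' % (k, escape_attr(v)) for k, v in attrs.items()]
    return "<" + " ".join(parts) + ">"
```

With this sketch, start_tag("a", {"title": "x > y\r"}) leaves the ">" as is
and writes the CR as &#13;, which is the kind of output a MicroXML parser
would be rejecting if either were disallowed.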


-- 
Uche Ogbuji                       http://uche.ogbuji.net
Founding Partner, Zepheira        http://zepheira.com
http://wearekin.org
http://www.thenervousbreakdown.com/author/uogbuji/
http://copia.ogbuji.net
http://www.linkedin.com/in/ucheogbuji
http://twitter.com/uogbuji

Received on Tuesday, 4 September 2012 14:51:16 UTC