
What about characters outside of those expressible in XML

From: Tony Graham <tgraham@mentea.net>
Date: Tue, 21 Feb 2012 17:04:31 -0000 (GMT)
Message-ID: <11643.83.147.131.233.1329843871.squirrel@mail3.webfaction.com>
To: public-xml-er@w3.org
From http://lists.w3.org/Archives/Public/public-xml-er/2012Feb/0051.html,
David Carlisle wrote:
>On 20/02/2012 14:59, Anne van Kesteren wrote:
...
>> HTML replaces
>> U+0000 in the input with U+FFFD. Most other code points are preserved as
>> is if I remember correctly.
>
>which is compatible with XML 1.1 but not 1.0, I suppose at some point

U+FFFD in a name could still cause problems downstream.

Whether U+FFFD is acceptable depends on the context and on which XML 1.0
edition is supported: U+FFFD has been allowed in text as far back as the
original XML 1.0 Recommendation [1]; it is allowed in names in XML 1.0
5th ed. [2], but not in earlier editions [3].
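
As a rough illustration of the distinction, the sketch below checks a code
point against the Char production from [1] and the NameStartChar production
from [2]. The function and table names are made up for this example, and the
ranges are transcribed by hand from those two productions, so treat them as
an assumption to verify against the specs. The pre-5th-edition name rules [3]
depend on the much longer Letter tables and are not reproduced here.

```python
# Hypothetical helper names; ranges transcribed from the XML 1.0 productions.

# Char production, XML 1.0 [1]: what may appear in text.
XML10_CHAR = [
    (0x9, 0xA), (0xD, 0xD), (0x20, 0xD7FF),
    (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
]

# NameStartChar production, XML 1.0 5th ed. [2]: what may start a name.
NAME_START_5E = [
    (0x3A, 0x3A), (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A),
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]


def in_ranges(cp, ranges):
    """Return True if code point cp falls in any inclusive (lo, hi) range."""
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_xml10_char(cp):
    """True if cp matches the XML 1.0 Char production."""
    return in_ranges(cp, XML10_CHAR)


def is_name_start_5e(cp):
    """True if cp matches the XML 1.0 5th ed. NameStartChar production."""
    return in_ranges(cp, NAME_START_5E)


if __name__ == "__main__":
    for cp in (0x0000, 0xFFFD, 0xFFFE):
        print(f"U+{cp:04X}: text={is_xml10_char(cp)} "
              f"name-start(5e)={is_name_start_5e(cp)}")
```

With these tables, U+FFFD passes both checks, while U+0000 fails the Char
check, which is why a replacement is needed for it in the first place.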

>(not today probably:-) we need to take a decision on what compatibility
>with xml means.

Yes.

Regards,


Tony Graham                                   tgraham@mentea.net
Consultant                                 http://www.mentea.net
Mentea       13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
 --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
    XML, XSL-FO and XSLT consulting, training and programming

[1] http://www.w3.org/TR/1998/REC-xml-19980210#charsets
[2] http://www.w3.org/TR/2008/REC-xml-20081126/#NT-NameStartChar
[3] http://www.w3.org/TR/2006/REC-xml-20060816/#NT-NameChar
Received on Tuesday, 21 February 2012 17:04:58 GMT
