Re: control codes

Tex Texin wrote:
> The XML spec allows for Unicode characters from space (20) and above and #x9 |
> #xA | #xD. Various existing applications make use of "characters" below 20 for
> various reasons. Since they are not allowed in XML, what is the recommended
> way to represent them?

I don't think there's any official recommendation, but a useful way to 
deal with this situation is to base64-encode the whole string (e.g. the 
content of a specific element) that may contain control characters.  Of 
course, the receiving application must know that the content is 
base64-encoded and decode it to recover the intended content.
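
A minimal sketch of that approach in Python, for illustration only. The 
element name "payload" and the encoding="base64" attribute are made-up 
conventions, not part of any standard; sender and receiver would have to 
agree on something like them out of band.

```python
import base64
import xml.etree.ElementTree as ET


def encode_element(tag, text):
    """Wrap text (which may contain control characters) in a base64-encoded element."""
    # The "encoding" attribute is a hypothetical marker that tells the receiver to decode.
    elem = ET.Element(tag, {"encoding": "base64"})
    elem.text = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def decode_element(xml_text):
    """Recover the original string from an element written by encode_element."""
    elem = ET.fromstring(xml_text)
    if elem.get("encoding") != "base64":
        # Not marked as base64: return the text content unchanged.
        return elem.text or ""
    return base64.b64decode(elem.text or "").decode("utf-8")


raw = "field1\x01field2\x1fend"          # contains U+0001 and U+001F
xml_text = encode_element("payload", raw)
assert decode_element(xml_text) == raw   # round trip preserves the controls
```

The cost is that the element content is no longer human-readable and 
grows by roughly a third.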

> Note the applications that use these chars want to efficiently write them out
> and read them in, and want to exchange the data with other apps easily.
> Escaping them as &#xhhhh; is not an option, nor is cdata, as they both
> reference the production rules for char.

XML 1.1 allows them as character references (&#xhhhh;), except for 
U+0000 NULL.  See 
http://www.w3.org/TR/2002/CR-xml11-20021015/#sec4.1.
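
As a rough sketch of what writing such data could look like, assuming 
the receiving parser actually accepts XML 1.1 (worth checking, since 
support is not universal). The function names escape_xml11 and 
to_xml11_document are hypothetical, and only the controls below #x20 
discussed here are handled:

```python
def escape_xml11(text):
    """Escape markup characters and C0 controls as XML 1.1 character references."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # U+0000 is excluded even in XML 1.1, escaped or not.
            raise ValueError("U+0000 cannot be represented in XML 1.1")
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif cp < 0x20 and ch not in "\t\n\r":
            # Controls other than tab, LF and CR are written as &#xhh; references.
            out.append(f"&#x{cp:X};")
        else:
            out.append(ch)
    return "".join(out)


def to_xml11_document(tag, text):
    """Build a small document whose declaration says version 1.1."""
    return (
        '<?xml version="1.1" encoding="UTF-8"?>\n'
        f"<{tag}>{escape_xml11(text)}</{tag}>"
    )


print(to_xml11_document("payload", "field1\x01field2"))
```

The version="1.1" declaration matters: the same character references in 
a document declared as version 1.0 are not well-formed.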

-- 
François

Received on Sunday, 25 May 2003 14:13:14 UTC