- From: Tex Texin <tex@i18nguy.com>
- Date: Sun, 25 May 2003 14:06:50 -0400
- To: François Yergeau <francois@yergeau.com>
- CC: GEO <public-i18n-geo@w3.org>
that's great, thanks Francois.
base64 is probably not practical for this situation (records too large).
but escaping is reasonable.
tex

François Yergeau wrote:
>
> Tex Texin wrote:
> > The XML spec allows for Unicode characters from space (#x20) and above and #x9 |
> > #xA | #xD. Various existing applications make use of "characters" below #x20 for
> > various reasons. Since they are not allowed in XML, what is the recommended
> > way to represent them?
>
> I don't think there's any official recommendation, but a useful way to
> deal with this situation is to base64-encode the whole string (e.g. the
> content of a specific element) that may contain controls. Of course,
> the receiving application must know to decode the encoding to recover
> the intended content.
>
> > Note the applications that use these chars want to efficiently write them out
> > and read them in, and want to exchange the data with other apps easily.
> > Escaping them as &#xhhhh; is not an option, nor is CDATA, as they both
> > reference the production rules for Char.
>
> XML 1.1 allows them as &#xhhhh;, except for U+0000 NULL. See
> http://www.w3.org/TR/2002/CR-xml11-20021015/#sec4.1.
>
> --
> François

--
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com
XenCraft                            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
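
The two options discussed in this thread can be sketched roughly as follows. This is only an illustration, not code from either poster: the function names (encode_base64, decode_base64, escape_controls) are hypothetical, and the escaped form assumes the receiver parses the document as XML 1.1, since XML 1.0 rejects these character references.

```python
import base64
from xml.sax.saxutils import escape

# Code points below U+0020 that XML 1.0 already permits literally.
XML10_ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}


def encode_base64(text: str) -> str:
    """Option 1: base64-encode the UTF-8 bytes of the whole string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_base64(payload: str) -> str:
    """The receiver must know the element content is base64 and undo it."""
    return base64.b64decode(payload).decode("utf-8")


def escape_controls(text: str) -> str:
    """Option 2: write forbidden C0 controls as &#xh; references (XML 1.1 only)."""
    if "\x00" in text:
        # U+0000 is excluded even in XML 1.1.
        raise ValueError("U+0000 cannot be represented in XML 1.0 or 1.1")
    out = []
    # Escape markup characters first so the references added below
    # are not themselves re-escaped.
    for ch in escape(text):
        cp = ord(ch)
        if cp < 0x20 and cp not in XML10_ALLOWED_CONTROLS:
            out.append(f"&#x{cp:X};")
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    record = "field1\x01field2\x1f<end>"  # made-up sample record
    print(encode_base64(record))
    print(escape_controls(record))
```

Base64 grows the payload by roughly one third regardless of content, which is the size concern raised above; escaping only costs a few extra characters per control character.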
Received on Sunday, 25 May 2003 14:08:22 UTC