- From: Tex Texin <tex@i18nguy.com>
- Date: Sun, 25 May 2003 14:06:50 -0400
- To: François Yergeau <francois@yergeau.com>
- CC: GEO <public-i18n-geo@w3.org>
that's great, thanks Francois.
base64 is probably not practical for this situation (records too large).
but escaping is reasonable.
tex
François Yergeau wrote:
>
> Tex Texin a écrit:
> > The XML spec allows for Unicode characters from space (20) and above and #x9 |
> > #xA | #xD. Various existing applications make use of "characters" below 20 for
> > various reasons. Since they are not allowed in XML, what is the recommended
> > way to represent them?
>
> I don't think there's any official recommendation, but a useful way to
> deal with this situation is to base64-encode the whole string (e.g. the
> content of a specific element) that may contain controls. Of course,
> the receiving application must know to decode the encoding to recover
> the intended content.
>
> > Note the applications that use these chars want to efficiently write them out
> > and read them in, and want to exchange the data with other apps easily.
> > Escaping them as &#xhhhh; is not an option, nor is cdata, as they both
> > reference the production rules for char.
>
> XML 1.1 allows them as &#xhhhh;, except for U+0000 NULL. See
> http://www.w3.org/TR/2002/CR-xml11-20021015/#sec4.1.
>
> --
> François
--
-------------------------------------------------------------
Tex Texin cell: +1 781 789 1898 mailto:Tex@XenCraft.com
Xen Master http://www.i18nGuy.com
XenCraft http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
Received on Sunday, 25 May 2003 14:08:22 UTC