
Re: utf-8

From: Sigurd Lerstad <sigler@bredband.no>
Date: Sat, 26 Jul 2003 22:20:13 +0200
Message-ID: <006201c353b3$524c7660$6e1273d5@mmstudio>
To: "Bjoern Hoehrmann" <derhoermi@gmx.net>
Cc: <www-svg@w3.org>

I don't understand what you mean.

In an XML file that declares utf-8 in its XML declaration, there could be
4-byte characters later in the file. How should those be treated to convert them to

Is there some spec which says what to do?

Am I missing something?


Sigurd Lerstad

> * Sigurd Lerstad wrote:
> >DOM is always 2 bytes, what happens in a UTF-8 file when you encounter a
> >character that uses 4 bytes (UCS-4), just ignore the last two bytes?
> Characters > U+FFFF are encoded using surrogate characters in UTF-16.
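
To make the surrogate mechanism in the quoted reply concrete, here is a minimal sketch of how a code point above U+FFFF maps to two 16-bit units in UTF-16. The function name `to_utf16_code_units` and the sample character U+1D11E are illustrative assumptions, not something taken from the DOM or SVG specifications; a real parser would normally rely on its platform's decoder rather than hand-rolled arithmetic.

```python
from typing import List


def to_utf16_code_units(code_point: int) -> List[int]:
    """Return the UTF-16 code units for one Unicode code point.

    Hypothetical helper for illustration only.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not characters on their own")
    if code_point <= 0xFFFF:
        # Fits in a single 16-bit unit; no surrogates needed.
        return [code_point]
    # Above U+FFFF: subtract 0x10000, then split the remaining 20 bits
    # into a high (lead) and a low (trail) surrogate, 10 bits each.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


# A 4-byte UTF-8 sequence decodes to a single code point first ...
utf8_bytes = b"\xf0\x9d\x84\x9e"          # U+1D11E in UTF-8
code_point = ord(utf8_bytes.decode("utf-8"))

# ... and that code point is then stored as two 16-bit units.
units = to_utf16_code_units(code_point)
print([hex(u) for u in units])
```

In other words, no bytes are dropped: the 4-byte UTF-8 sequence becomes two 16-bit units, so a string that counts 16-bit units reports a length of 2 for this one character.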
Received on Saturday, 26 July 2003 16:18:21 UTC
