* Sigurd Lerstad wrote:
>DOM is always 2 bytes, what happens in an utf-8 file when you encounter a
>character that uses 4 bytes (UCS-4), just ignore the two last bytes?

Characters > U+FFFF are encoded using surrogate characters in UTF-16.

Received on Friday, 25 July 2003 10:55:46 GMT
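
To make the surrogate mechanism concrete, here is a minimal sketch of how a code point above U+FFFF can be split into two 16-bit code units, so that no bytes are dropped. The function name `to_surrogate_pair` and the sample character (U+1D11E) are illustrative choices, not anything taken from the DOM specification.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point must be in the range U+10000..U+10FFFF")
    offset = code_point - 0x10000        # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return high, low


# U+1D11E occupies 4 bytes in UTF-8 (hypothetical sample input).
utf8_bytes = bytes([0xF0, 0x9D, 0x84, 0x9E])
char = utf8_bytes.decode("utf-8")

high, low = to_surrogate_pair(ord(char))
print(f"U+{ord(char):04X} -> {high:04X} {low:04X}")
```

A DOM implementation that stores strings as 16-bit units would therefore report such a character as two units, which matters when computing string lengths or offsets.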