Re: non sgml characters

From: Andreas Prilop <Prilop2007@trashmail.net>
Date: Wed, 8 Aug 2007 17:59:05 +0200 (MEST)
To: www-validator@w3.org
Message-ID: <Pine.GSO.4.63.0708081749170.2143@s5b004.rrzn.uni-hannover.de>

On Tue, 7 Aug 2007, Jukka K. Korpela wrote:

> When you have octet 146 in a document declared to be iso-8859-1 encoded,
> it is interpreted as denoting a control code in the C1 Controls area. The
> meanings of those control codes have not been defined in the ISO 8859-1
> standard, but they correspond to the C1 Controls area of Unicode, so that
> e.g. 146 decimal (92 hexadecimal) maps to the Unicode character U+0092.

From a practical point of view, the only control character from
this range that could cause trouble is U+0085 (next line).
 http://www.w3.org/TR/newline
 http://www.w3.org/TR/unicode-xml/#White

Character x85 in Windows-1252 is U+2026 "horizontal ellipsis".
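
To make the difference concrete, here is a small Python sketch (not
part of the original exchange) that decodes the two octets discussed
above under both encodings. The variable names are only illustrative,
and it assumes Python's built-in "iso-8859-1" and "cp1252" codecs. The
last lines show one way U+0085 can cause trouble: Python's
str.splitlines() treats it as a line boundary.

```python
import unicodedata

# The two octets discussed in this thread: x85 and x92 (146 decimal).
octets = bytes([0x85, 0x92])

# Decoded as ISO-8859-1, each octet maps to the C1 control character
# with the same code point.
as_latin1 = octets.decode("iso-8859-1")

# Decoded as Windows-1252, the same octets are printable punctuation.
as_cp1252 = octets.decode("cp1252")

for label, text in (("iso-8859-1", as_latin1), ("windows-1252", as_cp1252)):
    for ch in text:
        # Control characters have no Unicode name, so supply a fallback.
        name = unicodedata.name(ch, "<control>")
        print(f"{label}: U+{ord(ch):04X} {name}")

# U+0085 (next line) counts as a line break for str.splitlines().
pieces = "first\u0085second".splitlines()
print(pieces)
```

If a page labelled iso-8859-1 really contains Windows-1252 text, the
second decoding is the one the author most likely intended.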
Received on Wednesday, 8 August 2007 15:59:21 GMT
