Re: unicode

From: Terje Bless <link@pobox.com>
Date: 28 Sep 2001 11:08:08 +0200
To: david@prs-ltsn.leeds.ac.uk
Cc: www-validator@w3.org
Message-Id: <1001668089.1565.22.camel@tux>
On Wed, 2001-09-26 at 11:37, David J Mossley wrote:
> I have been trying to validate my webpages, in particular:
> 
> http://www.prs-ltsn.leeds.ac.uk/generic/qualenhance/index.html
> 
> However, the validator does not recognise the unicode number &#151; 
> for em dash.

&#151; is not the em dash; it is U+0097[0], a C1 control character.
The character you are looking for is U+2014/&#8212;/&mdash;/"EM DASH".
Windows-1252 does define byte 151 (0x97) as an EM DASH[1], but the
Document Character Set for HTML is Unicode (ISO 10646) regardless of
what the local platform uses or what the character encoding is (i.e.
the charset param from the HTTP Content-Type header or META element).
A numeric character reference always names a code point in the
Document Character Set, never a byte value in the page's encoding.

IOW, you want to replace occurrences of "&#151;" with "&mdash;"
or "&#8212;".



[0] - <URL:http://www.eki.ee/letter/chardata.cgi?dcode=151>.
[1] - <URL:http://www.microsoft.com/globaldev/reference/sbcs/1252.htm>.
Received on Friday, 28 September 2001 05:08:22 GMT
