From: Terje Bless <link@pobox.com>
To: david@prs-ltsn.leeds.ac.uk
Cc: www-validator@w3.org
Date: 28 Sep 2001 11:08:08 +0200
Message-Id: <1001668089.1565.22.camel@tux>
Subject: Re: unicode
On Wed, 2001-09-26 at 11:37, David J Mossley wrote:
> I have been trying to validate my webpages, in particular:
>
> http://www.prs-ltsn.leeds.ac.uk/generic/qualenhance/index.html
>
> However, the validator does not recognise the unicode number &#151;
> for em dash.
&#151; is not the em dash; it is U+0097[0], a C1 control character. The
character you are looking for is U+2014 "EM DASH", written &#8212; or
&mdash;. Windows-1252 does define character number 151 (0x97) as an
EM DASH[1], but the Document Character Set for HTML is Unicode
regardless of what the local platform uses or what the character
encoding is (i.e. the charset parameter from the HTTP Content-Type
header or META element).
IOW, you want to replace occurrences of "&#151;" with "&#8212;"
or "&mdash;".
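
If there are many such references across the site, the replacement can
be scripted. The following is only a rough sketch, assuming Python is
available and that every decimal reference in the range 128-159 was
meant as a Windows-1252 character; the function name fix_cp1252_refs is
made up for illustration and is not part of the validator.

```python
import re

# Matches decimal numeric character references such as &#151;
_NCR = re.compile(r"&#(\d+);")


def fix_cp1252_refs(html: str) -> str:
    """Rewrite &#128;..&#159; to point at the intended Unicode code points.

    Assumption: references in this range were written with Windows-1252
    byte values in mind. References outside the range are left untouched.
    """
    def repl(match):
        code = int(match.group(1))
        if not 128 <= code <= 159:
            return match.group(0)
        try:
            # Interpret the number as a Windows-1252 byte value.
            char = bytes([code]).decode("cp1252")
        except UnicodeDecodeError:
            # A few byte values are unassigned in Windows-1252; keep as-is.
            return match.group(0)
        return "&#%d;" % ord(char)

    return _NCR.sub(repl, html)


print(fix_cp1252_refs("1999&#151;2001"))  # prints 1999&#8212;2001
```

It only handles decimal references; hexadecimal ones (&#x97;) and raw
0x97 bytes in the file would need separate treatment.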
[0] - <URL:http://www.eki.ee/letter/chardata.cgi?dcode=151>.
[1] - <URL:http://www.microsoft.com/globaldev/reference/sbcs/1252.htm>.