Re: greek char in UTF-8 (part 2) from Chris Lilley on 2000-05-09 (www-international@w3.org from April to June 2000)

From: Chris Lilley <chris@w3.org>
Date: Tue, 09 May 2000 17:49:16 +0200
To: "Albert Lunde by way of Martin J. Duerst <duerst@w3.org>" <Albert-Lunde@northwestern.edu>
CC: www-international@w3.org
Message-ID: <3918337C.160428FB@w3.org>

"Albert Lunde by way of Martin J. Duerst " wrote:

> Numeric character references are NOT (correctly) interpreted according
> to the character encoding (a.k.a. charset) used to store or transmit
> the document, but rather according to the SGML "Document Character
> Set" which is always Unicode, or I think more precisely, ISO-10646.

The two have equivalent character repertoires and code points.
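
To make the quoted point concrete, here is a minimal sketch in Python. It
assumes the standard-library html.unescape function as a stand-in for a
conforming user agent's reference handling; the helper name resolve is
made up for illustration. The charset is only used to turn bytes into
characters, and the numeric character reference is then resolved against
Unicode code points.

```python
import html

def resolve(raw: bytes, encoding: str) -> str:
    """Hypothetical helper: decode the bytes, then resolve references."""
    text = raw.decode(encoding)   # step 1: undo the transport encoding
    return html.unescape(text)    # step 2: references name Unicode code points

# The same reference yields GREEK SMALL LETTER ALPHA (U+03B1, decimal 945)
# whatever encoding the document was stored in.
ENCODINGS = ("iso8859_1", "iso8859_7", "koi8_r", "utf_8")
results = {enc: resolve(b"&#945;", enc) for enc in ENCODINGS}

# In ISO-8859-7 the raw byte 0xE1 is alpha, but the reference &#225; is
# still U+00E1 (a with acute), because it is not read through the charset.
raw_byte = resolve(b"\xe1", "iso8859_7")
reference = resolve(b"&#225;", "iso8859_7")
```

The last two lines show the common mistake: writing the byte value from the
encoding's code table into a numeric reference gives the wrong character.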

> So with user agents that pay any attention to the standards, you can
> use Unicode numeric character references in a document represented in
> ANY character encoding. (Getting usable fonts is not solved by this
> fact, however.)

Yes, and correct, though it does mean that fonts can use whatever encoding
they like so long as there is a way to map it to Unicode. For example, if
you regularly get Russian pages in three different encodings, you don't need
three fonts with the same glyphs arranged differently, one for each
encoding.
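
A rough sketch of the Russian example, in Python. The three encodings
chosen here (KOI8-R, windows-1251 and ISO-8859-5) are my assumption of
typical ones, not named above. The bytes differ in each encoding, but once
decoded they map to the same Unicode code points, so a single font mapped
to Unicode can serve all three.

```python
# The word "Privet" in Cyrillic, written with escapes so that this
# snippet is itself pure ASCII.
WORD = "\u041f\u0440\u0438\u0432\u0435\u0442"

# Assumed set of encodings a Russian page might arrive in.
ENCODINGS = ("koi8_r", "cp1251", "iso8859_5")

# Bytes as they would be stored or transmitted: different per encoding.
encoded = {enc: WORD.encode(enc) for enc in ENCODINGS}

# After decoding, every page holds the same sequence of code points.
code_points = {
    enc: [ord(ch) for ch in raw.decode(enc)]
    for enc, raw in encoded.items()
}

for enc in ENCODINGS:
    print(enc, encoded[enc].hex(), [hex(cp) for cp in code_points[enc]])
```

A renderer that looks up glyphs by code point never sees the original
byte values, which is why the font does not need to follow any one of the
page encodings.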

> I forget what version of HTML/XML this started with,

HTML 2.0.

--
Chris

Received on Tuesday, 9 May 2000 11:50:01 UTC