Re: greek char in UTF-8 (part 2)

"Albert Lunde by way of Martin J. Duerst " wrote:

> Numeric character references are NOT (correctly) interpreted according
> to the character encoding (a.k.a. charset) used to store or transmit
> the document, but rather according to the SGML "Document Character
> Set" which is always Unicode, or I think more precisely, ISO-10646.

Unicode and ISO 10646 have the same character repertoire and the same code
points.
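
A small sketch of the point being quoted, in Python using only the standard
library html module. The sample markup, the helper name resolve, and the list
of encodings are my own assumptions for illustration. The reference &#945; is
plain ASCII, so it resolves to the same code point whichever encoding carried
the bytes:

```python
import html

# Hypothetical sample markup: the reference &#945; is pure ASCII,
# so it passes through any ASCII-compatible encoding unchanged.
markup = "<p>alpha: &#945;</p>"

def resolve(raw: bytes, charset: str) -> str:
    """Decode the bytes with the transfer charset, then expand references."""
    text = raw.decode(charset)
    # html.unescape reads numeric references as Unicode code points,
    # independent of the charset used in the decode step above.
    return html.unescape(text)

# Assumed set of encodings, chosen only as examples.
results = {
    cs: resolve(markup.encode(cs), cs)
    for cs in ("ascii", "iso8859_1", "koi8_r", "utf_8")
}
```

Every entry in results holds the same string, with U+03B1 (Greek small letter
alpha) in place of the reference.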

> So with user agents that pay any attention to the standards, you can
> use Unicode numeric character references in a document represented in
> ANY character encoding. (Getting usable fonts is not solved by this
> fact, however.)

Yes, and correct. It also means that fonts can use whatever encoding they
like, so long as there is a way to map it to Unicode. For example, if you
regularly get Russian pages in three different encodings, you don't need
three fonts with the same glyphs arranged differently, one for each
encoding.
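
To make the Russian example concrete, here is a rough sketch in Python. The
sample word and the choice of KOI8-R, windows-1251 and ISO 8859-5 are my
assumptions; the point is only that different byte values map back to the
same Unicode code points:

```python
# Hypothetical sample word; any Cyrillic text would do.
word = "Привет"

# Three legacy Cyrillic encodings put the same letters at different byte values.
charsets = ("koi8_r", "cp1251", "iso8859_5")
encoded = {cs: word.encode(cs) for cs in charsets}

# Decoding maps each byte sequence back to the same Unicode code points,
# so one font indexed by Unicode can serve pages in all three encodings.
decoded = {cs: encoded[cs].decode(cs) for cs in charsets}
code_points = [f"U+{ord(ch):04X}" for ch in word]
```

The three byte strings in encoded differ, while every value in decoded equals
the original word.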

> I forget what version of HTML/XML this started with,

HTML 2.0

--
Chris

Received on Tuesday, 9 May 2000 11:50:01 UTC