Re: Character Entity Reference for Single Quote.

> Aha, yes. This is a case of the SGML declarations not quite matching the
> prose. To be fair, RFC 1866's references to ISO 10646 are largely in the
> context of extensions to and future versions of HTML, but it does
> nevertheless clearly state that all character references should correspond
> to ISO 10646. The (IMHO) rather vague prose of REC-html32 doesn't seem to
> address the subject at all. Is this minor confusion a result of lack of
> cooperation between SGML and Unicode back in 1995, or just a convenient
> semi-arbitrary decision?

I wasn't involved in the W3C process at all, but I'd guess that
the state of the HTML 3.2 spec has something to do with an
intent to codify current practice.

The HTML 2.0 spec was delayed for some time by internationalization
issues; the language that's there reflects, IMHO, a compromise between:

(1) getting it out the door

(2) the strong position of ISO 8859-1 in prior specs and practice

(3) the desire to set a clear future direction towards ISO 10646
as the SGML "document character set", regardless of the character
encoding in use.
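
To illustrate point (3), here is a minimal sketch in Python (my choice
of language, not anything from the specs). It assumes numeric character
references are resolved against ISO 10646 / Unicode code points, and
uses a hypothetical helper, decode_document, to show that the same
reference means the same character whether the bytes arrive as
ISO 8859-1 or UTF-8.

```python
import html

# Hypothetical sample markup: one literal e-acute and one numeric
# character reference to the same character (code point 233).
markup = "caf\u00e9 = caf&#233;"


def decode_document(raw: bytes, encoding: str) -> str:
    """Decode bytes with the given character encoding, then resolve
    character references against Unicode code points."""
    return html.unescape(raw.decode(encoding))


# The byte streams differ because the encodings differ...
latin1_bytes = markup.encode("iso-8859-1")
utf8_bytes = markup.encode("utf-8")

# ...but the reference &#233; resolves to the same character in both,
# because it names a code point, not a byte value in the encoding.
from_latin1 = decode_document(latin1_bytes, "iso-8859-1")
from_utf8 = decode_document(utf8_bytes, "utf-8")

# The single quote from the subject line can be written the same way.
single_quote = html.unescape("&#39;")
```

The helper decodes first and resolves references second, which is the
separation between "character encoding" and "document character set"
that the HTML 2.0 language was pointing towards.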

--
    Albert Lunde          Albert-Lunde@northwestern.edu (new address)
                          Albert-Lunde@nwu.edu (old address)

Received on Thursday, 25 January 2001 20:43:21 UTC