Re: HTML - i18n / NCR & charsets

Benjamin Franz
Tue, 26 Nov 1996 10:46:22 -0800 (PST)

Date: Tue, 26 Nov 1996 10:46:22 -0800 (PST)
From: Benjamin Franz <>
To: "Dirk.vanGulik" <>
Subject: Re: HTML - i18n / NCR & charsets
In-Reply-To: <9611261813.AA08437@>
Message-ID: <>

On Tue, 26 Nov 1996, Dirk.vanGulik wrote:
> Some possible solutions are proposed:
> 1. An extended Content-type header is used.
> 	Content-type: text/html.i18n
> 	Content-type: text/html-i18n
> 2. An additional attribute to the charset is used
> 	Content-type: text/html; charset=iso-8859-1; ncr=iso-104..
> 3. An additional (level) attribute to the text/html is used.
> 	Content-type: text/html; level=2; charset=iso8859-1
> 	Content-type: text/html; version=2.0/i; charset=iso8859-1
> 4. An additional DTD specifier in the HTML is insisted upon.
> 5. An additional header is added to signal that the site 
>    is internationalised.
> 	Content-Quality: i18n/v1.02
> Please note that the effects accomplished by the above techniques
> are similar; they serve to inform the receiving end about the way any
> in-line numerical character references are to be treated.
> Option number 1 is by far the easiest to implement, and some of
> the deployed server and browser code is able to treat this as
> an 'html' resource with an 'i18n' flavouring.

No. Option 1 is by far the *most difficult* to actually deploy because
*most* existing browsers will attempt to download the now unknown file
type.  This is why Roy's 'Proposed Transition Strategy for the Deployment
of Tables' never worked in practice. The other options at least don't
break (very many) existing browsers.
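
To illustrate the point, here is a minimal sketch of how a client might dispatch on the Content-type header. It assumes a browser that keys its behaviour on the media type alone and silently ignores parameters it does not recognise; `KNOWN_TYPES`, `parse_content_type` and `handle` are hypothetical names made up for this example, not taken from any real browser.

```python
# Hypothetical set of media types this imaginary browser can render inline.
KNOWN_TYPES = {"text/html", "text/plain"}


def parse_content_type(value):
    """Split a Content-type value into (media_type, params)."""
    parts = [p.strip() for p in value.split(";")]
    media_type = parts[0].lower()
    params = {}
    for part in parts[1:]:
        if "=" in part:
            name, _, val = part.partition("=")
            params[name.strip().lower()] = val.strip()
    return media_type, params


def handle(value):
    """Return what the imaginary browser does with a response."""
    media_type, _params = parse_content_type(value)
    # Assumption: unknown parameters are ignored, so only the
    # media type itself decides between rendering and saving.
    if media_type not in KNOWN_TYPES:
        return "download"
    return "render"


for header in (
    "text/html-i18n",                         # option 1
    "text/html; level=2; charset=iso8859-1",  # option 3
):
    print(header, "->", handle(header))
```

Under these assumptions, a renamed media type (option 1) falls through to the download path, while an extra parameter (options 2 and 3) leaves the document renderable.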

> If HTML-i18n is to go ahead without any signalling about the change in
> the NCRs' target charset (i.e. Unicode rather than the announced
> charset), then IMHO this should at least be mentioned in the draft,
> as it breaks existing, widespread practice, which prior to this
> i18n draft could not be flagged as 'wrong' or 'illegal'.

Hmmm... Is there actually a difference between the first 256 codes of
Unicode and ISO 8859-1? I thought they were identical over that range.
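
One quick way to check this is to decode every possible byte value as ISO 8859-1 and compare the result with the Unicode code point of the same number. The sketch below relies on Python's built-in `iso-8859-1` codec; the variable names are only illustrative, and it says nothing about documents announced in other charsets.

```python
# Decode all 256 byte values as ISO 8859-1 and compare each resulting
# character's Unicode code point with the original byte value.
latin1 = bytes(range(256)).decode("iso-8859-1")
mismatches = [i for i, ch in enumerate(latin1) if ord(ch) != i]
print("mismatches:", mismatches)

# A numeric character reference such as &#233; therefore names the same
# character whether 233 is read as an ISO 8859-1 code or a Unicode one.
ncr_value = 233
same_char = chr(ncr_value) == bytes([ncr_value]).decode("iso-8859-1")
print("NCR 233 agrees:", same_char)
```

An empty mismatch list means the two agree over the whole 0-255 range.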

Benjamin Franz