Date: Tue, 26 Nov 1996 19:52:38 +0100 (MET)
From: Dirk.vanGulik@jrc.it
To: Benjamin Franz <email@example.com>
Cc: "Dirk.vanGulik" <Dirk.vanGulik@jrc.it>, firstname.lastname@example.org
Subject: Re: HTML - i18n / NCR & charsets
In-Reply-To: <Pine.LNX.3.95.961126103157.7652Bemail@example.com>
Message-Id: <Pine.SOL.3.91.961126194926.8458Afirstname.lastname@example.org>

On Tue, 26 Nov 1996, Benjamin Franz wrote:

> On Tue, 26 Nov 1996, Dirk.vanGulik wrote:
> >
> > Some possible solutions are proposed:
> >
> > 1. An extended Content-type header is used.
> >      Content-type: text/html.i18n
> >      Content-type: text/html-i18n
> >
> > 2. An additional attribute to the charset is used
> >      Content-type: text/html; charset=iso-8859-1; ncr=iso-104..
> >
> > 3. An additional (level) attribute to the text/html is used.
> >      Content-type: text/html; level=2; charset=iso8859-1
> >      Content-type: text/html; version=2.0/i; charset=iso8859-1
> >
> > 4. An additional DTD specifier in the HTML is insisted upon.
> >      <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 2.0i//EN">
> >
> > 5. An additional header is added to signal that the site
> >    is internationalised.
> >      Content-Quality: i18n/v1.02
> >
> > Please note that the effect accomplished by each of the above techniques
> > is similar; they serve to inform the receiving end about the way any
> > in-line numerical character references are to be treated.
> >
> > Option number 1 is by far the easiest to implement; and some of
> > the deployed server and browser code is able to treat this as
> > an 'html' resource with an 'i18n' flavouring.
>
> No. Option 1 is by far the *most difficult* to actually deploy because
> *most* existing browsers will attempt to download the now unknown file
> type. This is why Roy's 'Proposed Transition Strategy for the Deployment
> of Tables' never worked in practice. The other options at least don't
> break (very many) existing browsers.

You might be right here; I tried the five big ones in their last two
versions, beta and shipping. They seemed to cope. But I agree that a
'level' type of addition or a header one is *much* safer in that respect,
and I honestly do not know what is out there in the browser world.

> > If HTML-i18n is to go ahead, without any signaling about the NCRs
> > target charset change (i.e. in Unicode rather than the announced
> > charset); then IMHO this should at least be mentioned in the draft
> > as it breaks existing, widespread, practice, which prior to this
> > i18n draft could not be signalled as 'wrong' or 'illegal'.
>
> Hmmm...Is there actually a difference in the first 256 codes of Unicode
> and ISO8859-1? I thought they were identical over that range?
>

There are just a few differences; mainly in the empty block which has
the funny chars such as the bullet (143) and non-breaking-space (160),
to name the popular offenders.

DW.
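
The question of where the first 256 codes differ can be checked mechanically. The sketch below is only an illustration, not part of the original exchange: it assumes Python's built-in codecs, and it assumes that the "funny chars" come from a platform charset such as Windows-1252 (`cp1252`), which the message does not state. The helper name `differing_bytes` is hypothetical.

```python
def differing_bytes(encoding):
    """Return the byte values (0-255) whose decoding under `encoding`
    is not the Unicode character with the same number."""
    diffs = []
    for b in range(256):
        try:
            ch = bytes([b]).decode(encoding)
        except UnicodeDecodeError:
            ch = None  # byte is unassigned in this encoding
        if ch != chr(b):
            diffs.append(b)
    return diffs

# Assumption: cp1252 stands in for the platform charset actually in use.
latin1_diffs = differing_bytes("iso-8859-1")
cp1252_diffs = differing_bytes("cp1252")

print("iso-8859-1 differs from Unicode at:", latin1_diffs)
print("cp1252 differs from Unicode at:", cp1252_diffs)
```

Under these assumptions, any mismatch a browser sees for a numeric character reference such as `&#149;` depends on whether it resolves the number against the platform charset or against Unicode, which is the signalling problem discussed in this thread.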