
Re: Accept-Charset support

From: Jonathan Rosenne <rosenne@NetVision.net.il>
Date: Sat, 07 Dec 1996 20:10:49 +0200
Message-Id: <1.5.4.32.19961207181049.006647d4@mail.netvision.net.il>
To: Klaus Weide <kweide@tezcat.com>
Cc: Larry Masinter <masinter@parc.xerox.com>, Chris.Lilley@sophia.inria.fr, www-international@w3.org
At 03:24 07/12/96 -0600, Klaus Weide wrote:
>On Thu, 5 Dec 1996, Larry Masinter wrote:
>
>[snipped from a longer message:]
>> I think the simple thing to do is to send:
>> 
>> 	accept-charset: utf-8,iso-8859-5
>> 
>> if you're a browser and can display utf-8 and 8859-5 as well as
>> 8859-1.  
>
>It seems more appropriate to say "...if you can decode utf-8 and display
>8859-5".  The problem is that "utf-8" doesn't carry any useful information
>about available character repertoire (whereas iso-8859-5 does) unless
>we assume that it will be normal for a browser (or other web client)
>to have _all_ of the 10646 characters available (in which case all 
>discussion about Accept-Charset would be rather pointless).

According to the 10646 and Unicode specifications, the user agent is not
obliged to be able to display all the characters. 

>If there is a need for a client to express "I can understand UTF-8,
>but can only display some of the 10646 characters: ..." - and I 
>definitely think there is such a need - I do not see a way to implement
>this cleanly.  This is a limitation of the MIME charset model which
>mixes character encoding and repertoire aspects ("charset considered
>harmful" etc...).  Or rather it is a limitation following from the fact
>that no more than a handful of "10646 sub-repertoire charsets" have
>been registered, for which the IANA registry file has reserved a range:
>
> "The second region (1000-1999) is for the Unicode and
>ISO/IEC 10646 coded character sets together with a specification of a
>(set of) sub-repertoires that may occur."

10646 does define several subsets. They appear not to have been registered
by IANA. They are language related, rather than vendor related.

The best solution to the problem raised is via "accept-language". It can be
reasonably assumed that if my preferred languages include French I can
display the French characters. 

If the server has the document only in Japanese, it will be sent in Japanese
and my screen will be as illegible as it is today: no better, but no worse.
If the server does have alternative languages, the situation will improve.
Taken together, the two Accept- headers (Accept-Charset and Accept-Language)
are a great improvement: parties that support them get a much better result
than is available today, and those that do not are no worse off.
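
To make the fallback behaviour concrete, here is a minimal sketch of how a
server might pick a language variant from an Accept-Language value. The
function names (parse_accept, choose_variant) and the sample language sets are
hypothetical, chosen for illustration only. It matches language tags exactly
and ignores wildcards and prefix matching, which a real implementation would
need to handle.

```python
def parse_accept(header):
    """Split an Accept-* header value into (token, q) pairs, highest q first."""
    items = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        token, _, params = part.partition(";")
        q = 1.0  # a token without an explicit q parameter defaults to 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q value as "not acceptable"
        items.append((token.strip().lower(), q))
    # sorted() is stable, so tokens with equal q keep their header order
    return sorted(items, key=lambda item: -item[1])


def choose_variant(accept_language, available, default):
    """Return the most preferred available language, else the server default."""
    for lang, q in parse_accept(accept_language):
        if q > 0 and lang in available:
            return lang
    # No overlap: send what the server has, as happens today without the header
    return default


# Hypothetical server that holds only a Japanese version
print(choose_variant("fr, en;q=0.5", {"ja"}, "ja"))
# Hypothetical server that holds Japanese and English versions
print(choose_variant("fr, en;q=0.5", {"ja", "en"}, "ja"))
```

In the first call the client's preferences cannot be met, so the result is the
same as without the header. In the second call the client receives a version
it listed as acceptable. The same parsing would apply to an Accept-Charset
value such as "utf-8,iso-8859-5".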

--

Jonathan Rosenne
JR Consulting
P O Box 33641, Tel Aviv, Israel
Phone: +972 50 246 522 Fax: +972 9 956 7353
http://ourworld.compuserve.com/homepages/Jonathan_Rosenne/
Received on Saturday, 7 December 1996 13:10:48 GMT
