W3C home > Mailing lists > Public > www-international@w3.org > October to December 1996

Re: Accept-Charset support

From: Martin J. Duerst <mduerst@ifi.unizh.ch>
Date: Tue, 10 Dec 1996 16:13:21 +0100 (MET)
To: Klaus Weide <kweide@tezcat.com>
cc: Jonathan Rosenne <rosenne@NetVision.net.il>, www-international@w3.org
Message-ID: <Pine.SUN.3.95.961210155539.245G-100000@enoshima>
On Sun, 8 Dec 1996, Klaus Weide wrote:

> On Sun, 8 Dec 1996, Jonathan Rosenne wrote:
> 
> > Yes there is - it is not a common or regular way to see Russian, and the
> > standards need to cater first and foremost for the common and regular and
> > only secondly to special needs such as those. And I am not sure that this
> > special need is appropriate for standardization at all.
> 
> I don't think the standards ought to cater only to the current needs of
> the majority.  There can be a concept or a vision behind them (the whole
> Web thing started out that way), which I would hope to be more stable than
> the questions "What's easiest to implement this year" or "What do most
> people want right now".

The main vision was to have each language writable with the characters
in which it is usually written. This covers at least 99.99% of usage.
And we want the browsers to cover as many of the world's characters
as possible.

Introducing too much transliteration support could easily have created
the impression that it's okay for a browser to render only ASCII,
because the servers will transliterate anyway, and who would want
to see Japanese written with Japanese characters if Latin letters
are so much easier to read (a very common opinion among US programmers
and others, at least some years ago).

Reading a text in its original script is 99.99% of usage, and the
rest is split up among a large number of transliteration schemes
for a large number of scripts. There is little point in burdening the
protocol, the server, and in some cases the document author, with
something as rare and diverse as transliteration; it is better to
realize it as a separate service.

Also, please consider that in those cases where transliteration
is actually straightforward, the effort required by a human
reader to get used to a different script is likewise rather low.
For cases where learning to write takes much time, e.g. Japanese,
transliteration is about as difficult as machine translation.


> If there are insufficient means in the architecture to do character set
> labelling and negotiation then that should be fixed right *there* (by
> registering needed charsets or revising the MIME charset syntax or
> whatever).   

There are sufficient means in the architecture to do character
set labelling/negotiation. Registering "charset" parameter values
is not a question of architecture. If you need a "charset" parameter
for a particular use, you can register it yourself. But in my
opinion, there are already more than enough "charset"s around.
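As a minimal sketch of what such negotiation amounts to in practice
(not part of the original message; the header syntax follows HTTP/1.x
Accept-Charset, with q-values), a server picks the best deliverable
charset like this:

```python
# Hypothetical sketch of server-side charset negotiation: parse an
# Accept-Charset header such as "iso-8859-1, utf-8;q=0.8" and choose
# the highest-rated charset among those the server can deliver.

def parse_accept_charset(header):
    """Return a {charset: qvalue} mapping from an Accept-Charset header."""
    prefs = {}
    for item in header.split(","):
        parts = item.strip().split(";")
        name = parts[0].strip().lower()
        q = 1.0  # default quality when no q-parameter is given
        for p in parts[1:]:
            key, _, value = p.strip().partition("=")
            if key.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs[name] = q
    return prefs

def choose_charset(header, available):
    """Pick the available charset with the highest client preference."""
    prefs = parse_accept_charset(header)
    best, best_q = None, 0.0
    for cs in available:
        # "*" acts as a wildcard preference if the charset is not listed
        q = prefs.get(cs.lower(), prefs.get("*", 0.0))
        if q > best_q:
            best, best_q = cs, q
    return best
```

So a client that sends "iso-8859-1, utf-8;q=0.8" against a server
offering utf-8 and koi8-r gets utf-8; nothing here needs new
architecture, only registered "charset" values on both sides.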

> HTTP 1.1 says "A language tag identifies a natural language spoken,
> written, or otherwise conveyed by human beings for communication of
> information to other human beings."  Nearly identical words  are in
> draft-ietf-html-i18n-05.txt.  A language tag[*] is not bound to a
> specific language in either of those drafts.  I argue to keep it that
> way.

Of course not. These documents reference other documents.
But the binding exists: "de" is German, "fr" is French, and so on.


> It could be done by the server, or on the client side, or by some
> intermediate agent (like a translation proxy/gateway).  The standards
> should not prevent the emergence of new services that would otherwise
> fit into the framework.

HTTP and URLs are extremely flexible; they don't prevent anything.

You can easily set up a translation or transliteration server
and pass the original URL and additional attributes in the query
part of the URL.
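A minimal sketch of that idea (not from the original message; the host
name and parameter names are invented for illustration): the separate
service gets its own URL, and the original document URL plus the
desired target scheme travel, properly escaped, in the query part.

```python
# Hypothetical: build a request URL for a standalone transliteration
# service, passing the original URL and attributes in the query part.
from urllib.parse import urlencode

def translit_url(original_url, scheme):
    """Return a service URL carrying the original URL in its query part."""
    # urlencode percent-escapes the embedded URL so it survives as a
    # single query-parameter value.
    query = urlencode({"url": original_url, "to": scheme})
    return "http://translit.example.org/cgi-bin/translit?" + query
```

The server and the protocol stay untouched; only the separate service
needs to understand the query parameters.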

Regards,	Martin.
Received on Tuesday, 10 December 1996 10:14:39 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 2 June 2009 19:16:46 GMT