
oops: Natural language marking in HTML

From: Larry Masinter <masinter@parc.xerox.com>
Date: Sat, 8 Mar 1997 09:31:35 PST
Message-Id: <3.0.1.32.19970308093135.006a6034@casablanca.parc.xerox.com>
To: www-international@w3.org
Sorry -- getting used to new mailer program.

>I think explicit rules are needed on what counts as the majority
>language.  Example rules might include
>    [1] you can't understand enough of this to make sense unless
>        you're fluent in Japanese and Old Frisian
>    [2] 51% or more of the text characters in this document correspond
>        to Hindi, so that's the majority language
>    [3] 51% or more of the glyphs ....
>    [4] 51% or more of the pixels set at 100 dpi... :-)
>

What I meant to say (besides quoting Lee's example) is that
such examples convince me that trying to be objective about
Content-Language isn't really useful or realistic. There
are so many odd special cases that you can't write enough
explicit rules to cover all of the cases.
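
To make the problem concrete, here is a rough sketch of something
like rule [2] above. Everything in it is an assumption made for
illustration: the function name majority_script, the 0.51 threshold,
and above all the trick of taking the first word of a character's
Unicode name as its "script". Note that it can only report a script,
never a language, which is already one of the odd special cases:
Devanagari text may be Hindi, Marathi or Nepali, and Latin text may
be French or Old Frisian.

```python
import unicodedata
from collections import Counter


def majority_script(text, threshold=0.51):
    """Return the script covering at least `threshold` of the letters, or None.

    The "script" is approximated by the first word of each letter's
    Unicode name (e.g. "LATIN", "DEVANAGARI"). This is a crude heuristic
    chosen for illustration, not a real script or language detector.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, spaces, punctuation and combining marks
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split(" ")[0]] += 1
    total = sum(counts.values())
    if total == 0:
        return None
    script, n = counts.most_common(1)[0]
    return script if n / total >= threshold else None
```

An evenly mixed document gets no answer at all from this rule, and
a document that does get one still has not been assigned a language.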

(Note that the HTML markup is an indication of how
to control local presentation, and doesn't have this
difficulty. The HTTP "Accept-Language" headers are a way
for users to say what their preferences are; they have a
different set of difficulties, but those are at least
better understood.)

The best that you might be able to do is

"Content-Language: FR" means "Whoever applied this tag
thinks the content is more intelligible to those
who know French than those who do not".
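
Under that reading, about all a recipient can do with the label is
compare it with the user's stated preferences. A minimal sketch of
that comparison follows, assuming the comma-separated "tag;q=value"
form of Accept-Language and simple prefix matching; the function
names parse_accept_language and label_quality are made up for this
example and are not part of any library.

```python
def parse_accept_language(header):
    """Parse an Accept-Language value into (tag, q) pairs, best first."""
    prefs = []
    for item in header.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        q = 1.0  # a tag without an explicit q-value counts as fully preferred
        for param in parts[1:]:
            key, _, value = param.strip().partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not wanted"
        prefs.append((tag, q))
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def label_quality(content_language, accept_language):
    """Return the q-value the user gave the labelled language, else 0.0.

    The label is treated as the tagger's opinion, so this only answers
    "does the user claim to want this language?", nothing more.
    """
    label = content_language.strip().lower()
    prefs = parse_accept_language(accept_language)
    for tag, q in prefs:
        # exact match, or the preference is a prefix ("fr" covers "fr-ca")
        if label == tag or label.startswith(tag + "-"):
            return q
    for tag, q in prefs:
        if tag == "*":  # wildcard applies only when nothing specific matched
            return q
    return 0.0
```

For example, label_quality("FR", "da, en-gb;q=0.8, fr;q=0.5") gives
0.5: the user ranks French below Danish and British English, and
whether the content really is mostly French remains the tagger's
judgment.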


--
http://www.parc.xerox.com
Received on Saturday, 8 March 1997 12:32:04 GMT
