RE: Problem with LANG keyword

> From: Reuven Nisser <rnisser@ofek-liyladenu.org.il>
>
> Not to define the language at all is a possibility but I feel that it's like
> throwing the water and the baby. I agree that defining Content-Language META
> with the list of languages is enough for the W3C standard. But still W3C
> standard needs to define exactly which language to be used for each character
> so we still need to define the rules for that.

To expect user agents to make such distinctions seems to me to be asking
them to make too many assumptions, given that there is no one-to-one
correspondence between script and language.  For instance, there is the
example given earlier in this thread about numbers in a mixed English-Hebrew
document.  There is no simple common-sense rule that a user agent could be
given.
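
To illustrate the point about digits, here is a small Python sketch.  It
assumes that the first word of a character's Unicode name is a good enough
stand-in for its script; the helper name script_hint is made up for this
example.  Letters yield a script, but a digit does not, so nothing in the
character itself says whether it belongs to the Hebrew or the English text.

```python
import unicodedata

def script_hint(ch):
    """Return the first word of the character's Unicode name.

    This is only a rough stand-in for the script of the character
    (an assumption made for this illustration).
    """
    return unicodedata.name(ch).split()[0]

# A Latin letter, a Hebrew letter, and an ASCII digit.
for ch in ("a", "\u05d0", "5"):
    print(repr(ch), script_hint(ch))
```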


As a result, I don't think that such an ability should be added to HTML. 
At best, perhaps doing something like the following to indicate ranges
for languages could be considered for XHTML2.

<html>
  <head>
    <charlang value="he:U+0030-U+0039,U+0590-U+05FF;en:U+0021-U+007E" />

But I don't like it, and the more flexible alternative of adding a charlang
attribute to everything I like even less.  Trying to shoehorn this into the
existing lang and xml:lang attributes has the problem that existing
implementations won't understand the new values, so that is clearly
not acceptable at all.

(Note:  The intention is to indicate character ranges using the same syntax
as that used in CSS2 for character ranges in fonts, and to bind a character
to the first language listed that covers it.  That eliminates the need to
break up the range for the English characters into two discrete ranges, and
it provides a defined behavior when such a double definition occurs.)
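
As a rough sketch of how a user agent might apply the first-match rule from
the note, here is some Python.  The charlang element is only a suggestion,
and the names parse_charlang and language_of are made up for this example.
The parser accepts the end of a range with or without a repeated "U+" prefix
and does not handle wildcard ranges.

```python
def _codepoint(token):
    """Convert 'U+0030' (or a bare '0039') to an integer code point."""
    token = token.strip()
    if token[:2].upper() == "U+":
        token = token[2:]
    return int(token, 16)

def parse_charlang(value):
    """Parse 'lang:range,range;lang:range' into an ordered rule list.

    Each rule is a (lang, low, high) tuple; the order of the rules
    follows the order in which they appear in the value.
    """
    rules = []
    for entry in value.split(";"):
        lang, _, ranges = entry.partition(":")
        for rng in ranges.split(","):
            low, _, high = rng.partition("-")
            low_cp = _codepoint(low)
            high_cp = _codepoint(high) if high else low_cp
            rules.append((lang.strip(), low_cp, high_cp))
    return rules

def language_of(ch, rules):
    """Return the first language whose range covers ch, or None."""
    cp = ord(ch)
    for lang, low, high in rules:
        if low <= cp <= high:
            return lang
    return None

rules = parse_charlang("he:U+0030-U+0039,U+0590-U+05FF;en:U+0021-U+007E")
print(language_of("5", rules))       # digit: claimed by "he", listed first
print(language_of("a", rules))       # Latin letter: falls through to "en"
print(language_of("\u05d0", rules))  # Hebrew letter alef: "he"
```

Because the digits are listed under "he" before the English range is
consulted, the overlapping English range does not have to be split around
U+0030-U+0039.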

Received on Wednesday, 24 September 2003 14:13:38 UTC