RE: ISSUE-88 / Re: what's the language of a document ?

I have written an alternative to the change proposal from the I18N 
WG.[1] It takes into account the issues raised in Bugs 9263 and 
9264. I hope that both Ian and the I18N WG will also consider the 
issues that I try to solve with this proposal, so that we can come to a 
consensus. Input is very welcome.

Quoting the summary of the proposal: 

 1. The HTML4/XHTML1 language inheritance problem – solve it: HTML5 
aligns the meaning of an empty lang="" with XML. Therefore it is 
necessary to solve the language inheritance problems of HTML4/XHTML1.0. 
(An empty lang="" is a syntax error in HTML4/XHTML1.1. Several browsers 
therefore go looking e.g. in the meta Content-Language element for a 
fallback language code.)
 2. The HTTP issue – unconfuse it: Do not disguise these language 
inheritance problems or create new problems (such as more confusion 
w.r.t. HTTP) by aligning the pragma content-language with lang="".
 3. The default language issue when multiple languages are set – define 
anew or drop it: We should either drop the idea of having rules for 
how to inherit language from the meta content-language element when it 
contains more than one language, or define a new way to do so. 
Proposed solution for the latter: Specify that one may provide two 
meta content-language elements, where the first will (eventually) be 
used by HTTP, and the latter will be used by the parser. (All browsers 
that look at the meta content-language element look at the last meta 
content-language element only.) This solution is also what is needed 
to solve the language inheritance problem.
 4. The first or the last meta content-language element? Give up the 
idea, currently in the spec, that user agents should look at the first 
meta content-language element; at present they ALL look at the last 
one. (This fourth point is not a crucial part of this proposal, but it 
seems more aligned with reality.)
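
To make points 3 and 4 concrete, here is a rough Python sketch of the 
fallback lookup. It is only an illustration under my own assumptions, 
not text from the spec or code from any browser: the names 
ContentLanguageCollector, fallback_language and PAGE are made up for 
the example, and the lookup is deliberately naive (it does not check 
that the meta element sits in the head, and it does not split a 
comma-separated value).

```python
from html.parser import HTMLParser


class ContentLanguageCollector(HTMLParser):
    """Collects the content of every <meta http-equiv="Content-Language">."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values may be None.
        if (attr.get("http-equiv") or "").lower() == "content-language":
            self.values.append((attr.get("content") or "").strip())


def fallback_language(html, use_last=True):
    """Return the pragma value a parser would fall back to, or None.

    use_last=True  models what point 4 says browsers do today.
    use_last=False models the "first element" rule point 4 argues against.
    """
    collector = ContentLanguageCollector()
    collector.feed(html)
    collector.close()
    if not collector.values:
        return None
    return collector.values[-1 if use_last else 0]


# Hypothetical page following point 3: the first element carries the
# multi-language value meant for HTTP, the last one the single language
# meant for the parser.
PAGE = """<html><head>
<meta http-equiv="Content-Language" content="en, fr">
<meta http-equiv="Content-Language" content="fr">
</head><body><p>text</p></body></html>"""
```

With this page, fallback_language(PAGE) gives "fr", whereas 
fallback_language(PAGE, use_last=False) gives the ambiguous "en, fr", 
which is the case point 3 wants to avoid having to interpret.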

[1] http://www.w3.org/html/wg/wiki/ChangeProposals/lang_versus_contentLanguage


Leif Halvard Silli, Thu, 18 Mar 2010 12:45:42 +0100:
> Two bugs have been filed that relate to this issue:
> 
> Bug 9263: Incorrect language determination algorithm
>           http://www.w3.org/Bugs/Public/show_bug.cgi?id=9263
> 
>           ("Incorrect" is perhaps too strong - but at least it
>           is imprecise.)
> 
> Bug 9264: Provide a way to prevent Content-Language from acting
>           as language fallback
>           http://www.w3.org/Bugs/Public/show_bug.cgi?id=9264
> 
> Related: replies to Addison Phillips [1][2] and to C.E. Whitehead [3].
> 
> [1] http://lists.w3.org/Archives/Public/public-html/2010Mar/0324
> [2] http://lists.w3.org/Archives/Public/public-html/2010Mar/0331
> [3] http://lists.w3.org/Archives/Public/public-html/2010Mar/0325

-- 
leif halvard silli

Received on Thursday, 18 March 2010 23:53:37 UTC