RE: ISSUE-88 / Re: what's the language of a document ?

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Fri, 19 Mar 2010 00:52:58 +0100
To: "ian@hixie.ch" <ian@hixie.ch>
Cc: "Phillips, Addison" <addison@amazon.com>, CE Whitehead <cewcathar@hotmail.com>, "www-international@w3.org" <www-international@w3.org>, "public-html@w3.org" <public-html@w3.org>, "ishida@w3.org" <ishida@w3.org>
Message-ID: <20100319005258419591.c0c24c91@xn--mlform-iua.no>
I have written an alternative to the change proposal from the I18N 
WG.[1] This change proposal takes in the issues raised in Bugs 9263 and 
9264. I hope that both Ian and the I18N WG will also consider the 
issues that I try to solve with this proposal, so that we can come to a 
consensus. Input is very welcome.

Quoting the summary of the proposal: 

	1.	The HTML4/XHTML1 language inheritance problem – solve it: HTML5 
aligns the meaning of an empty lang="" with XML. Therefore it is 
necessary to solve the language inheritance problems of HTML4/XHTML1.0. 
(An empty lang="" is a syntax error in HTML4/XHTML1.1. Several browsers 
therefore go looking e.g. in the meta Content-Language element for a 
fallback language code.)
	2.	The HTTP issue - unconfuse it: Do not disguise these language 
inheritance problems or create new problems (such as more confusion 
w.r.t. HTTP) by aligning the pragma content-language with lang="".
	3.	The default language issue when multiple languages are set – define 
anew or drop it: We should either drop the idea of having rules for 
how to inherit language from the meta content-language element when it 
contains more than one language, or define a new way to do so. 
Proposed solution to the latter: Specify that one may provide two 
meta content-language elements, where the first will (eventually) be 
used by HTTP, and the latter will be used by the parser. (All browsers 
that look at the meta content-language element look at the last meta 
content-language element only.) This solution is also what is needed 
to solve the language inheritance problem. 
	4.	The first or the last meta content-language element? Give up the 
idea which is currently in the spec, that user agents should look at 
the first meta content-language element - currently they ALL look at 
the last element. (This fourth point is not a crucial part of this 
proposal, but it seems more aligned with reality.)
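
The split described in point 3 can be sketched roughly as follows. This 
is only an illustration in Python, not text from the proposal: the names 
MetaContentLanguageCollector and split_roles are hypothetical, and the 
rule that a last element listing several languages gives no fallback is 
an assumption made for the example.

```python
from html.parser import HTMLParser


class MetaContentLanguageCollector(HTMLParser):
    """Collect the content value of every <meta http-equiv="Content-Language">."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() == "content-language":
            self.values.append((attr.get("content") or "").strip())


def split_roles(html):
    """Return (http_value, parser_fallback) for a document.

    Assumed reading of point 3: the first element is left for HTTP-style
    use, and the last element supplies the parser's fallback language.
    Assumption: a last value that lists several languages (or is empty)
    yields no fallback, represented here as None.
    """
    collector = MetaContentLanguageCollector()
    collector.feed(html)
    if not collector.values:
        return None, None
    first, last = collector.values[0], collector.values[-1]
    fallback = last if last and "," not in last else None
    return first, fallback
```

With two elements, say content="en, fr" followed by content="fr", 
split_roles returns the first value for HTTP and "fr" as the fallback. 
With only the multi-language element, the fallback is None.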

[1] http://www.w3.org/html/wg/wiki/ChangeProposals/lang_versus_contentLanguage


Leif Halvard Silli, Thu, 18 Mar 2010 12:45:42 +0100:
> Two bugs have been filed that relate to this issue:
> 
> Bug 9263: Incorrect language determination algorithm
>           http://www.w3.org/Bugs/Public/show_bug.cgi?id=9263
> 
>           ("Incorrect" is perhaps too strong - but at least it
>           is imprecise.)
> 
> Bug 9264: Provide a way to prevent Content-Language from acting
>           as language fallback
>           http://www.w3.org/Bugs/Public/show_bug.cgi?id=9264
> 
> Related: replies to Addison Phillips [1][2] and to C.E. Whitehead [3].
> 
> [1] http://lists.w3.org/Archives/Public/public-html/2010Mar/0324
> [2] http://lists.w3.org/Archives/Public/public-html/2010Mar/0331
> [3] http://lists.w3.org/Archives/Public/public-html/2010Mar/0325

-- 
leif halvard silli
Received on Thursday, 18 March 2010 23:53:37 UTC
