
RE: what's the language of a document ?

From: CE Whitehead <cewcathar@hotmail.com>
Date: Thu, 29 Oct 2009 14:47:33 -0400
Message-ID: <BLU109-W2302AFF271085AA1A3CCF3B3B70@phx.gbl>
To: <ishida@w3.org>, <ian@hixie.ch>
CC: <simonp@opera.com>, <divya.manian@gmail.com>, <martin.kliehm@namics.com>, <cowan@ccil.org>, <public-html@w3.org>, <www-international@w3.org>, <duerst@it.aoyama.ac.jp>

I personally tend to agree with Roy Fielding, John Cowan, and Tex Texin, actually, and not with Martin and Richard Ishida, because I regularly create documents in two languages (French-English; French-Old French). Following Richard Ishida's recommendations in "Specifying Languages in XHTML and HTML Content," I list all the languages in the meta content tag (when I have access to it; because my documents are generally served from a locale I don't control, I don't have access to the http headers). I still set the html language to one or the other when possible and then, if I get time, specify additional information in relevant elements.

 

I think there will always be cases where people will not tag a document correctly; if a tag is needed it makes no sense to eliminate it because someone cannot yet use it properly.  And I think that Tex makes a point too--someone might specify a document language as fr-FR and fr-LU but not fr-CA and it makes no sense to default to unknown.
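
To make Tex's point concrete, here is a minimal sketch of one way a processor could fall back to the shared part of several declared tags instead of "unknown". The function name `common_language_prefix` is hypothetical, and the comparison is a plain subtag-by-subtag match that I am assuming for illustration; it is not taken from any spec text.

```python
from typing import List


def common_language_prefix(tags: List[str]) -> str:
    """Return the leading subtags shared by every tag, or "" if none are shared.

    Hypothetical helper: compares subtags case-insensitively and keeps the
    spelling used in the first tag.
    """
    if not tags:
        return ""
    first = tags[0].strip().split("-")
    lowered = [tag.strip().lower().split("-") for tag in tags]
    shared = []
    # zip stops at the shortest tag, so we never index past any tag's end.
    for index, subtags in enumerate(zip(*lowered)):
        if len(set(subtags)) != 1:
            break  # the tags diverge at this position
        shared.append(first[index])
    return "-".join(shared)


print(common_language_prefix(["fr-FR", "fr-LU"]))  # shared primary language
print(common_language_prefix(["fr-FR", "en-US"]))  # nothing shared
```

With fr-FR and fr-LU the shared part is "fr", which is still useful for text processing; with fr-FR and en-US nothing is shared and unknown is the only honest answer.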

 

However I'll look at the proposal.

 

Best,

 

C. E. Whitehead 
> From: ishida@w3.org
> To: ian@hixie.ch
> CC: simonp@opera.com; divya.manian@gmail.com; martin.kliehm@namics.com; cowan@ccil.org; public-html@w3.org; www-international@w3.org; duerst@it.aoyama.ac.jp
> Date: Thu, 29 Oct 2009 18:11:27 +0000
> Subject: RE: what's the language of a document ?
> 
> Personally, I agree with Martin here. I have spent a long time trying to
> simplify explanations so that people can understand how to manage the
> various different ways of declaring language in HTML (http vs meta vs lang;
> html vs xhtml vs xml), and it really concerns me that I will now have to say
> "But in html5 things are slightly different again". It's already hard
> enough to get people to declare language, and I think that the changes that
> come with the current text in html5 will only make things worse by causing
> further confusion. On the other hand, I think there may be a way to satisfy
> everyone.
> 
> We discussed this during the Internationalization WG telecon last night, and
> I was actioned to put the following to you and the HTML group on behalf of
> the i18n WG.
> 
> 
> Our proposal is as follows and is based on the text of the following
> sections:
> http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html#document-wide-default-language
> http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-lang-and-xml:lang-attributes
> 
> 
> [1] Explain clearly that declarations in the http header and the meta
> element refer to the document as an object, rather than the text in a
> specific element (this is what makes the distinction between single and
> multiple values sensible). 
> 
> [2] Continue to recommend that the document-wide default language be defined
> by a lang attribute on the html tag, but say that if the lang attribute is
> missing and there is a language defined in the http or meta, then those
> language declarations can be used to guess the language of the text, if they
> contain a single value.
> 
> [3] Establish the precedence between http vs meta. 
> 
> [4] Establish the rule that multiple values in the place that has precedence
> equates to lang="".
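
Points [2] to [4] above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the default order of `precedence` (http before meta) is my own placeholder, since point [3] leaves the actual precedence to be established.

```python
from typing import List, Optional, Sequence


def split_language_list(value: Optional[str]) -> List[str]:
    """Split a comma-separated language declaration into individual tags."""
    if not value:
        return []
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def guess_text_language(
    html_lang: Optional[str],
    http_content_language: Optional[str],
    meta_content_language: Optional[str],
    precedence: Sequence[str] = ("http", "meta"),  # assumed order, see [3]
) -> str:
    """Return the language to assume for the text; "" stands for unknown."""
    # [2] A lang attribute on the html element is used whenever it is present.
    if html_lang is not None:
        return html_lang.strip()

    declared = {
        "http": split_language_list(http_content_language),
        "meta": split_language_list(meta_content_language),
    }
    for source in precedence:
        tags = declared[source]
        if not tags:
            continue  # nothing declared here, try the next source
        # [4] Multiple values in the place that has precedence equate to lang="".
        return tags[0] if len(tags) == 1 else ""
    return ""


print(guess_text_language(None, None, "fr"))      # single meta value is used
print(guess_text_language(None, None, "fr, en"))  # multiple values: unknown
print(guess_text_language("fr", "en", None))      # lang attribute wins
```

Under this sketch, a bilingual page that lists both languages in the meta element and has no lang attribute ends up with an unknown text language, while the same page with lang set on the html element keeps that value.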
> 
> This is very close to what we already have, but doesn't try to make the meta
> declaration a different thing than the http declaration, or change it so
> that multiple values are no longer valid. At the same time, it allows
> either the http or the meta to provide language information for
> text-processing, if the declaration is useable.
> 
> We also feel that the spec seems to restrict the use of the term
> 'document-wide default language' to refer only to a language declared using
> the meta, and this is rather odd. We feel that in fact the lang attribute
> on the html element also establishes a document-wide default language. (See
> the text: "Until the pragma is successfully processed, there is no
> document-wide default language.")
> 
> RI
> 
> PS: I could suggest some changes to the wording, if that helps.
> 
> 
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
> 
> http://www.w3.org/International/
> http://rishida.net/
> 
> 
> 
> 
> > -----Original Message-----
> > From: www-international-request@w3.org
> > [mailto:www-international-request@w3.org] On Behalf Of "Martin J. Dürst"
> > Sent: 27 October 2009 11:09
> > To: Ian Hickson
> > Cc: Simon Pieters; Divya Manian; Martin Kliehm; John Cowan;
> > <public-html@w3.org>; www-international@w3.org
> > Subject: Re: what's the language of a document ?
> > 
> > On 2009/10/27 19:37, Ian Hickson wrote:
> > > On Tue, 27 Oct 2009, Simon Pieters wrote:
> > >> This doesn't match what's specced for <meta http-equiv=content-language
> > >> content=foo,bar>.
> > >
> > > That's intentional, and is based on data about how people actually use
> > > that pragma.
> > 
> > There's always a way to justify inconsistent choices (be it browser
> > implementations, 'data' about how people (who?) use some feature (at
> > what point in time?),...). But it would be way better to be consistent.
> > 
> > And there is always a way to justify making choices that everybody
> > except those knowing all the details of the spec don't understand. But
> > it would be way better to make choices that are easy to understand (e.g.
> > http-equiv actually meaning what it says, namely "equivalent to the
> > corresponding HTTP header").
> > 
> > There are lots of cases where over time, people have come to a better
> > understanding of how things work. For stuff that authors/producers
> > aren't supposed to produce, I don't mind too much that HTML5 is
> > hopelessly complex and inconsistent. I can live without remembering it
> > all, and can tell others to avoid it. However, for stuff like the above,
> > which may be used even by very consciously clean developers, creating
> > inconsistencies such as the above is a heavy negative legacy.
> > 
> > Regards, Martin.
> > 
> > --
> > #-# Martin J. Dürst, Professor, Aoyama Gakuin University
> > #-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp
> 
> 
> 
 		 	   		  
Received on Thursday, 29 October 2009 18:48:14 UTC
