Re: what's the language of a document?

On Thu, 13 Nov 2008, Daniel Glazman wrote:
> 
> One language vs. languages...
> 
> 1. suppose you have an HTML5 (HTML4 works too) document instance that
>    contains no language information because the language information
>    is provided by the HTTP server. For our example, the server is
>    serving fully bilingual French/Italian documents.
> 
> 2. the HTTP/1.0 RFC 1945 says the Content-Language header field
>    represents "the languages of the intended audience". Excerpt:
> 
>      The Content-Language entity-header field describes the natural
>      language(s) of the intended audience for the enclosed entity. Note
>      that this may not be equivalent to all the languages used within
>      the entity.
> 
>    In our case, the content-language could then be
> 
>      fr,fr-be,fr-ch,fr-ca,it-it,it-ch
> 
>    or even a longer one.
> 
> 3. the user speaks English. He's browsing one page of that server.
> 
> What is the language of the document?

The unknown language.


> Section 3.3.3.3 of HTML5 says "language information from a higher-level 
> protocol (such as HTTP), if any, must be used as the final fallback 
> language". Ah, ok cool. So fr, fr-be, fr-ch, fr-ca, it-it or it-ch?

As you point out, the Content-Language header in HTTP doesn't provide this 
information. Note, however, that based on legacy content practices, the 
<meta http-equiv="Content-Language"> element _does_ set the language:

   http://www.whatwg.org/specs/web-apps/current-work/#document-wide-default-language

(Some people have complained about this and this might still change.)
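
To make the distinction concrete, here is a rough Python sketch of the behaviour described in this reply. It is not the spec's algorithm. The function names (audience_languages, document_language) are made up for illustration, the sketch assumes the element is written with http-equiv="Content-Language" (the spelling the HTML5 draft uses for this pragma), it ignores lang attributes entirely, and it represents the unknown language as an empty string.

```python
from html.parser import HTMLParser


class _MetaLanguageFinder(HTMLParser):
    """Records the content of the first <meta http-equiv="Content-Language">."""

    def __init__(self):
        super().__init__()
        self.language = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.language is not None:
            return
        attributes = dict(attrs)
        # Attribute names arrive lowercased; values keep their case.
        if (attributes.get("http-equiv") or "").lower() == "content-language":
            self.language = (attributes.get("content") or "").strip()


def audience_languages(header_value):
    """Split an HTTP Content-Language value into its language tags.

    Per the RFC 1945 excerpt quoted above, these describe the intended
    audience, not necessarily the languages used in the document.
    """
    return [tag.strip() for tag in header_value.split(",") if tag.strip()]


def document_language(html, http_content_language=None):
    """Return the document-wide default language, or "" if unknown.

    Assumption for this sketch: the HTTP header is deliberately ignored,
    and the meta value is returned verbatim without validation.
    """
    finder = _MetaLanguageFinder()
    finder.feed(html)
    return finder.language or ""


if __name__ == "__main__":
    header = "fr,fr-be,fr-ch,fr-ca,it-it,it-ch"
    print(audience_languages(header))
    print(repr(document_language("<html><body>Bonjour</body></html>", header)))
```

With the bilingual example from the original message, the header still parses into six audience tags, but the document itself stays in the unknown language unless it carries the meta element.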

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Thursday, 13 November 2008 18:21:02 UTC