What's the language of a document?

One language vs. languages...

1. Suppose you have an HTML5 document instance (HTML4 works too) that
    contains no language information, because the language information
    is provided by the HTTP server. For our example, the server is
    serving fully bilingual French/Italian documents.

2. RFC 1945 (HTTP/1.0) says the Content-Language header field
    describes the languages of the intended audience. Excerpt:

      The Content-Language entity-header field describes the natural
      language(s) of the intended audience for the enclosed entity. Note
      that this may not be equivalent to all the languages used within
      the entity.

    In our case, the Content-Language value could then be

      fr,fr-be,fr-ch,fr-ca,it-it,it-ch

    or an even longer one.

3. The user speaks English and is browsing a page on that server.

What is the language of the document? Section 3.3.3.3 of HTML5
says "language information from a higher-level protocol (such as HTTP),
if any, must be used as the final fallback language". Ah, OK, cool.
So which one is it: fr, fr-be, fr-ch, fr-ca, it-it or it-ch?
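
To make the ambiguity concrete, here is a minimal Python sketch of what
a consumer faces when it tries to derive one fallback language from that
header value. The function names are hypothetical, and the rule "use the
header only if it names exactly one tag" is my own assumption for
illustration, not something taken from either HTML spec.

```python
def parse_content_language(header_value):
    """Split a Content-Language value into lowercase language tags."""
    return [tag.strip().lower() for tag in header_value.split(",") if tag.strip()]


def fallback_language(header_value):
    """Return the tag if the header names exactly one, otherwise None.

    Assumption (not from the specs): a list of several tags is treated
    as ambiguous, so no single fallback language can be derived from it.
    """
    tags = parse_content_language(header_value)
    return tags[0] if len(tags) == 1 else None


tags = parse_content_language("fr,fr-be,fr-ch,fr-ca,it-it,it-ch")

# Collapse regional variants to their primary subtag: two languages remain.
primary = sorted({tag.split("-")[0] for tag in tags})
```

Even after collapsing the regional variants, `primary` still holds two
distinct languages, so the header alone cannot answer the question for a
bilingual document.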

Do the HTML specs (both 4 and 5) confuse the language of the
intended audience with the language of the served instance?

http://www.ietf.org/rfc/rfc1945.txt

</Daniel>

Received on Thursday, 13 November 2008 14:51:04 UTC