Re: For review: 1 new and 3 updated articles about language declarations in HTML

Richard Ishida, Thu, 18 Aug 2011 16:47:20 +0100:

> 4    HTTP and meta for language information
>      http://www.w3.org/International/tutorials/new-language-decl/qa-http-and-lang

> 
>     This is a reworking of an existing article [3] to reflect recent 
> developments in HTML5 and improve the fit with other pages listed 
> here. It will replace the old version.

> [3] http://www.w3.org/International/questions/qa-http-and-lang


Here is my long review of the 4th document:

[ Review comment 0: ] Certain details.

* Instead of 'Content-Language', say 'Content-Language:' (with colon) 
  whenever the HTTP header is meant. (But not when the value of the
  @http-equiv attribute is referred to.)
* Mark it up like this: <code class="htmL">Content-Language:</code>
* Standardize the ways to refer to the http-equiv="Content-Language" 
  meta element. 

  In that regard: "Content-Language value on a meta element", which
  is used several times, feels awkward and doesn't really convey very
  much to the reader, unless he/she *knows* - beforehand - that the
  @http-equiv attribute also belongs in this picture and so on.

  Suggestions for referring to it in prose:
  1) The <meta http-equiv='Content-Language' content='foo' /> element
  2) The http-equiv='Content-Language' meta element
  3) The Content-Language value of the @http-equiv attribute of the
     meta element.
  4) [If you refer to how to use it:]
     [Setting] Content-Language with/using the meta element.


[ Review comment 1: ]

   ]] HTTP and meta for language information [[

Deconstruction to show that that title doesn't really mean very much:

  1 a) HTTP
  1 b) and meta
  2)   for
  3)   language information
  
The subsequent question doesn't make it any clearer:

   ]] what are HTTP and meta language declarations for [[

That question seems to be much broader than what you probably intended 
to ask ... The question would be clearer if it went like so (the markup 
is just a suggestion):

   """ what is the purpose of the language info from the HTTP 
       <code class=HTTP>Content-Language:</code> header as well as
       the <code class=HTML>http-equiv=Content-Language</code>
       meta element """

As for the article's title, I would suggest:

  """ The HTTP Content-Language: header versus the 
      http-equiv="Content-Language" meta element """

[ Review comment 2: ] The 'background' section

* It is confusing that this section *starts* by showing how to use 
@lang and xml:@lang, when this is *not* the subject of the article. I 
suggest deleting the first paragraph (the para before the example code) 
and instead altering the beginning of the next paragraph, which 
currently begins like this:

   ]] But it is also possible to find [[

And instead let it begin like this:

   """ Apart from the @lang and xml:@lang attributes (see: [link: 
qa-html-language-declarations] ), it is also possible to find ... """

[ Review comment 3: ] The 'quick answer' section

* say """the HTTP Content-Language: header""". Because: You would 
likewise say 'the HTML div element'. You would *not* say 'the div HTML 
element'.

* say """The Content-Language value of the http-equiv attribute 

Regarding this:

 ]] The Content-Language value on a meta element should no longer be 
used. You should use a language attribute on the html tag to declare 
the primary language of the text in the page.[[

* The phrase "should no longer be used" hangs in the air. It is HTML5 
which forbids it - so this should be said. 
* Also, the second sentence makes it seem as if the purpose of this 
article is not to explain what the HTTP Content-Language: header is for 
but rather to explain the status of the META element: use @lang instead.

I suggest this text:

    """ In HTML5, the Content-Language value of the @http-equiv 
attribute of the meta element has (along with most other values for 
this attribute) been made obsolete. This means that if you used to 
use @http-equiv to declare the language, HTML5 now requires you to get 
rid of that habit and instead use the @lang/xml:@lang attribute on the 
html element - see link to article-such-and-such. Whereas if your 
intention was to declare and use the HTTP Content-Language: header, you 
can concentrate on learning to configure the Web server you use - see 
link to content-negotiation article. """
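
To make the "get rid of that habit" advice concrete, here is a minimal 
Python sketch of how an author might spot the obsolete meta element in 
a page and see whether a lang attribute is already in place. The class 
and function names are my own invention for illustration, not something 
from the article or from any validator; only the standard library's 
html.parser module is used.

```python
from html.parser import HTMLParser


class LanguageDeclarationFinder(HTMLParser):
    """Hypothetical helper: records how a page declares its language."""

    def __init__(self):
        super().__init__()
        # content values found on <meta http-equiv="Content-Language">
        self.meta_languages = []
        # value of the lang attribute on the html element, if present
        self.html_lang = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us
        attributes = dict(attrs)
        if tag == "html":
            self.html_lang = attributes.get("lang")
        elif tag == "meta":
            http_equiv = (attributes.get("http-equiv") or "").lower()
            if http_equiv == "content-language":
                self.meta_languages.append(attributes.get("content") or "")


def find_language_declarations(markup):
    """Return (lang attribute on html, list of meta Content-Language values)."""
    finder = LanguageDeclarationFinder()
    finder.feed(markup)
    finder.close()
    return finder.html_lang, finder.meta_languages


old_style = ('<html><head><meta http-equiv="Content-Language" '
             'content="en" /></head><body></body></html>')
new_style = '<html lang="en"><head></head><body></body></html>'

print(find_language_declarations(old_style))
print(find_language_declarations(new_style))
```

A page for which the second item in the result is non-empty still 
relies on the meta element; moving that value to the lang attribute of 
the html element is the change the suggested text asks for.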

[ Review comment 4: ] #answer

* This is about the new template (hence it is about all the articles): 
It is hard to tell 'Quick Answer' and 'Answer' apart. I suggest 
renaming 'Answer' to 'Longer answer', 'In-depth answer' or something 
like that.

With regard to the introductory paragraph:

]]
To begin to answer the question at the top of this page, it is important
to first draw a distinction between 
   (1) specifying the language used for processing content, and 
   (2) using metadata to identify the audience for the document.
[[

* Since the topic is HTTP Content-Language:, I think metadata should 
be number (1) and processing should be number (2). This is also the 
actual order of the content, when we look at the subsequent headings ...

* If you use "(1)" and "(2)" then you should repeat those numbers in 
the subsequent headings. Thus:
 ""(1) Specifying file metadata: the language of the intended audience""

* Since the subsequent heading talks about 'file metadata' (which is a 
very good and telling expression), it would be good to also mention 
'file metadata' in the introductory paragraph (and not only mention 
'audience language').

* The 'to begin' phrase is, to me, weasel wording which also suggests 
that this matter is very difficult to understand. I suggest rather 
saying something like this: "The in-depth answer requires an 
understanding of the difference between"

[ Review comment 5: ] #metadata section

I miss a perspective here: A document might be intended for an audience 
which does not actually consider itself to use the language of the 
article. E.g. a Serbian language document might be intended for 
speakers of any of the languages formerly known as Serbo-Croat. 
Currently the text mentions that the document might contain languages 
that the target audience might not understand (Chinese for a German 
speaking audience). I am concerned about the case when the target 
audience is broader than the document's language could be taken to 
hint. In one way, the Canadian French/English article can be said to 
cover my use case - after all, such a document must use a single 
language tag on the HTML root element ... From the Norwegian context, a 
Bokmål text usually targets Nynorsk users as well (and vice versa) - 
only in the public sphere at state level, where one has the right to 
receive either Bokmål or Nynorsk according to one's preference, would 
one tend to think 'Bokmål for Bokmål users, Nynorsk for Nynorsk 
users'. 

[ Review comment 6: ]  #processing section

*Here* it would be good to point out that one should use 
@lang/xml:@lang for this. And to state that this is *not* 
Content-Language's intended purpose.

[ Review comment 7: ] #meta  section 'Content-Language on the meta 
element'

   * Please move this section *after* the Content-Language: HTTP 
section ... after all, it is HTTP that is recommended - META is not 
recommended!
   * Please let the title say: 
     '[Setting] Content-Language *using* a meta element (not 
recommended)'
   * Given that it is not recommended [in HTML5], it would be good to 
draw attention to this.
   * You touch on the subject of the degree to which browsers use the 
META element to determine the text processing language. However, I 
suggest that you take that subject *out* of this section (and out of 
the #http section as well) and instead place it in a separate section 
called "Other effects of the Content-Language: header" or something 
like that. Because, the thing is that Content-Language: is used by 
browsers regardless of whether it comes from HTTP or from http-equiv. 
Well, http-equiv is used to a larger degree, but nevertheless: it would 
be more fitting to treat this subject in one place, I think.
   * You claim that HTML5 "make a concession for backwards 
compatibility". Does it? I have not recognized that. FIRSTLY, the link 
you have included points to the 'pragma-set default language', but it 
does *not* link to the place where HTML5 explains how the browser 
*makes use* of the pragma-set default language for determining the 
language of the document! That explanation is found in the lang 
attribute section: http://www.w3.org/TR/html5/elements.html#language 
And there you can read that 
   ]]
      If there is no pragma-set default language set, then 
      language information from a higher-level protocol (such as
      HTTP), if any, must be used as the final fallback language
      instead.
   [[
     Does this mean that HTML5 has made a 'concession' to include HTTP 
when determining the language? Please delete this 'concession' talk. It 
does not make sense. There is no concession. After all: We both took 
part in the process which obsoleted the entire 
http-equiv="Content-Language" meta element! (What you refer to as a 
concession is more a recognition of what browser reality is.)
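
The fallback order that the quoted passage describes can be written 
down as a short Python sketch. The function and parameter names are 
made up for illustration. It assumes each source yields at most one 
language tag and that None means "nothing declared", and it leaves out 
every detail of the HTML5 algorithm that the quote above does not 
mention.

```python
def fallback_language(lang_attribute, pragma_default, http_language):
    """Return the language a browser would settle on, or None if unknown.

    lang_attribute -- nearest lang/xml:lang value in scope, or None
    pragma_default -- the pragma-set default language, or None
    http_language  -- language from a higher-level protocol such as the
                      HTTP Content-Language: header, or None
    """
    # 1. An explicit lang attribute always wins.
    if lang_attribute is not None:
        return lang_attribute
    # 2. Otherwise the pragma-set default language is used, if there is one.
    if pragma_default is not None:
        return pragma_default
    # 3. Only then is the higher-level protocol the final fallback.
    return http_language


print(fallback_language(None, None, "nb"))
print(fallback_language(None, "nn", "nb"))
print(fallback_language("en", "nn", "nb"))
```

Seen this way, HTTP is simply the last step of the same lookup, which 
is why treating either step as a special 'concession' does not fit.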

[ Review comment 8: ] #http Content-Language in an HTTP header

   * Move this section before the META element section.
   * Point out that this is recommended over using META
   * Please let the title say: 
    '[Setting] Content-Language using an HTTP header'
   * Like I said above: please move the fallback features (the "when no 
other language info is present" issues) to a separate section.
   
[ Review comment 9: ] Describe the advantages of real HTTP 
Content-Language: headers (over meta element variant)

* I consider that the purpose of the article should be to give info 
about the purpose of the Content-Language: header rather than dwelling 
on side effects of the http-equiv variant. As such the article should 
say that Content-Language with the META element cannot be used for 
anything but its side effects. (In that regard: using it as a way for 
human coders to inspect the content language is a side effect.)  
Whereas a real Content-Language: header can be used for - at least - 
content negotiation *as well as* for its side effects. This is 
currently not mentioned at all.

* However, even without going into negotiation, one can use Apache's 
AddLanguage directive to set the content language(s) via a file 
suffix, and thereby set the HTTP Content-Language: header. Why not 
provide a link to the article where you explain that subject: 
<http://www.w3.org/International/questions/qa-apache-lang-neg>? Though 
-  that page focuses only on *negotiation*. However, it would be 
possible to use file suffixes to provide 'file metadata' that is useful 
also to the author. And, btw: though the Apache docs do not mention 
it (http://httpd.apache.org/docs/2.3/mod/mod_mime.html#addlanguage), 
one can create a file suffix which sets multiple languages at once:

* For that matter - reality check: WebKit doesn't believe in 
Content-Language: ... https://bugs.webkit.org/show_bug.cgi?id=3510#c27
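
To illustrate the first point in this comment, that a real 
Content-Language: header can take part in content negotiation, here is 
a rough Python sketch. Everything in it is an assumption made for the 
example: the function names, the two available variants (nb and nn) 
and the default. It only does exact tag matching on the request's 
Accept-Language: value, so it is far simpler than what a real server 
such as Apache does.

```python
def parse_accept_language(header):
    """Return the language ranges of an Accept-Language value, best first."""
    ranges = []
    for item in header.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a range without q= counts as q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q as "not acceptable"
        ranges.append((quality, tag))
    # sorted() is stable, so equal q-values keep the order the client sent
    ordered = sorted(ranges, key=lambda pair: -pair[0])
    return [tag for quality, tag in ordered if quality > 0]


def choose_variant(accept_language, available, default):
    """Pick the first acceptable language that exists (exact match only)."""
    by_lowercase = {tag.lower(): tag for tag in available}
    for wanted in parse_accept_language(accept_language):
        if wanted in by_lowercase:
            return by_lowercase[wanted]
    return default


def response_headers(accept_language):
    """Headers for a hypothetical page that exists in Bokmål and Nynorsk."""
    language = choose_variant(accept_language, ["nb", "nn"], default="nb")
    return [("Content-Language", language), ("Vary", "Accept-Language")]


print(response_headers("nn, nb;q=0.8"))
print(response_headers("en, nb;q=0.5"))
```

A meta element inside the document cannot play this role, since the 
server has to choose the variant before any markup is sent.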

-- 
Leif H Silli

Received on Monday, 22 August 2011 20:35:12 UTC