Re: For review: 1 new and 3 updated articles about language declarations in HTML

On 22/08/2011 21:34, Leif Halvard Silli wrote:
>> [3] http://www.w3.org/International/questions/qa-http-and-lang
>
> Here is my long review of the 4th document:
>
> [ Review comment 0: ] Certain details.
>
> * Instead of 'Content-Language', say 'Content-Language:' (with colon)
>    whenever the HTTP header is meant. (But not when the value of the
>    @http-equiv attribute is referred to.)
> * Mark it up like this: <code class="htmL">Content-Language:</code>
> * Standardize the ways to refer to the http-equiv="Content-Language"
>    meta element.
>
>    In that regard: "Content-Language value on a meta element", which
>    is used several times, feels awkward and doesn't really convey very
>    much to the reader, unless he/she *knows* - beforehand - that the
>    @http-equiv attribute also belongs in this picture and so on.
>
>    Suggestions for referring to it in prose:
>    1) The <meta http-equiv='Content-Language' content='foo' /> element
>    2) The http-equiv='Content-Language' meta element
>    3) The Content-Language value of the @http-equiv attribute of the
>       meta element.
>    4) [If you refer to how to use it:]
>       [Setting] Content-Language with/using the meta element.

I basically standardised on "the meta element with the http-equiv 
attribute set to Content-Language".
>
>
> [ Review comment 1: ]
>
>     ]] HTTP and meta for language information [[
>
> Deconstruction to show that that title doesn't really mean very much:
>
>    1 a) HTTP
>    1 b) and meta
>    2)   for
>    3)   language information
>
> The subsequent question doesn't make it any clearer:
>
>     ]] what are HTTP and meta language declarations for [[
>
> That question seems to be much broader than what you probably intended
> to ask ... The question would be clearer if it went like so (the markup
> is just a suggestion):
>
>     """ what is the purpose of the language info from the HTTP
>         <code class=HTTP>Content-Language:</code> header as well as
>         the <code class=HTML>http-equiv=Content-Language</code>
>         meta element """
>
>   When it comes to the article's title, then I would suggest:
>
>    """ The HTTP Content-Language: header versus the
>        http-equiv="Content-Language" meta element """


Hmm. I'm not convinced. Your suggestions don't actually say what I intend
(e.g. there's no "versus" implied), and I'm not particularly worried that
the title is vague.  I'm going to punt on this.

>
> [ Review comment 2: ] The 'background' section
>
> * It is confusing that this section *starts* by showing how to use
> @lang and xml:@lang, when this is *not* the subject of the article. I
> suggest deleting the first paragraph (the para before the example code)
> and instead alter the beginning of the next paragraph, which currently
> begins like this:
>
>     ]] But it is also possible to find [[
>
> And instead let it begin like this:
>
>     """ Apart from the @lang and xml:@lang attribute (see: [link:
> qa-html-language-declarations] ), it is also possible to find ... """


Done.


>
> [ Review comment 3: ] The 'quick answer' section
>
> * say """the HTTP Content-Language: header""". Because: You would
> likewise say 'the HTML div element'. You would *not* say 'the div HTML
> element'.
>
> * say """The Content-Language value of the http-equiv attribute

Done.

>
> Regarding this:
>
>  ]] The Content-Language value on a meta element should no longer be
> used. You should use a language attribute on the html tag to declare
> the primary language of the text in the page.[[
>
> * The phrase "should no longer be used" hangs in the air. It is HTML5
> which forbids it - so this should be said.
> * Also, the second sentence makes it seem as if the purpose of this
> article is not to explain what HTTP Content-Language: is for but rather
> to clearly explain the status of the META element: use @lang instead.

The purpose of the article is to explain what *both* the HTTP header and
the meta element are for and how to use them (see the question).

What goes for HTML5 is supposed to inform how other flavours of HTML are
used going forward.  Just because you can use the meta element with
HTML4 doesn't make it a good idea to do so, for the same reasons that it
was dropped from HTML5.

>
> I suggest this text:
>
>      """ In HTML5, the Content-Language value of the @http-equiv
> attribute of the meta element has (along with most other values for
> this attribute) been made obsolete. This means that if you used to
> use @http-equiv to declare the language, HTML5 now requires you to get
> rid of that habit and instead use the @lang/xml:@lang attribute on the
> html element - see link to article-such-and-such. Whereas if your
> intention was to declare and use the HTTP Content-Language: header, you
> can concentrate on learning to configure the Web server you use - see
> link to content-negotiation article. """
>
> [ Review comment 4: ] #answer
>
> * This is about the new template (hence it is about all the articles):
> It is hard to tell the difference between 'Quick Answer' and 'Answer'. I
> suggest renaming 'Answer' to 'Longer answer', 'In-depth answer' or
> something like that.

Yes. I've been thinking along similar lines, but haven't yet found a 
good formulation.  'In-depth answer' is not bad.  I was thinking about 
'Detailed answer', or 'Further details' or something along those lines.

>
> With regard to the ingress:
>
> ]]
> To begin to answer the question at the top of this page, it is important
> to first draw a distinction between
>     (1) specifying the language used for processing content, and
>     (2) using metadata to identify the audience for the document.
> [[
>
> * Since the topic is HTTP Content-Language:, I think metadata should
> be number (1) and processing should be number (2). This is also the
> actual order of the content, when we look at the subsequent headings ...

Done.

>
> * If you use "(1)" and "(2)" then you should repeat those numbers in
> the subsequent headers. Thus:
>   ""(1) Specifying file metadata: the language of the intended audience""
>
> * Since the subsequent heading talks about 'file metadata' (which is a
> very good and telling expression), it would be good to also mention 'file
> metadata' in the ingress (and not only mention 'audience language').

Done.

>
> * The 'to begin' phrase to me is weasel wording which indicates, as
> well, that this matter is very difficult to understand. I suggest to
> rather say something like this: "The in-depth answer requires an
> understanding of the difference between"

Changed to "Before we answer..."


>
> [ Review comment 5: ] #metadata section
>
> I miss a perspective here: A document might be intended for an audience
> who do not actually consider themselves to use the language of the
> article. E.g. a Serbian language document might be intended for
> speakers of any of the languages formerly known as Serbo-Croat.
> Currently the text mentions that the document might contain languages
> that the target audience might not understand (Chinese for German
> speaking audience). I am concerned about the case when the target
> audience is broader than the document's language could be taken to
> hint. In one way, the Canadian French/English article can be said to
> cover my use case - after all, such a document must use a single
> language tag on the HTML root element ... From the Norwegian context, a
> Bokmål text usually targets Nynorsk users as well (and vice versa) -
> only in the public sphere at state level, where one has the right to
> receive either Bokmål or Nynorsk according to one's preference, would
> one tend to think 'Bokmål for Bokmål users, Nynorsk for Nynorsk
> users'.

I'm not sure I understand your point, but we're talking about metadata
here, so if you know that your document is targeted at specific language
communities you can list them all in the HTTP header.  (The lang
attribute has no relevance here, of course.)
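
As a minimal sketch of listing several audience languages in one header: HTTP allows the Content-Language value to be a comma-separated list of language tags. The helper name below is hypothetical, and the tags sr, hr and bs are only an illustration chosen to echo the Serbian example in the quoted comment.

```python
def content_language_header(tags):
    """Build a Content-Language header line from a list of language tags.

    Hypothetical helper for illustration; it does no validation of the
    tags beyond dropping empty entries.
    """
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    if not cleaned:
        raise ValueError("at least one language tag is required")
    # Multiple audience languages are joined into one comma-separated value.
    return "Content-Language: " + ", ".join(cleaned)


print(content_language_header(["sr", "hr", "bs"]))
```

In practice the header would normally be set through server configuration rather than assembled by hand; the point is only that one document can name several audience languages.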


>
> [ Review comment 6: ]  #processing section
>
> *Here* it would be good to point out that one should use
> @lang/xml:@lang for this.

Done.

> And to state that this is *not*
> Content-Language's intended purpose.

We do that in the next section, so I don't want to do it here.

>
> [ Review comment 7: ] #meta  section 'Content-Language on the meta
> element'
>
>     * Please move this section *after* the Content-Language: HTTP
> section ... after all, it is HTTP which is recommended to use - META is
> not recommended!
>     * Please let the title say:
>       '[Setting] Content-Language *using* a meta element (not
> recommended)'

Done.

>     * Given that it is not recommended [in HTML5], it would be good to
>       draw attention to this.
>     * You touch on the subject of the degree to which browsers are using
> the META element to determine the text processing language. However, I
> suggest that you take that subject *out* of this section (and out of
> the #http section as well) and instead place it in a separate section
> called 'Other effects of the Content-Language: header' or something
> like that. Because, the thing is that Content-Language: is used by
> browsers regardless of whether it comes from HTTP or from http-equiv.
> Well, http-equiv is used to a larger degree, but nevertheless: it would
> be more fitting to treat this subject in one place, I think.

I'm only making that point to clarify the history related to the meta 
element.  Otherwise it's not actually relevant, since we are telling 
people to use the lang/xml:lang attribute rather than rely on what the 
browsers do with meta/http.

>     * You claim that HTML5 "make a concession for backwards
> compatibility". Does it? I have not recognized that. FIRSTLY, the link
> you have included points to the 'pragma-set default language', but it
> does *not* link to the place where HTML5 explains how the browser does
> *make use* of the pragma-set default language for determining the
> language of the document! That explanation is found in the lang
> attribute section: http://www.w3.org/TR/html5/elements.html#language
> And there you can read that
>     ]]
>        If there is no pragma-set default language set, then
>        language information from a higher-level protocol (such as
>        HTTP), if any, must be used as the final fallback language
>        instead.
>     [[
>       Does this mean that HTML5 has made a 'concession' to include HTTP
> when determining the language? Please delete this 'concession' talk. It
> does not make sense. There is no concession. After all: We both took
> part in the process which obsoleted the entire
> http-equiv="Content-Language" meta element! (What you refer to as a
> concession is more a realization of how browser reality is.)

Going forward, authors should use the lang/xml:lang attribute to set the 
default text-processing language for the page.  As soon as they do that, 
pragma-set default languages and final fallbacks from http are no longer 
relevant, since the attribute has higher precedence.

As I understand it, the only reason that fallback mechanism was left in 
the spec was as a means of dealing with legacy pages that don't use the 
attribute - which is a concession.  No one should be relying on the use 
of the fallback mechanism in the future - they should be using the 
attribute.
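
The precedence described above can be sketched roughly as follows. This is a simplified illustration, not the HTML5 algorithm itself: the function and parameter names are hypothetical, and details such as lang values inherited from ancestor elements, or the spec's rules for when a pragma-set default language is actually set, are left out.

```python
def default_text_language(html_lang=None, pragma_default=None, http_fallback=None):
    """Return the default text-processing language for a page.

    Simplified sketch with hypothetical parameter names:
      html_lang      - value of lang/xml:lang on the html element, if present
      pragma_default - language from a meta http-equiv Content-Language element
      http_fallback  - language from the HTTP Content-Language: header
    """
    # 1. The lang/xml:lang attribute has the highest precedence.
    if html_lang is not None:
        return html_lang
    # 2. Legacy pages without the attribute: pragma-set default language.
    if pragma_default is not None:
        return pragma_default
    # 3. Final fallback: language information from HTTP.
    if http_fallback is not None:
        return http_fallback
    # Nothing declared: the language is unknown.
    return None


print(default_text_language(html_lang="fr", pragma_default="de", http_fallback="en"))
print(default_text_language(pragma_default="de", http_fallback="en"))
```

Once the attribute is present, the other two inputs never affect the result, which is why they only matter for legacy pages.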


>
> [ Review comment 8: ] #http Content-Language in an HTTP header
>
>     * Move this section before the META element.
>     * Point out that this is recommended over using META
>     * Please let the title say:
>      '[Setting] Content-Language using a HTTP header'
>     * Like I said above: please move the fallback features (the "when no
> other language info is present" issues) to a separate section.

See the comments above.
>
> [ Review comment 9: ] Describe the advantages of real HTTP
> Content-Language: headers (over meta element variant)
>
> * I consider that the purpose of the article should be to give info
> about the purpose of the Content-Language: header rather than dwelling
> on side effects of the http-equiv variant.

That was not the intention of the article.

> As such the article should
> tell that Content-Language with the META element can not be used for
> anything but its side effect. (In that regard: using it as a way for
> human coders to inspect the content language, is a side effect.)
> Whereas a real Content-Language: header can be used for - at least -
> content negotiation *as well as* for its side effects. This is
> currently not mentioned at all.
>
> * However, even without going into negotiation, one can use Apache's
> AddLanguage directive to set the content language(s) with a file
> suffix in order to set the HTTP Content-Language: header. Why not
> provide a link to the article where you explain that subject:
> <http://www.w3.org/International/questions/qa-apache-lang-neg>. Though
> -  that page focuses only on *negotiation*. However, it would be
> possible to use file suffixes to provide 'file metadata' that is useful
> also to the author. And, btw: though the Apache docs do not mention
> it (http://httpd.apache.org/docs/2.3/mod/mod_mime.html#addlanguage),
> one can create a file suffix which sets multiple languages at once:
>
> * For that matter - reality check: WebKit doesn't believe in
> Content-Language: ... https://bugs.webkit.org/show_bug.cgi?id=3510#c27

I think you are confusing Content-Language with Accept-Language.
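
To make the distinction concrete, here is a rough sketch: Accept-Language is a request header in which the client states its preferences, while Content-Language is a response header describing the document that was served. The function name, the sample header value and the set of available languages are all assumptions for illustration, and the matching is deliberately naive (exact tag match only, no prefix matching).

```python
def choose_language(accept_language, available, default):
    """Pick a response language from a request's Accept-Language value.

    Naive, hypothetical sketch: exact tag matches only.
    """
    ranked = []
    for item in accept_language.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranked.append((quality, tag))
    # Highest weight first; sorted() is stable, so ties keep request order.
    for quality, tag in sorted(ranked, key=lambda pair: -pair[0]):
        if quality > 0 and tag in available:
            return tag
    return default


# Request side: the client states what it would like to receive.
request_headers = {"Accept-Language": "nn, nb;q=0.8, en;q=0.5"}
chosen = choose_language(request_headers["Accept-Language"], {"nb", "en"}, "en")

# Response side: the server describes what it actually sent.
response_headers = {"Content-Language": chosen}
print(response_headers)
```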

RI


-- 
Richard Ishida
Internationalization Activity Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/


Register for the W3C MultilingualWeb Workshop!
Limerick, 21-22 September 2011
http://multilingualweb.eu/register

Received on Friday, 2 September 2011 13:52:26 UTC