Re: ISSUE-88 / Re: what's the language of a document ?

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Thu, 11 Mar 2010 15:28:22 +0900
Message-ID: <4B988D86.8070405@it.aoyama.ac.jp>
To: Ian Hickson <ian@hixie.ch>
CC: Richard Ishida <ishida@w3.org>, www-international@w3.org, public-html@w3.org, "'Maciej Stachowiak'" <mjs@apple.com>, "Roy T. Fielding" <fielding@apache.org>
Regarding this issue, I'd suggest that everybody go back and read
http://lists.w3.org/Archives/Public/www-international/2010JanMar/0090.html 
and 
http://lists.w3.org/Archives/Public/www-international/2009OctDec/0025.html.

The Web is composed of servers and browsers (and proxies and other 
stuff). A good Web standard (as I hope HTML5 is on the way to becoming) 
has to consider all sides (not just browsers).

Also, the Web is composed of all kinds of content, from ad-hoc, 
once-made, then forgotten, to very well designed, organized, and 
administered (and all kinds of other stuff in between). Again, while 
it's very good to make sure the former is well-covered (and HTML5 does a 
very good job here, where XHTML didn't), it would be a very bad idea to 
throw out support for the latter kind of content even if the former kind 
of content may be in the majority in some statistics.


<meta http-equiv=... and in particular <meta 
http-equiv="Content-Language" is for servers, not clients. The fact that 
some pages (indeed maybe a majority of pages) contain incomplete (or 
wrong) information is a result of the fact that not all servers (in fact 
probably only a minority of servers, although the situation in 
well-controlled intranets may be somewhat different from the visible 
Internet) make use of this information. That's unfortunate, but it 
should not lead to a redefinition of this field that makes it impossible 
for servers which followed the standard up to now to continue to use it, 
even if that's only a minority of servers. These servers have very good 
reasons to use this field, be it for content management, for generating 
HTTP headers, for content negotiation, for managing translation 
workflow, or for some other server-side activity.
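
To make the server-side use concrete, here is a minimal sketch of how a 
server or content management tool might read the pragmas from a document 
and turn them into HTTP headers. It uses only Python's standard 
html.parser module. The names HttpEquivCollector, headers_from_html, and 
content_languages are made up for this illustration, and the sketch 
assumes that Content-Language carries a comma-separated list of language 
tags, as in the HTTP header of the same name. It does not describe any 
particular server.

```python
from html.parser import HTMLParser


class HttpEquivCollector(HTMLParser):
    """Collect (name, value) pairs from <meta http-equiv=... content=...>."""

    def __init__(self):
        super().__init__()
        self.headers = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("http-equiv")
        value = attr_map.get("content")
        # Skip <meta name=...>, <meta charset=...>, and pragmas without content.
        if name and value is not None:
            self.headers.append((name.strip(), value.strip()))


def headers_from_html(html_text):
    """Return every http-equiv pragma in the document, in document order."""
    collector = HttpEquivCollector()
    collector.feed(html_text)
    collector.close()
    return collector.headers


def content_languages(html_text):
    """Return the language tags declared via the Content-Language pragma.

    Assumption: the value is a comma-separated list of language tags,
    so a document may declare more than one language.
    """
    tags = []
    for name, value in headers_from_html(html_text):
        if name.lower() == "content-language":
            tags.extend(t.strip() for t in value.split(",") if t.strip())
    return tags
```

A server could call headers_from_html() when a file is published and 
send the resulting pairs as response headers, or use content_languages() 
to index documents for content negotiation or translation workflow.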

As a result, the HTML5 spec would do best to just say that <meta http-equiv 
is used primarily as meta-information on the server side and is 
therefore in general ignored on the client side. This would also apply 
to Content-Language. Of course, there are well-known exceptions, such as 
the charset part of Content-Type. Insofar as these are really used widely 
on the browser side and don't conflict with server-side use, they should 
be mentioned in HTML5.

Regards,    Martin.

On 2010/03/11 10:14, Ian Hickson wrote:
> On Wed, 24 Feb 2010, Richard Ishida wrote:
>>
>> It's significant that the thing we're calling the pragma is a use of a
>> <meta> element.
>
> It's the <meta> element for historical reasons. I don't think it's
> particularly significant to the discussion at hand,
>
>
>> It's metadata, and the view of the i18n WG is that it should be
>> available for use to specify metadata if you need to do so *in the
>> document*.
>
> If you need to specify the language, the lang="" attribute seems to
> provide a significantly better solution than this pragma. If it wasn't for
> compatibility with legacy content, I think the pragma would be best
> removed from the language altogether.
>
>
>> It's true that a lot of people misunderstood the use of this pragma in
>> the past, but that's what we're trying to clarify here (and btw I've
>> seen evidence that that is changing).
>
> Can you share this evidence? If people really are learning how to use this
> pragma, that changes matters significantly.
>
>
>> The i18n WG agrees that authors should be discouraged from using the
>> pragma for the purposes that the lang attribute should be used, but we
>> are also saying that its use should be *encouraged* for cases where you
>> want to specify metadata inside the document.
>
> Could you elaborate on what use cases this would be intended to address? I
> don't understand why authors would want to do this.
>
> Also, note that using http-equiv is not setting metadata. It's setting
> pragma directives for the user agent. If there is a solid use case here
> for document-wide metadata concerning languages, we can certainly handle
> it, but it would be best to handle it using the dedicated metadata
> mechanisms (<meta name>, microdata, RDFa, a dedicated attribute like
> lang="", or some other such mechanism.)
>
>
>> And if you are using this to specify metadata, you must allow for
>> multiple values.  What's more, changing the syntax of the pragma to
>> accept only one language is likely to only further confuse people, in
>> the opinion of the i18n WG, since it now appears to be more like the
>> lang attribute, and in addition, the behaviour is different to previous
>> versions of HTML, which further complicates explanations about how to
>> handle language in HTML.
>
> Previous versions of HTML did not match reality. As such, I don't think
> they're really relevant here.
>
> Reality is that the http-equiv="Content-Language" value is handled more or
> less as defined in HTML5. It does not provide metadata; it can't handle
> multiple values. When supported at all, it just sets the default for the
> lang="" attribute.
>
>
>> In addition, we are worried about the effect on legacy data of changing
>> the number of allowed language values for this meta element.  There may
>> not be much out there, but there may also be some, and we felt that this
>> is inconsistent with the efforts of the html folks to maintain backwards
>> compatibility in other areas.
>
> The goal of maintaining backwards compatibility in this case is exactly
> why multiple languages were dropped and why the meaning of this pragma was
> changed from the previous definition to the definition that matched actual
> usage and implementation.
>
>
>> This was why we wanted to talk with you at TPAC and go through the
>> proposals in
>> http://lists.w3.org/Archives/Public/public-html/2009Oct/1086.html (on
>> which the Change Proposal is based), and we left that meeting
>> understanding that you had agreed to the proposals.
>
> I thought I'd made the changes we agreed to -- apparently we didn't
> understand each other at that meeting!
>
>
>>> I recommend going through the normal process for these, by the way
>>> (using bugs and so forth) rather than jumping straight to the Change
>>> Proposal stage. It will help ensure that we keep issues focused.
>>
>> Actually we have been following the process.  Here is the original bug
>> report http://www.w3.org/Bugs/Public/show_bug.cgi?id=8088 which you
>> rejected.
>
> That bug has a much narrower focus than some of the changes you have
> proposed, as far as I can tell.
>

-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Thursday, 11 March 2010 06:29:27 GMT
