
RE: Summary of I18N discussion in HTML WG today

From: Phillips, Addison <addison@lab126.com>
Date: Mon, 19 Nov 2012 11:08:56 -0800
To: Cameron Jones <cmhjones@gmail.com>, Richard Ishida <ishida@w3.org>
CC: Mark Davis ☕ <mark@macchiato.com>, Robin Berjon <robin@w3.org>, "HTML WG (public-html@w3.org)" <public-html@w3.org>, "www-international@w3.org" <www-international@w3.org>
Message-ID: <131F80DEA635F044946897AFDA9AC34773A88981F5@EX-SEA31-D.ant.amazon.com>
Hello Cameron,

The point I was making during the F2F is that valid BCP-47 requires the language identifier to precede any extension attributes. Using just an extension attribute is not valid BCP-47; it would not conform to the parsing rules.

AP> You’re correct. However, you probably want the other information in the language tag when formatting a date control anyway. Setting the calendar to (for example) the Islamic calendar is just one bit of information you need to achieve correct formatting. You also need the language to use and the specific regional customs (first day of week, abbreviation formatting, week numbering, etc.).
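
A minimal sketch of the two points above, assuming a deliberately simplified tag grammar (language, optional script and region, then an optional "-u-" Unicode locale extension). It is not a full BCP 47 parser: variants, other extensions, private-use and grandfathered tags are ignored, and the name parse_tag is hypothetical. It shows that a bare extension such as "u-ca-islamic" is rejected, and that a complete tag carries language, region, and calendar together.

```python
import re

# Simplified pattern: language[-script][-region][-u-<keywords>].
# Assumption: anything outside this shape is treated as unparseable here,
# even though full BCP 47 allows more (variants, other extensions, etc.).
_TAG = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?:-u(?P<uext>(?:-[a-z0-9]{2,8})+))?$"
)


def parse_tag(tag):
    """Return the subtags as a dict, or None if the tag does not fit the pattern."""
    m = _TAG.match(tag.lower())
    if m is None:
        return None

    # Inside the "-u-" extension, a two-character subtag starts a new key
    # (for example "ca" for calendar); the longer subtags that follow are its value.
    keywords = {}
    key = None
    for sub in (m.group("uext") or "").split("-")[1:]:
        if len(sub) == 2:
            key = sub
            keywords[key] = []
        elif key is not None:
            keywords[key].append(sub)

    return {
        "language": m.group("language"),
        "script": m.group("script"),
        "region": m.group("region"),
        "keywords": {k: "-".join(v) for k, v in keywords.items()},
    }


print(parse_tag("de-u-ca-islamic"))   # language "de" plus calendar keyword
print(parse_tag("u-ca-islamic"))      # no language subtag, so no match
```

A formatter would read the calendar from keywords["ca"] and take the language and region from the same parsed tag, which is why the extension on its own is not enough.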

AP> Perhaps this would be a reason to change from calling it @calendar to something slightly more generic, like, say, @format?

BCP-47 does not limit calendars to certain languages, so you can use the Islamic calendar with any language set. In that regard BCP-47 has complete flexibility with regard to all localization identification.

AP> Richard’s point is that @lang is supposed to declare the language of the content and it applies to the whole of the element. What we are doing here is using an attribute to cause the user-agent to format some content for us and this might apply only to the one specific thing (the presentation of the calendar itself).

AP> When Mark and I started our work on the current BCP 47, one of our precepts was that a language tag and a locale identifier are the same thing and want to be interchangeable. The addition of the Unicode Locale extension allows language tags full expressiveness in this regard and, when this issue was first raised, my reaction was along the lines of: “the language attribute is a perfectly effective vehicle for conveying both the language of static text in an HTML document and for conveying the locale to use when formatting content inserted into that document.”

AP> When we discussed this in the WG call, though, Richard made the point that he is making in this thread: that there may be times when you want to separate a particular static text declaration from the locale formatting used in the control.

AP> Richard goes on to make the point that, in the absence of an @calendar value, the @lang ought to control the format of the calendar. That is, each of these three fragments has the same result for the calendar (assuming no intervening @lang values):

<html lang="de-u-ca-islamic">

<input lang="de-u-ca-islamic">

<input calendar="de-u-ca-islamic">

AP> … and, as Richard points out, that allows for the @lang to have one value while the @calendar does something else:

<input type=date lang=de calendar=ar-u-ca-islamic title="Abfahrtsdatum"><!-- Arabic calendar with German title -->
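
The fallback order described above can be sketched as follows. This is only an illustration of the proposal under discussion: @calendar is not an existing HTML attribute, and the function name effective_calendar_tag and its parameters are hypothetical. The assumption is that an explicit @calendar on the control wins, then the control's own @lang, then the nearest enclosing @lang.

```python
def effective_calendar_tag(ancestor_langs, lang=None, calendar=None):
    """Pick the tag that would govern calendar formatting for one control.

    ancestor_langs: lang values of the enclosing elements, outermost first,
                    with None where an element has no lang attribute.
    lang, calendar: attribute values on the control itself, if present.
    """
    # 1. An explicit calendar attribute on the control overrides everything.
    if calendar:
        return calendar
    # 2. Otherwise the control's own lang applies.
    if lang:
        return lang
    # 3. Otherwise inherit from the nearest ancestor that declares a lang.
    for value in reversed(ancestor_langs):
        if value:
            return value
    # No information in the markup; the caller would need some other default.
    return None


# The three equivalent fragments resolve to the same tag:
print(effective_calendar_tag(["de-u-ca-islamic"]))
print(effective_calendar_tag([None], lang="de-u-ca-islamic"))
print(effective_calendar_tag([None], calendar="de-u-ca-islamic"))

# The override case: German content, Islamic calendar formatted for Arabic.
print(effective_calendar_tag(["en"], lang="de", calendar="ar-u-ca-islamic"))
```

In the override case the element's language for static text such as the title would still come from @lang; only the formatting of the control follows the value returned here.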

AP> A good analogy for what Richard, I think, is looking for is the proposal we had about being able to have an ‘attrdir’ to set the base direction of an attribute separately from that of the element. In that case, you want to be able to do something like:

<p dir=rtl alt="english explanation here" attrdir=ltr>HEBREW TEXT HERE IS RTL.</p>

AP> … in which the direction of the attribute is different from the direction of the element’s contents. It is a rare case, but equivalent. (It’s such a rare case that we didn’t end up adding @attrdir to HTML.) The question is whether it is appropriate for language/locale.

I do not regard using the BCP-47 extensions as overloading; it is utilization.

AP> I think Richard meant that the @lang attribute’s meaning is overloaded, not that BCP 47 was being overloaded. I don’t happen to agree with him that it is overloaded, only that the nature of @lang (like @dir) is that it also affects enclosed elements/attributes and that this is occasionally a problem (one @lang with many targets).

AP> The missing counterargument goes something like: “what we need is a new general attribute @locale that can be applied in forms to format content and which has scope/behaves like @lang”. I object to that because @lang already exists and is suitable for that job. Adding @locale would be confusing and pointless. It is rare to want the language of the document and the locale of the document’s locally applied formatting to be different: usually it is wrong when that happens.

That said, I would happily agree that, if information from a calendar attribute was unavailable, the browser could *guess* the locale from information provided by a lang attribute.  But that would just be a fallback.


The most useful aspect of using the standard @lang attribute is that the resolution process is defined within the context of the elements, pragma, and HTTP header. This allows localization information such as the calendar to be set once at the most suitable level and with customary defaults.

AP> I agree. I just don’t think that what Richard suggests is inconsistent with this: it allows setting the “locale” of the document at the right level (using @lang), with a specific override on the control that wants it. The question is whether it adds needless complexity to the markup or would be confusing.


Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N WG)

Internationalization is not a feature.
It is an architecture.

Received on Monday, 19 November 2012 19:18:08 GMT
