Two approaches for LTLI document from Felix Sasaki on 2006-06-07 (public-i18n-core@w3.org from April to June 2006)

From: Felix Sasaki <fsasaki@w3.org>
Date: Wed, 07 Jun 2006 11:01:05 +0900
To: "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <44863361.6060206@w3.org>

Below is a mail from Mark about the current draft of LTLI, see
http://www.w3.org/International/core/langtags/

Mark sent the mail to the member list first, see
http://lists.w3.org/Archives/Member/member-i18n-core/2006Jun/0003.html .
I am sending it to the public list to foster public discussion.

Regards,

Felix

----------------------------------------------------------------

Some comments on http://www.w3.org/International/core/langtags/

General. There are two possible approaches the document could take, which
are currently somewhat mixed together.

A. Languages and locales are different entities. While the same syntax
and values can be used to identify both (according to 3066bis), the
semantics are somewhat different. Thus protocols should have two
different fields: one for language and one for locale.

B. Languages and locales are different entities. While the same syntax
and values can be used to identify both (according to 3066bis), the
semantics are somewhat different. In practice, however, the same field
in a protocol can be and is typically used for both purposes. Protocols
requiring more information about locales than can be conveyed with the
language tag values should have an additional locale field.

We really need to decide which path we want to follow. If it is A, then
some changes are needed, such as an addition to 4.1 explaining when
and why two fields are necessary, and a conformance clause in 5 saying
that protocols SHOULD have two fields. If B (my preference), then 1.4
needs to be expanded to clarify the issues, and to note that a "joint"
field, when interpreted as a language, may be somewhat overspecified:
one might have ja-US with a specific region, even though as languages
ja-US and ja-JP may be essentially the same.
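
To make the difference between A and B concrete, here is a minimal
sketch in Python. The class and field names (RequestA, RequestB,
language, locale, effective_locale) are hypothetical illustrations and
are not taken from the LTLI draft or from any existing protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestA:
    """Approach A (hypothetical shape): two separate fields, always present."""
    language: str  # language tag, e.g. "ja"
    locale: str    # locale identifier with the same syntax, e.g. "ja-US"


@dataclass
class RequestB:
    """Approach B (hypothetical shape): one joint field, optional extra field."""
    language: str                 # joint field, used for language and locale
    locale: Optional[str] = None  # only set when more locale detail is needed

    def effective_locale(self) -> str:
        # Fall back to the joint field when no explicit locale was sent.
        return self.locale if self.locale is not None else self.language
```

Under B, a message carrying only "ja-US" in the joint field yields
"ja-US" as the effective locale, which is where the overspecification
as a language comes from.
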
====

    The current best practice when developing specifications for
language identification is to refer to [RFC 3066] , using a formulation
like "RFC 3066 or its successor". Recently a successor for [RFC 3066]
has been developed, called [RFC 3066bis]. This specification takes [RFC
3066bis] as the basis for language identification, and [RFC 3066bis
Matching] as the basis for matching of language identifiers ("tags").

    The current practice in many standards is to identify language in
terms of [RFC 3066], using formulations like " RFC 3066 or its
successor". Recently a successor for [RFC 3066] has been developed,
called [RFC 3066bis]. This specification takes [RFC 3066bis] as the
basis for language identification, and [RFC 3066bis Matching] as the
basis for matching of language tags.

[redundant, combine]

====

General: replace "Locale Values" with "Locale Identifiers"

====

    [RFC 3066bis] refers to language identification only. Locales can be
identified in several ways. One method is by inference from language
tags. For example, an implementation could map a language tag from an
existing protocol, such as HTTP's Accept-Language header, to its locale
model. Locales may also be identified directly by using the language tag
syntax in data items (elements, attributes, headers, etc.) that
explicitly serve the purpose of locale identification.

[This doesn't belong in the scope section, and doesn't clearly
distinguish between locales. Nor do we anywhere explicitly say that the
locale identifiers as defined in this document are syntactically -- and
in value, though not interpretation -- identical to language tags as
defined in 3066&succ, not until we get way down to 4.1. That should be
said early.]
=>

There are many different systems of locale identification. One method of
locale identification is to use the language tag values defined in [RFC
3066bis], although with a somewhat different semantic: that is the
method described in this document.

[RFC 3066bis] only refers to language identification, not locale
identification, and is used as such in existing protocols, such as
HTTP's Accept-Language header. However, it is quite common practice for
implementations to use the language tag fields in protocols as a
surrogate for a locale identifier, and thus infer a locale from the
language tag field. There are some cases where this approach is not
optimal, outlined in section 4.1.
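
The inference described above might look roughly like the following
Python sketch. It is an illustration under stated assumptions, not the
method of any particular implementation: the function names
(parse_accept_language, infer_locale) and the default of "en" are
hypothetical, and the header parsing is deliberately simplified (only
the q parameter is handled).

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Split an Accept-Language value into (tag, q) pairs, best first."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        q = 1.0  # a range without an explicit q has quality 1
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # assumption: treat a malformed q as unacceptable
        ranges.append((tag.strip(), q))
    # Stable sort: ranges with equal q keep their order in the header.
    ranges.sort(key=lambda pair: pair[1], reverse=True)
    return ranges


def infer_locale(header: str, default: str = "en") -> str:
    """Use the most preferred language tag as a surrogate locale identifier."""
    for tag, q in parse_accept_language(header):
        if q > 0 and tag != "*":
            return tag
    return default
```

For example, infer_locale("en;q=0.5, ja;q=0.9") picks "ja", since the
language field is the only locale information the server receives.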

[then other changes, according to whether we go with A or B]

Mark
Received on Wednesday, 7 June 2006 02:01:14 UTC