- From: Paul Grosso <pgrosso@arbortext.com>
- Date: Wed, 24 Jul 2002 08:44:41 -0500
- To: www-xsl-fo@w3.org
At 02:02 2002 07 24 +0200, Éric Bischoff wrote:
>Okay, I've found the number of the RFC that says which code to use. It happens
>to be the very same RFC 3066 that XSL-FO specification references!
And that my initial message referenced.
>_______________________________________________________
> 2. When a language has both an ISO 639-1 2-character code and an ISO
> 639-2 3-character code, you MUST use the tag derived from the ISO
> 639-1 2-character code.
>_______________________________________________________
>
>I was wrong when I said in my previous message that RFC 3066 gives no
>preference for one encoding or the other. Sorry about that.
>
>So the reasoning is unambiguous:
>- The specification of XSL-FO relies on RFC 3066
>- RFC 3066 gives the rules for choosing between 2-letter codes and 3-letter
>codes (if you have the choice, use the 2-letter code)
>- So documents conforming to XSL-FO should respect that rule
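
The quoted rule (prefer the ISO 639-1 2-letter code when a language has both) can be illustrated with a minimal sketch. The mapping table and the function name `preferred_language_code` are hypothetical and deliberately incomplete; a real implementation would load the full ISO 639 tables.

```python
# Hypothetical, deliberately incomplete table mapping ISO 639-2
# three-letter codes to their ISO 639-1 two-letter equivalents.
ISO_639_2_TO_1 = {
    "deu": "de", "ger": "de",  # German (terminology and bibliographic codes)
    "eng": "en",               # English
    "fra": "fr", "fre": "fr",  # French (terminology and bibliographic codes)
}


def preferred_language_code(code: str) -> str:
    """Return the 2-letter code when the table has one, else the input code."""
    lowered = code.lower()
    return ISO_639_2_TO_1.get(lowered, lowered)
```

With this table, `preferred_language_code("deu")` and `preferred_language_code("ger")` both yield `"de"`, while a code with no entry in the table is returned unchanged (lowercased).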
I'm not sure which of the following you are suggesting:
1. that the current interpretation of XSL (since it references 3066) requires
use of 2 letter codes when available, or
2. you believe that the correct thing for XSL to do is to require use of
2 letter codes when available.
I am pretty sure you are against:
3. XSL should only allow use of 3 letter codes (i.e., prohibit use of
2 letter codes as values for the language property)
but correct me if I am wrong.
>As I pointed out in my previous message, this rule is idiotic because it
>makes a "de facto" mixture of two code sets instead of keeping them separate.
>But as I also pointed out, very large projects like the KDE project
>have chosen to respect that rule.
>
>Personally, I would however allow some tolerance and accept codes like "deu"
>and "ger", even if "de" exists.
Okay, so I gather you favor option 2 above, correct?
> I would even allow very common constructs not
>allowed by RFC 3066:
>
> "fr_FR" instead of "fr-FR"
> "de-DE@euro" ('@' sign is normally illegal)
It is highly unlikely that the XSL spec would sanction use of
illegal codes.
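
As an illustration of why those two constructs are illegal, here is a minimal sketch of a syntax check based on the RFC 3066 grammar (a primary subtag of 1 to 8 letters, followed by any number of hyphen-separated subtags of 1 to 8 letters or digits). The function name is hypothetical, and the check covers syntax only; it does not verify that a subtag is a registered ISO 639 or ISO 3166 code.

```python
import re

# RFC 3066 syntax: Primary-subtag = 1*8ALPHA, Subtag = 1*8(ALPHA / DIGIT),
# with subtags separated by "-". Syntax only; no registry lookup is done.
RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def is_wellformed_rfc3066(tag: str) -> bool:
    """Return True if the whole tag matches the RFC 3066 grammar."""
    return RFC3066_TAG.fullmatch(tag) is not None
```

Under this check "fr-FR" is accepted, while "fr_FR" (underscore) and "de-DE@euro" ('@' sign) are rejected.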
>> > If a given implementation accepts 2 character values (e.g., "EN"),
>> > how are they interpreted (e.g., does "EN" mean US english,
>> > British english, or something else)?
>>
>> I believe that this particular point is covered by RFC 3066, which is
>> referenced from the XSL-FO specification:
>> "en" = English
>> "en-GB" = British English
>> "en-US" = American English
>>
>> Same for 3-letter codes:
>> "eng"
>> "eng-GB"
>> "eng-US"
>>
>> It's independent of the length of the code ;-). The first, mandatory part is
>> the language as defined in ISO-639-1 or -2; the second, optional part is
>> the country code as defined in ISO-3166.
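
The two-part structure described in the quoted text can be sketched as follows. The names `LanguageTag` and `split_language_tag` are hypothetical, and the sketch assumes a well-formed tag; it treats a 2-letter second subtag as an ISO 3166 country code and ignores anything else.

```python
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str            # ISO 639-1 or ISO 639-2 code, lowercased
    country: Optional[str]   # ISO 3166 code, uppercased, or None


def split_language_tag(tag: str) -> LanguageTag:
    """Split a tag such as "en-GB" or "eng-US" into language and country."""
    primary, _, rest = tag.partition("-")
    second = rest.split("-", 1)[0] if rest else ""
    # Only a 2-letter second subtag is read as a country code.
    country = second.upper() if len(second) == 2 and second.isalpha() else None
    return LanguageTag(primary.lower(), country)
```

The same function handles "en-GB" and "eng-GB" identically apart from the language field, which matches the point that the interpretation does not depend on the length of the language code.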
>
>Also, if you ask me "Does 'en' resolve to 'en-GB' or 'en-US'?" I would answer:
>it looks like an implementation choice, or could be parametrized. After all,
>when we speak about "English", do we refer to "British English" or to
>"American English"? There seems to be no easy answer to that question. One
>could even imagine hyphenation dictionaries permitting local variants like:
>
> honor
> honour
>
>--
>Éric Bischoff
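
The suggestion quoted above, that resolving "en" to a regional variant is an implementation choice or could be parametrized, might look like the sketch below. The default table and the function name are assumptions made for illustration; neither the XSL specification nor RFC 3066 defines such a default.

```python
# Hypothetical implementation default; nothing in XSL or RFC 3066 mandates it.
DEFAULT_REGION = {"en": "US"}


def resolve_region(tag: str, defaults: dict = DEFAULT_REGION) -> str:
    """Append a default country to a bare language code, if one is configured."""
    if "-" in tag:
        return tag  # a country (or other subtag) is already present
    region = defaults.get(tag.lower())
    return f"{tag}-{region}" if region else tag
```

A formatter that prefers British hyphenation could call `resolve_region("en", {"en": "GB"})` instead of relying on the built-in default.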
Thanks for your input.
paul
Received on Wednesday, 24 July 2002 09:52:20 UTC