- From: Paul Grosso <pgrosso@arbortext.com>
- Date: Wed, 24 Jul 2002 08:44:41 -0500
- To: www-xsl-fo@w3.org
At 02:02 2002 07 24 +0200, Éric Bischoff wrote:
>Okay, I've found the number of the RFC that says which code to use. It happens
>to be the very same RFC 3066 that XSL-FO specification references!

And that my initial message referenced.

>_______________________________________________________
> 2. When a language has both an ISO 639-1 2-character code and an ISO
> 639-2 3-character code, you MUST use the tag derived from the ISO
> 639-1 2-character code.
>_______________________________________________________
>
>I was wrong when I've said in my previous message that RFC 3066 gives no
>preference for one encoding or the other. Sorry for that.
>
>So the reasoning is unambiguous :
>- The specification of XSL-FO relies on RFC 3066
>- RFC 3066 gives the rules for choosing between 2 letters codes and 3 letters
>codes (if you have the choice, use 2 letters code)
>- So documents conforming to XSL-FO should respect that rule

I'm not sure which of the following you are suggesting:

1. that the current interpretation of XSL (since it references 3066)
   requires use of 2 letter codes when available, or

2. you believe that the correct thing for XSL to do is to require
   use of 2 letter codes when available.

I am pretty sure you are against:

3. XSL should only allow use of 3 letter codes (i.e., prohibit use
   of 2 letter codes as values for the language property)

but correct me if I am wrong.

>As I've been pointing in my previous message, this rule is idiotic because it
>makes a "de facto" mixture of two code sets instead of keeping them separate.
>But as I've pointed out too, huge projects (in size) like the KDE project
>have chosen to respect that rule.
>
>Personally, I would however allow some tolerance and accept codes like "deu"
>and "ger", even if "de" exists.

Okay, so I gather you favor option 2 above, correct?

> I would even allow very common constructs not
>allowed by RFC 3066 :
>
> "fr_FR" instead of "fr-FR"
> "de-DE@euro" ('@' sign is normally illegal)

It is highly unlikely that the XSL spec would sanction use of
illegal codes.

>> > If a given implementation accepts 2 character values (e.g., "EN"),
>> > how are they interpreted (e.g., does "EN" mean US English,
>> > British English, or something else)?
>>
>> I believe that this peculiar point is covered by the RFC 3066 which is
>> referenced from the XSL-FO specification :
>> "en" = English
>> "en-GB" = British English
>> "en-US" = American English
>>
>> Same for 3 letters codes:
>> "eng"
>> "eng-GB"
>> "eng-US"
>>
>> It's independent of the length of the code ;-). First mandatory part is
>> language as defined in ISO-639-1 or -2, second optional part is
>> country code as defined in ISO-3166.
>
>Also, if you ask me "Does 'en' resolve to 'en-GB' or 'en-US'?" I would answer:
>it looks like an implementation choice, or could be parametrized. After all,
>when we speak about "English", do we refer to "British English" or to
>"American English"? There seems to be no easy answer to that question. One
>could even imagine hyphenation dictionaries permitting local variants like:
>
> honor
> honour
>
>--
>Éric Bischoff

Thanks for your input.

paul
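For illustration only, the rule quoted from RFC 3066 in this thread (prefer the 2-letter code when a language has both) could be sketched roughly as below. This is not taken from the XSL spec or from any implementation. The function name `normalize_language_tag` and the table `ISO639_2_TO_1` are hypothetical, the table is a tiny hand-picked subset, and the tolerance for "_" and "@" is the leniency suggested in the quoted message, not something RFC 3066 allows.

```python
# Hypothetical, illustrative subset: ISO 639-2 three-letter codes mapped to
# their ISO 639-1 two-letter equivalents. A real table would be much larger.
ISO639_2_TO_1 = {
    "eng": "en",
    "deu": "de",
    "ger": "de",
    "fra": "fr",
    "fre": "fr",
}


def normalize_language_tag(tag: str) -> str:
    """Return the tag with a 2-letter language code substituted when one is known."""
    # Leniency discussed in the thread, NOT part of RFC 3066:
    # drop an "@modifier" suffix and accept "_" as a subtag separator.
    tag = tag.split("@", 1)[0].replace("_", "-")

    # Split into subtags, ignoring empty pieces such as a trailing "-".
    parts = [p for p in tag.split("-") if p]
    if not parts:
        return ""

    # Tags are case-insensitive; lowercase the language subtag by convention,
    # then prefer the 2-letter code if the table has one.
    language = parts[0].lower()
    language = ISO639_2_TO_1.get(language, language)

    rest = parts[1:]
    # A 2-letter second subtag is treated as a country code and uppercased.
    if rest and len(rest[0]) == 2:
        rest[0] = rest[0].upper()

    return "-".join([language] + rest)
```

With this sketch, "deu", "ger" and "de" all normalize to "de", and the non-conforming forms "fr_FR" and "de-DE@euro" are accepted and rewritten as "fr-FR" and "de-DE". Whether a bare "en" should then resolve to "en-GB" or "en-US" is left open, as in the discussion above.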
Received on Wednesday, 24 July 2002 09:52:20 UTC