
ACTION-161 "Talk to shaun about BCP47 compatibility"

From: Felix Sasaki <fsasaki@w3.org>
Date: Sun, 8 Jul 2012 18:38:02 +0200
Message-ID: <CAL58czqu=qiknv597b181DZUYFNA6jpRoozY4B0K7qqQ=fykGg@mail.gmail.com>
To: Shaun McCance <shaunm@gnome.org>
Cc: public-multilingualweb-lt@w3.org
Hi Shaun,

Taking
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0010.html
as a basis, we discussed autoLanguageProcessingRule at
http://www.w3.org/2012/07/05-mlw-lt-minutes.html#item13

One aspect that came up was whether this should be specific to
transliteration - Yves mentioned that you have implemented this not only
for transliteration, but also for machine translation.

That leads to the question of what the relation to the BCP 47 "t" extension
should be. As input, see the RFC for the "t" extension,
http://tools.ietf.org/html/rfc6497
which gives transliteration as an example:
und-Latn-t-und-cyrl
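
To make the structure of such a tag concrete, here is a minimal Python sketch that splits a tag around the "t" singleton into the base tag, the source tag, and any fields. The function name split_t_extension is hypothetical, and the parsing is simplified: it assumes a well-formed tag, lowercases everything, and does not validate subtags against the registry.

```python
import re

# In RFC 6497 a field key is one letter followed by one digit (e.g. "m0").
TKEY = re.compile(r"^[a-z][0-9]$")


def split_t_extension(tag):
    """Return (base, source, fields) for a BCP 47 tag with a "t" extension.

    Simplified sketch: the tag is lowercased, assumed well-formed, and the
    extension is taken to end at the next single-character subtag.
    """
    subtags = tag.lower().split("-")
    try:
        # The first subtag is always the language, so start the search at 1.
        t_index = subtags.index("t", 1)
    except ValueError:
        return "-".join(subtags), None, {}

    base = "-".join(subtags[:t_index])
    source, fields, key = [], {}, None
    for sub in subtags[t_index + 1:]:
        if len(sub) == 1:
            break  # another singleton starts a different extension
        if TKEY.match(sub):
            key = sub
            fields[key] = []
        elif key is None:
            source.append(sub)  # still inside the source language tag
        else:
            fields[key].append(sub)  # value subtag of the current field
    joined = {k: "-".join(v) for k, v in fields.items()}
    return base, "-".join(source) or None, joined


print(split_t_extension("und-Latn-t-und-cyrl"))
```

For the example above, the base is the Latin-script part and the source is the Cyrillic-script part that the content was transformed from.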

and the discussion at
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0155.html
(

>> 5) WRT to the tags that Mark mentioned in 1. below: are the "transform"
>> XML files here
>> http://unicode.org/cldr/trac/browser/tags/release-21-0-2/common/bcp47 the

)
This discussion showed that the fields for the "t" extension also include
values for machine translation; see
http://unicode.org/cldr/trac/browser/tags/release-21-0-2/common/bcp47/transform_mt.xml
[
<key extension="t" name="t0" description="Machine Translation:
Used to indicate content that has been machine translated, or a request for
a particular type of machine translation of content.
The first subfield in a sequence would typically be a 'platform' or vendor
designation." since="21.0.2">
    <type name="und" description="The choice of machine translation is not
    specified. Used when the only information known (or requested) is that
    the text was machine translated." since="21.0.2" />
]
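
As an illustration of how the "t0" field quoted above could appear in a language tag, here is a small Python sketch that composes a tag for machine-translated content. The function name mt_tag and its parameters are hypothetical; the only value taken from the CLDR file is "und", and the 3 to 8 alphanumeric subfield length follows RFC 6497.

```python
import re

# RFC 6497 field values are made of subfields of 3 to 8 alphanumerics.
SUBFIELD = re.compile(r"^[a-z0-9]{3,8}$")


def mt_tag(target, source=None, engine="und"):
    """Compose a tag saying `target` content was machine translated.

    `engine` is the "t0" value; "und" means the engine is not specified.
    `source`, if given, is the language the content was translated from.
    """
    subfields = engine.lower().split("-")
    if not all(SUBFIELD.match(s) for s in subfields):
        raise ValueError("t0 subfields must be 3-8 alphanumeric characters")
    parts = [target, "t"]
    if source:
        parts.append(source.lower())
    parts += ["t0"] + subfields
    return "-".join(parts)


print(mt_tag("ja", "de"))  # Japanese, machine translated from German
print(mt_tag("fr"))        # French, machine translated, source unknown
```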

For other "transform" fields, see
http://unicode.org/cldr/trac/browser/tags/release-21-0-2/common/bcp47/transform.xml

We now want to make sure that, if we provide a data category
"autoLanguageProcessingRule", it is consistent with the BCP 47 approach,
or at least that we have a good story for why it doesn't need to be
consistent. Do you have any thoughts about this?

Looking forward very much to your feedback,

Felix

-- 
Felix Sasaki
DFKI / W3C Fellow
Received on Sunday, 8 July 2012 16:38:29 UTC
