- From: Christoph Päper <christoph.paeper@crissov.de>
- Date: Sat, 6 Mar 2010 14:56:36 +0100
- To: www-style list <www-style@w3.org>
Christoph Päper:
> Jonathan Kew:
>> On 4 Mar 2010, at 13:30, Christoph Päper wrote:
>>>> http://dev.w3.org/csswg/css3-fonts/
>>> 6.9 font-lang-sys: normal | inherit | <string>
>>>
>>> This property currently uses OT language tags which are not really well designed or known as far as I know. It would be better for authors to use [BCP47] and let UAs do the mapping.
>> There will be cases where users need to select an alternative "best fit" language system according to what is actually available in their chosen fonts, and it is not possible for browsers to handle this reliably by mapping from BCP47 codes.
>
> I will send another mail with a quick statistics check on the available language tags.

BCP47 / ISO639 vs. OT language tags

The conversion table is found at <http://www.microsoft.com/typography/otspec/languagetags.htm>. It lists 472 mappings of 392 OT to 440 ISO codes, if I didn’t miscount.

It has 9 entries without ISO 639-3 equivalent(s), 2 are for general phonetic transcriptions, the others are languages (I assume).

  APPH  Phonetic transcription – Americanist conventions
  IPPH  Phonetic transcription – IPA conventions
  BBR   Berber
  BCR   Bible Cree
  BML   Bamileke
  GAR   Garshuni
  MOR   Moroccan
  NGR   Nagari
  YCR   Y-Cree
  YIC   Yi Classic

There are ten OT codes with 2, two with 3, one with 4 and two with more (22 and 43) mappings to ISO. Surely there are gaps in the opposite direction, which would be more important.

Here is the check for ambiguous ISO639 -> OT mapping. Most if not all of them could be solved by specifying a preferred alternative and by using appropriate subtags (i.e. BCP 47 instead of ISO 639-3). ‘zho’ is used for 4, ‘chp’ and ‘kca’ for 3 and fifteen more for 2 OT codes.

  zho  ZHH  Chinese Hong Kong
       ZHP  Chinese Phonetic
       ZHS  Chinese Simplified
       ZHT  Chinese Traditional
  chp  ATH  Athapaskan
       CHP  Chipewyan
       SAY  Sayisi
  kca  KHK  Khanty-Kazim
       KHS  Khanty-Shurishkar
       KHV  Khanty-Vakhi
  caf  ATH  Athapaskan
       CRR  Carrier
  crm  LCR  L-Cree
       MCR  Moose Cree
  crx  ATH  Athapaskan
       CRR  Carrier
  csw  NCR  N-Cree
       NHC  Norway House Cree
  cwd  DCR  Woods Cree
       TCR  TH-Cree
  div  DIV  Dhivehi
       DHV  Dhivehi (deprecated)
  ell  ELL  Greek
       PGR  Polytonic Greek
  flm  HAL  Halam
       QIN  Chin
  gle  IRI  Irish
       IRT  Irish Traditional
  kat  KAT  Georgian
       KGE  Khutsuri Georgian
  krc  BAL  Balkar
       KAR  Karachay
  mal  MAL  Malayalam Traditional
       MLR  Malayalam Reformed
  scs  ATH  Athapaskan
       SLA  Slavey
  xal  KLM  Kalmyk
       TOD  Todo
  xsl  ATH  Athapaskan
       SSL  South Slavey

There’s also a number of tags that mismatch between the standards. I found 34, not counting the ones from the previous list:

  cmr, alt, bhi, hnd, lub, sot, afr, bal, bcr, bgr, chu, csy, dgr, dng, evn, grn, har, ing, kal, kar, kha, kmb, kuu, lad, man, men, mnk, mon, nyn, rom, sek, swa, tht.

ISO tags that are not in the list have not been considered, so the number might be quite a bit higher.
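
To illustrate how a "preferred alternative plus subtags" scheme could resolve the ambiguous cases, here is a rough Python sketch. Only the candidate lists for ‘zho’, ‘ell’ and ‘gle’ are taken from the table in this mail; the preferred defaults, the subtag rules (Hans, Hant, HK, polyton) and the function name ot_language_tag are my own assumptions for illustration and are not part of css3-fonts or the OpenType specification.

```python
# Candidate OT language system tags per ISO 639-3 code (from the table in this mail).
CANDIDATES = {
    "zho": ["ZHH", "ZHP", "ZHS", "ZHT"],
    "ell": ["ELL", "PGR"],
    "gle": ["IRI", "IRT"],
}

# Assumption: one preferred OT tag per ambiguous ISO code, used when
# no subtag narrows the choice. These picks are hypothetical.
PREFERRED = {"zho": "ZHS", "ell": "ELL", "gle": "IRI"}

# Assumption: hypothetical rules mapping a BCP 47 subtag to one candidate.
SUBTAG_RULES = {
    ("zho", "hans"): "ZHS",
    ("zho", "hant"): "ZHT",
    ("zho", "hk"): "ZHH",
    ("ell", "polyton"): "PGR",
}

# BCP 47 uses the two-letter code where one exists, so normalise to ISO 639-3.
ALPHA2_TO_ALPHA3 = {"zh": "zho", "el": "ell", "ga": "gle"}


def ot_language_tag(bcp47_tag):
    """Return an OT language system tag for a BCP 47 tag, or None if unknown."""
    parts = bcp47_tag.lower().split("-")
    primary = ALPHA2_TO_ALPHA3.get(parts[0], parts[0])
    if primary not in CANDIDATES:
        return None
    # The first subtag with a matching rule wins.
    for subtag in parts[1:]:
        rule = SUBTAG_RULES.get((primary, subtag))
        if rule is not None:
            return rule
    # Otherwise fall back to the preferred alternative.
    return PREFERRED[primary]


print(ot_language_tag("zh-Hant"))
print(ot_language_tag("el-polyton"))
print(ot_language_tag("el"))
```

A UA following this approach would still need some fallback for fonts that lack the selected language system, which is the "best fit" problem Jonathan raised; the sketch does not address that.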
Received on Saturday, 6 March 2010 13:57:09 UTC