Re: [css3-fonts] opentype font feature support

Christoph Päper:
> Jonathan Kew:
>> On 4 Mar 2010, at 13:30, Christoph Päper wrote:
>>>> http://dev.w3.org/csswg/css3-fonts/
>>> 6.9 font-lang-sys: normal | inherit | <string>
>>> 
>>> This property currently uses OT language tags which are not really well designed or known as far as I know. It would be better for authors to use [BCP47] and let UAs do the mapping.
>> There will be cases where users need to select an alternative "best fit" language system according to what is actually available in their chosen fonts, and it is not possible for browsers to handle this reliably by mapping from BCP47 codes.
> 
> I will send another mail with a quick statistics check on the available language tags.

BCP47 / ISO639 vs. OT language tags

The conversion table is found at <http://www.microsoft.com/typography/otspec/languagetags.htm>.
It lists 472 mappings of 392 OT to 440 ISO codes, if I didn’t miscount. It has 10 entries without ISO 639-3 equivalent(s): 2 are for general phonetic transcriptions, the others are languages (I assume).

      APPH  Phonetic transcription – Americanist conventions
      IPPH  Phonetic transcription – IPA conventions
      BBR   Berber
      BCR   Bible Cree
      BML   Bamileke
      GAR   Garshuni
      MOR   Moroccan
      NGR   Nagari
      YCR   Y-Cree
      YIC   Yi Classic

There are ten OT codes with 2, two with 3, one with 4 and two with more (22 and 43) mappings to ISO. Surely there are gaps in the opposite direction, which would be more important. Here is the check for ambiguous ISO 639 -> OT mappings. Most if not all of them could be resolved by specifying a preferred alternative and by using appropriate subtags (i.e. BCP 47 instead of ISO 639-3). ‘zho’ is used for 4 OT codes, ‘chp’ and ‘kca’ for 3 each, and fifteen more for 2 each.

zho   ZHH   Chinese Hong Kong
      ZHP   Chinese Phonetic
      ZHS   Chinese Simplified
      ZHT   Chinese Traditional
chp   ATH   Athapaskan
      CHP   Chipewyan
      SAY   Sayisi
kca   KHK   Khanty-Kazim
      KHS   Khanty-Shurishkar
      KHV   Khanty-Vakhi
caf   ATH   Athapaskan
      CRR   Carrier
crm   LCR   L-Cree
      MCR   Moose Cree
crx   ATH   Athapaskan
      CRR   Carrier
csw   NCR   N-Cree
      NHC   Norway House Cree
cwd   DCR   Woods Cree
      TCR   TH-Cree
div   DIV   Dhivehi
      DHV   Dhivehi (deprecated)
ell   ELL   Greek
      PGR   Polytonic Greek
flm   HAL   Halam
      QIN   Chin
gle   IRI   Irish
      IRT   Irish Traditional
kat   KAT   Georgian
      KGE   Khutsuri Georgian
krc   BAL   Balkar
      KAR   Karachay
mal   MAL   Malayalam Traditional
      MLR   Malayalam Reformed
scs   ATH   Athapaskan
      SLA   Slavey
xal   KLM   Kalmyk
      TOD   Todo
xsl   ATH   Athapaskan
      SSL   South Slavey
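
A check like the one above could be scripted roughly as follows. This is only a sketch: the PAIRS sample and the function name are made up for illustration, and the full conversion table would have to be loaded in place of the sample. All rows are copied from the list above except ("DEU", "deu"), which is added as an unambiguous control.

```python
from collections import defaultdict

# Sample (OT tag, ISO 639-3 code) pairs. All rows except the last are
# copied from the list above; ("DEU", "deu") is an added control row.
PAIRS = [
    ("ZHH", "zho"), ("ZHP", "zho"), ("ZHS", "zho"), ("ZHT", "zho"),
    ("ATH", "chp"), ("CHP", "chp"), ("SAY", "chp"),
    ("ELL", "ell"), ("PGR", "ell"),
    ("IRI", "gle"), ("IRT", "gle"),
    ("DEU", "deu"),
]

def ambiguous_iso_codes(pairs):
    """Return {iso: sorted OT tags} for ISO codes mapped to more than one OT tag."""
    iso_to_ot = defaultdict(set)
    for ot, iso in pairs:
        iso_to_ot[iso].add(ot)
    return {iso: sorted(tags) for iso, tags in iso_to_ot.items() if len(tags) > 1}

# Print the most ambiguous ISO codes first, then alphabetically.
result = ambiguous_iso_codes(PAIRS)
for iso, tags in sorted(result.items(), key=lambda kv: (-len(kv[1]), kv[0])):
    print(iso, " ".join(tags))
```

Swapping the two elements of each pair gives the same check in the OT -> ISO direction.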

There are also a number of tags that do not match between the standards. I found 34, not counting the ones from the previous list: cmr, alt, bhi, hnd, lub, sot, afr, bal, bcr, bgr, chu, csy, dgr, dng, evn, grn, har, ing, kal, kar, kha, kmb, kuu, lad, man, men, mnk, mon, nyn, rom, sek, swa, tht. ISO codes that are not in the conversion table have not been considered, so the number might be quite a bit higher.
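
One possible way to automate such a mismatch check is sketched below. It assumes a narrow reading of "mismatch": an OT tag whose lowercase spelling is also an ISO code in the table, but which is not itself mapped to that ISO code. The function name and the sample are hypothetical. The BAL, KAR, ELL and PGR rows come from the list above; the AFK, AFR, BLI and KRN rows are recalled from the OT registry and should be verified against it.

```python
from collections import defaultdict

# Sample (OT tag, ISO 639-3 code) pairs.
PAIRS = [
    ("BAL", "krc"), ("KAR", "krc"),  # from the list above
    ("ELL", "ell"), ("PGR", "ell"),  # from the list above
    ("AFK", "afr"), ("AFR", "aar"),  # assumption: recalled from the registry
    ("BLI", "bal"), ("KRN", "kar"),  # assumption: recalled from the registry
]

def mismatched_tags(pairs):
    """Return lowercase tags that exist on both sides but are not mapped to each other."""
    ot_to_iso = defaultdict(set)
    for ot, iso in pairs:
        ot_to_iso[ot].add(iso)
    # Only ISO codes present in the table are considered.
    known_iso = {iso for _, iso in pairs}
    return sorted(
        ot.lower()
        for ot, isos in ot_to_iso.items()
        if ot.lower() in known_iso and ot.lower() not in isos
    )

print(mismatched_tags(PAIRS))  # ['afr', 'bal', 'kar']
```

A tag such as ELL is not reported because it maps to the identically spelled ISO code, and PGR is not reported because ‘pgr’ does not occur on the ISO side of the sample.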

Received on Saturday, 6 March 2010 13:57:09 UTC