- From: Andrew Cunningham <lang.support@gmail.com>
- Date: Sat, 9 Jul 2016 17:56:31 +1000
- To: Behdad Esfahbod <behdad.esfahbod@gmail.com>
- Cc: www-style <www-style@w3.org>, John Hudson <tiro@tiro.com>
- Message-ID: <CAGJ7U-XoKBEgYwhXjyhpjKCHXguT3mCdCqb2kcJFZGXywAb16w@mail.gmail.com>
Hi Behdad,

I am not sure how to respond to your email. I assume you deliberately re-sent that comment? I have been sitting on this email all day, pondering the contents. My concern is the ability to use lesser-used and minority languages on the internet. Some browsers are better suited to this than others.

John, please correct any misconceptions of mine about OpenType fonts.

I am aware you believe that language tagging is sufficient and are resistant to implementing font-language-override. Fair enough. Assuming lang / xml:lang is the preferred approach, we need a cross-browser approach and normative requirements. Currently each browser does fairly different things with language tags and how they match up to OT language systems. This is complicated by the fact that some browsers only support OpenType while other browsers support additional font technologies.

One issue is the limited number of OT language system tags, and what seems to be an accidental, haphazard approach to adding them. Maybe it is better to describe OT language system tags in terms of evolution, growing and refining over time. OT language system tags were never developed in a systematic way, and it is probable that over time more will need to be added. So locl support via language tags will always be a moving target.

The second issue is the poor mapping between language tags and OT language system tags. Documents like https://www.microsoft.com/typography/otspec/languagetags.htm are far from perfect and require more work. For instance, the OT language system tag DNK is mapped to the language tag "din". This is an ISO 639-2 tag, and represents a macrolanguage. In theory all the ISO 639-3 language tags encompassed by it should be listed as well, but they are absent from this table. So DNK should strictly map to din, dip, diw, dib, dks and dik, which the copy of the OT spec on Microsoft's site does not reflect.
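The DNK point can be made concrete with a small Python sketch. It assumes a lookup keyed on the primary language subtag and uses "dflt" as a placeholder name for the default language system; the table names and functions are hypothetical and are not taken from any browser or from the OT spec. It only shows how a published entry for "din" could be expanded so that each member language listed above also reaches DNK.

```python
# Hypothetical sketch: expand a macrolanguage entry so that every member
# language reaches the same OT language system tag. All names are illustrative.

# Member languages of the macrolanguage "din", as listed in this email.
MACROLANGUAGE_MEMBERS = {
    "din": ["dip", "diw", "dib", "dks", "dik"],
}

# The mapping as published: OT language system tag -> language tags.
PUBLISHED = {
    "DNK": ["din"],
}


def expand(published, members):
    """Return a lookup from language tag to OT language system tag."""
    lookup = {}
    for ot_tag, lang_tags in published.items():
        for tag in lang_tags:
            # A directly published entry always wins.
            lookup[tag] = ot_tag
            for member in members.get(tag, []):
                # Do not overwrite a member that already has its own entry.
                lookup.setdefault(member, ot_tag)
    return lookup


def ot_language_system(lang, lookup):
    """Match on the primary subtag only; otherwise use the default system."""
    primary = lang.lower().split("-")[0]
    return lookup.get(primary, "dflt")


LOOKUP = expand(PUBLISHED, MACROLANGUAGE_MEMBERS)
print(ot_language_system("dik", LOOKUP))
print(ot_language_system("din-SS", LOOKUP))
print(ot_language_system("en", LOOKUP))
```

Matching on the primary subtag alone is a simplification; a real implementation would also have to decide what to do with script and region subtags.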
Many African languages have specific glyph and diacritic placement requirements that may differ from other languages. Concentrating on DNK for a moment, and limiting myself to a discussion of Sudanese and South Sudanese languages, of which Dinka is one: most other languages from these countries are not represented as OT language system tags. I only have a partial collection of orthography statements for the languages of these countries, maybe one fifth, maybe less. But going through what I have at hand, I identified the following language tags that have similar requirements to Dinka: ava, bfa, bxb, krs, bex, mfz, mor, mgd, mur, nus, lot, ddd, keo and mqu.

One option is to greatly expand the number of OT language system tags; the alternative is to map these other languages to DNK. John Hudson may correct me, but my understanding is that the OT language system tags were intended to represent shared typographic traditions rather than languages per se. So within the context of the OT specification it would be logical to map ava, bfa, bxb, krs, bex, mfz, mor, mgd, mur, nus, lot, ddd, keo and mqu to DNK, rather than adding language system tags that would essentially be there to activate exactly the same typographic features as DNK should.

A similar example to DNK is the language system tag VIT, which maps to "vi" but could, and maybe should, map to all Vietnamese ethnic languages that use the Latin script and share the same typographic traditions and conventions as Vietnamese.

To really use lang or xml:lang to enable locl OT features, rather than adding support for font-language-override, and to future-proof it as much as possible, time, money and resources need to be spent on providing as extensive a mapping as possible between BCP 47 language tags and OT language system tags. I would consider this work to be out of scope for the CSS WG, and I'd consider it to be out of scope for the OT spec itself.
That leaves us with the browser developers to do the work.

But there are other issues as well. Some OT language system tags should never be associated with BCP 47 language tags. One such is KRN, which represents the Karen languages (essentially it is a macrolanguage tag), encompassing a number of languages, each with its own language code and in some cases with incompatible typographic conventions. A few of these languages have their own OT language system tags, and these should be used in preference to KRN. John Hudson, in a previous email, listed a number of OT language system tags that do not correspond to any one language.

Another situation is where an OT language system tag is ambiguous or problematic because the language tag on an HTML element is not sufficient in and of itself to indicate whether the language system should be used. In this case I am thinking of the KHT language system tag. The language tag kht maps to KHT, but there are multiple orthographies for Khamti Shan. The fonts that I have seen that support a locl feature for Khamti are based on what is documented in UTN11, which is based on one of the orthographies in current use. Automatically using KHT for content tagged as kht will work in some cases, but in others it will give the wrong variants and rendering.

So even if you don't support font-language-override, you need some other approach to turning locl support on and off. A CSS rule like

    :lang(kht) { font-feature-settings: "locl" 0; }

should prevent a browser from using the localised features of a font. In this specific case I would expect the rule to prevent the browser from automatically applying the KHT language system and to use the default language system instead, which may be more appropriate for certain Khamti orthographies. It doesn't, however, solve the use case where an SHN language system is available in the font and may be preferred over the default language system.
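A minimal Python sketch of that expectation follows. It models only what is described above (kht selects KHT unless locl has been switched off, in which case the default language system is used) and is not a description of how any actual browser or shaping engine behaves. The function name, the one-entry table and the "dflt" placeholder are assumptions made for illustration.

```python
# Hypothetical sketch of the expected selection when locl can be switched off.
# LANG_TO_OT holds only the one mapping discussed in this email.
LANG_TO_OT = {"kht": "KHT"}


def choose_language_system(lang, locl_enabled, font_systems):
    """Pick an OT language system for content tagged `lang`.

    lang          -- language tag from lang / xml:lang
    locl_enabled  -- False when the author set font-feature-settings: "locl" 0
    font_systems  -- language system tags the font actually provides
    """
    if not locl_enabled:
        # Author opted out of localised forms: stay on the default system.
        return "dflt"
    primary = lang.lower().split("-")[0]
    wanted = LANG_TO_OT.get(primary)
    if wanted in font_systems:
        return wanted
    # No mapping, or the font lacks that language system.
    return "dflt"


print(choose_language_system("kht", True, {"KHT", "SHN"}))
print(choose_language_system("kht", False, {"KHT", "SHN"}))
```

As noted above, this simple form never selects SHN, even when the font provides it.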
In the absence of font-language-override, maybe the browser needs to implement more complex logic, i.e.

    if kht AND font-feature-settings: "locl" 0 AND SHN then use SHN
    if kht AND font-feature-settings: "locl" 0 then use dflt
    if kht then use KHT
    else use dflt

but that may not support all Khamti orthographies. I keep thinking that font-language-override is the easiest solution to the problem.

Automatically using lang / xml:lang to handle locl features is great. I am not against it; I like the feature. But implementing it properly requires a huge amount of work, and honestly I doubt web browser developers would be willing to do the amount of research needed to get it right and support as many languages as possible.

The reality is that only a small subset of languages will be supported automatically by the lang / xml:lang approach. For other languages, that leaves either font-language-override or not using the locl feature at all.

Andrew
Received on Saturday, 9 July 2016 07:57:01 UTC