- From: Peter Constable <petercon@microsoft.com>
- Date: Fri, 30 Oct 2009 22:53:34 +0000
- To: Andrew Cunningham <lang.support@gmail.com>, Martin J. Dürst <duerst@it.aoyama.ac.jp>
- CC: Jonathan Kew <jonathan@jfkew.plus.com>, www-font <www-font@w3.org>, Håkon Wium Lie <howcome@opera.com>, www-style <www-style@w3.org>, Stephen Zilles <szilles@adobe.com>, LTRU Working Group <ltru@ietf.org>, "Adam Twardoch (List)" <list.adam@twardoch.com>
- Message-ID: <BF2262AF099A70419F68A17FF6338DF40440B8B6@TK5EX14MBXC141.redmond.corp.microsoft.>
Indeed, the OT language system tags are about typographic conventions.

Now, many languages have a single conventional writing system and a single set of typographic conventions for that writing system, though conventions may differ for other languages with writing systems based on the same script. A familiar example is Serbian, which has distinctive italic forms for certain letters, making its typographic conventions different from, say, Russian.

There may also be cases of multiple typographic conventions within a single language and writing system. For instance, Malayalam is taught today using fewer conjunct forms than were used in the past.

In principle, it may be reasonable to say that two or more languages can be described as having a common set of typographic conventions. For instance, I don’t know of any particular reason why font rendering should differ according to whether the text is English or Spanish. In practice, though, it would be very difficult to manage a system that organized the data that way: it’s far easier to allow for a default OpenType language system tag for every language, keeping in mind that OpenType fonts have a default language system and that an explicit language system only needs to be incorporated into font data and invoked when rendering if there are distinctive typographic behaviours.

Btw, I was the one that added the ISO 639 data to the table in the OT tag registry. You’ll see that there are exceptions to what I just said: some of the OT lang system tags are associated with multiple ISO 639 codes, hence multiple languages.

The list of OT lang system tags was a given: someone populated the registry with many entries several years ago, adding entries before there were known needs and without first explicitly adopting and declaring a conventional set of language identities – ISO 639-3 didn’t exist then, so it wasn’t an option.
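
The relationship described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `DEFAULT_LANGSYS` sentinel, and the data layout are all hypothetical and do not come from the OpenType specification or from any shaping library. The only registry data used is the four Chinese entries quoted later in this thread.

```python
# Hypothetical sketch; none of these names belong to a real library or to the
# OpenType specification.

# Excerpt of the registry table: OT language system tag -> ISO 639 codes.
# Only the four Chinese entries quoted in this thread are included.
OT_TO_ISO639 = {
    "ZHH": ["zho"],  # Chinese Hong Kong
    "ZHP": ["zho"],  # Chinese Phonetic
    "ZHS": ["zho"],  # Chinese Simplified
    "ZHT": ["zho"],  # Chinese Traditional
}

# Placeholder standing in for the font's default language system
# (an assumption for this sketch, not a registered tag).
DEFAULT_LANGSYS = "<default>"


def ot_tags_for(iso_code):
    """Return every OT tag in the excerpt associated with an ISO 639 code."""
    return sorted(tag for tag, codes in OT_TO_ISO639.items() if iso_code in codes)


def select_langsys(requested_tag, font_langsys_tags):
    """Use the requested language system only if the font defines it.

    A font needs an explicit language system only when it has distinctive
    typographic behaviour for it; otherwise the default applies.
    """
    if requested_tag in font_langsys_tags:
        return requested_tag
    return DEFAULT_LANGSYS
```

In this sketch, `ot_tags_for("zho")` returns four tags, which shows why an ISO 639 code alone cannot pick a language system, and `select_langsys("ZHP", {"ZHS"})` falls back to the default because the font carries no explicit data for ZHP.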
Moving forward several years, many of the OT lang system tags had become a bit of a mystery, and so I felt it made sense to introduce mappings to ISO 639 in order to establish a sensible meaning for as many as possible. When faced with a lang system tag such as ATH (Athapaskan), having no documentation of what the submitter for that tag had in mind, the best alternative I had was to associate that with all Athapaskan languages.

That’s all background on the conceptual issue Andrew raised. Let me comment on what is meant by “Chinese Phonetic”, and on Martin’s comments generally, starting with the latter.

Martin suggested that the info Adam Twardoch reported (on some list other than this one) should be revised. Adam was merely reporting data that was in the OpenType registry of language-system tags. In principle, changes such as Martin suggested, to use IETF Language Tags with region or script subtags, might make sense, but in practice that would be attempting to do something that goes beyond the intent of the OpenType tag registry and that would not be very feasible: to equate every language system tag with a _specific_ and equivalent IETF language tag. The simple explanation is the one Andrew gave: these things are not comparable – unless we want to introduce variant subtags for typographic conventions, and I’m not sure that makes sense. My add’l explanation is that it would not be a simple task, and I don’t think it would be worth the effort. Thus, such changes will *not* be made.

As for “Chinese Phonetic”, I mentioned above that most of the tags were registered years ago without documentation. Thus, it’s not clear what was meant when ZHP was first submitted. It probably was Pinyin, though that’s not certain. Now, I could go and revise the data in the OT tag registry to make the intent explicit, describing ZHP as being for “Chinese Pinyin”.
But I’ve got to ask: are there really different typographic conventions for Pinyin than for any other Latin-based writing system for Chinese? My guess is probably not.

Peter

From: ltru-bounces@ietf.org [mailto:ltru-bounces@ietf.org] On Behalf Of Andrew Cunningham
Sent: Saturday, October 31, 2009 7:02 AM
To: Martin J. Dürst
Cc: Jonathan Kew; www-font; Håkon Wium Lie; www-style; Stephen Zilles; LTRU Working Group; Adam Twardoch (List)
Subject: Re: [Ltru] font features in CSS

2009/10/30 "Martin J. Dürst" <duerst@it.aoyama.ac.jp>

                       (OT)   (ISO)
Chinese Hong Kong      ZHH    zho
Chinese Phonetic       ZHP    zho
Chinese Simplified     ZHS    zho
Chinese Traditional    ZHT    zho

You are comparing apples and oranges, as the expression goes.

The ISO language codes and BCP47 are about languages. The table from the OT spec is NOT a language identifier. It identifies what the OT spec refers to as a language system.

According to the spec:

"Language system tags identify the language systems supported in a OpenType Layout font. What is meant by a “language system” in this context is a set of typographic conventions for how text in a given script should be presented. Such conventions may be associated with particular languages, with particular genres of usage, with different publications, and other such factors."

The language system tag could map to one language, to multiple languages, and in unexpected ways. Grouping commonalities in orthographic representation and typesetting traditions is the core aspect as far as I can tell, rather than language identification.

At least that's my partial understanding.

Andrew

--
Andrew Cunningham
Vicnet Research and Development Coordinator
State Library of Victoria
Australia

andrewc@vicnet.net.au
lang.support@gmail.com
Received on Friday, 30 October 2009 22:54:22 UTC