RE: [Ltru] font features in CSS

Btw, Martin: it’s not clear to me what outcomes you have in mind for bringing the CSS thread into the IETF Languages list. Since this is now getting posted on two lists, it could easily become a nuisance for both. It would probably be helpful to identify specific questions that are of common interest, and keep this thread limited to those. So, can you clarify what issues you think need to be discussed across the two lists?

You made statements about HTML lang and xml:lang, and those statements seem to be correct. You suggested changes to the data in the OT tag registry; that registry is only tangentially relevant for this list, I think, though I have commented on your suggestions.


From: ltru-bounces@ietf.org [mailto:ltru-bounces@ietf.org] On Behalf Of Peter Constable
Sent: Saturday, October 31, 2009 7:54 AM
To: Andrew Cunningham; Martin J. Dürst
Cc: LTRU Working Group; www-font; Håkon Wium Lie; www-style; Jonathan Kew; Stephen Zilles; Adam Twardoch (List)
Subject: Re: [Ltru] font features in CSS

Indeed, the OT language system tags are about typographic conventions. Now, many languages have a single conventional writing system and a single set of conventions for typography for that writing system, though conventions may differ for other languages with writing systems based on the same script. A familiar example is Serbian, which has distinctive italic forms for certain letters, making its typographic conventions different from, say, Russian. There may also be cases within a single language and writing system of multiple typographic conventions. For instance, Malayalam is taught today using fewer conjunct forms than were used in the past.

In principle, it may be reasonable to say that two or more languages can be described as having a common set of typographic conventions. For instance, I don’t know of any particular reason why font rendering should differ according to whether the text is English or Spanish. In practice, though, it would be very difficult to manage a system that organized the data that way: it’s far easier to allow for a default OpenType language system tag for every language, keeping in mind that OpenType fonts have a default language system and that an explicit language system only needs to be incorporated into font data and invoked when rendering if there are distinctive typographic behaviours.
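To make the fallback concrete, here is a minimal sketch of how a renderer might choose a language system. It assumes a hypothetical font whose Cyrillic script carries an explicit Serbian language system and nothing else; the names FONT_LANG_SYSTEMS and select_lang_system, and the "dflt" label for the font's default language system, are placeholders for illustration, not the API of any real shaping library.

```python
# Hypothetical font data: for each script, the language system tags the
# font defines explicitly. Only languages with distinctive typographic
# behaviour need an entry; everything else uses the default.
FONT_LANG_SYSTEMS = {
    "cyrl": {"SRB"},  # Serbian: distinctive italic letterforms
    "latn": set(),    # English, Spanish, ...: the default is enough
}

# Label used in this sketch for the font's default language system.
DEFAULT_LANG_SYS = "dflt"


def select_lang_system(script, requested):
    """Return the requested tag if the font defines it for this script,
    otherwise fall back to the default language system."""
    explicit = FONT_LANG_SYSTEMS.get(script, set())
    return requested if requested in explicit else DEFAULT_LANG_SYS


if __name__ == "__main__":
    for script, tag in [("cyrl", "SRB"), ("cyrl", "RUS"), ("latn", "ESP")]:
        print(script, tag, "->", select_lang_system(script, tag))
```

In this sketch Russian and Spanish text both end up on the default language system, while Serbian text picks up the explicit one, which is the only case where the font needs extra data.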

Btw, I was the one that added the ISO 639 data to the table in the OT tag registry. You’ll see that there are exceptions to what I just said: some of the OT lang system tags are associated with multiple ISO 639 codes, hence multiple languages. The list of OT lang system tags was a given: someone populated the registry with many entries several years ago, adding entries before there were known needs and without first explicitly adopting and declaring a conventional set of language identities – ISO 639-3 didn’t exist then, so it wasn’t an option. Moving forward several years, many of the OT lang system tags had become a bit of a mystery, and so I felt it made sense to introduce mappings to ISO 639 in order to establish a sensible meaning for as many as possible. When faced with a lang system tag such as ATH (Athapaskan), having no documentation of what the submitter for that tag had in mind, the best alternative I had was to associate that with all Athapaskan languages.
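As a rough illustration of why the mapping is not one-to-one in either direction, the sketch below inverts a small table. The Chinese rows are the ones quoted further down in this thread; the ISO codes listed under ATH (nav for Navajo, chp for Chipewyan) are only an illustrative subset that I am assuming for the example, not the registry's full list. The names OT_TO_ISO and invert are hypothetical.

```python
from collections import defaultdict

# OT language system tag -> associated ISO 639 codes.
OT_TO_ISO = {
    "ZHH": ["zho"],          # Chinese Hong Kong
    "ZHP": ["zho"],          # Chinese Phonetic
    "ZHS": ["zho"],          # Chinese Simplified
    "ZHT": ["zho"],          # Chinese Traditional
    "ATH": ["nav", "chp"],   # Athapaskan: illustrative subset only
}


def invert(mapping):
    """Build the reverse lookup: ISO 639 code -> list of OT tags."""
    inverse = defaultdict(list)
    for ot_tag, iso_codes in mapping.items():
        for code in iso_codes:
            inverse[code].append(ot_tag)
    return dict(inverse)


ISO_TO_OT = invert(OT_TO_ISO)

if __name__ == "__main__":
    for code, tags in ISO_TO_OT.items():
        print(code, "->", tags)
```

One ISO code (zho) comes back with four OT tags, and one OT tag (ATH) covers several ISO codes, so neither side can serve as a unique key for the other.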

That’s all background on the conceptual issue Andrew raised. Let me comment on what is meant by “Chinese Phonetic”, and on Martin’s comments generally, starting with the latter.

Martin suggested that the info Adam Twardoch reported (on some other list than this) should be revised. Adam was merely reporting data that was in the OpenType registry of language-system tags. In principle, changes such as Martin suggested to use IETF Language Tags with region or script subtags might make sense, but in practice that would be attempting to do something that goes beyond the intent of the OpenType tag registry and that would not be highly feasible: to equate every language system tag with a _specific_ and equivalent IETF language tag. The simple explanation is the one Andrew gave: these things are not comparable – unless we want to introduce variant subtags for typographic conventions, and I’m not sure that makes sense. My add’l explanation is that it would not be a simple task, and I don’t think it would be worth the effort. Thus, such changes will *not* be made.

As for “Chinese Phonetic”, I mentioned above that most of the tags were registered years ago without documentation. Thus, it’s not clear what was meant when ZHP was first submitted. It probably was Pinyin, though that’s not certain. Now, I could go and revise the data in the OT tag registry to make the intent explicit, describing ZHP as being for “Chinese Pinyin”. But I’ve got to ask: are there really different typographic conventions for Pinyin than for any other Latin-based writing system for Chinese? My guess is probably not.



Peter

From: ltru-bounces@ietf.org [mailto:ltru-bounces@ietf.org] On Behalf Of Andrew Cunningham
Sent: Saturday, October 31, 2009 7:02 AM
To: Martin J. Dürst
Cc: Jonathan Kew; www-font; Håkon Wium Lie; www-style; Stephen Zilles; LTRU Working Group; Adam Twardoch (List)
Subject: Re: [Ltru] font features in CSS


2009/10/30 "Martin J. Dürst" <duerst@it.aoyama.ac.jp>

                        (OT)    (ISO)
Chinese Hong Kong       ZHH     zho
Chinese Phonetic        ZHP     zho
Chinese Simplified      ZHS     zho
Chinese Traditional     ZHT     zho



You are comparing apples and oranges, as the expression goes. The ISO language codes and BCP 47 are about languages.

The table from the OT spec is NOT a list of language identifiers. It identifies what the OT spec refers to as language systems. According to the spec:

"Language system tags identify the language systems supported in a OpenType Layout font. What is meant by a “language system” in this context is a set of typographic conventions for how text in a given script should be presented. Such conventions may be associated with particular languages, with particular genres of usage, with different publications, and other such factors."

The language system tag could map to one language, to multiple languages, and in unexpected ways. Grouping commonalities in orthographic representation and typesetting traditions is the core aspect as far as I can tell, rather than language identification.

At least that's my partial understanding.

Andrew

--
Andrew Cunningham
Vicnet Research and Development Coordinator
State Library of Victoria
Australia

andrewc@vicnet.net.au
lang.support@gmail.com

Received on Friday, 30 October 2009 23:02:57 UTC