Re: font features in CSS

Let me summarize a few important points about how OpenType Layout
features are meant to be used.

I. FEATURE CLASSIFICATION

OpenType Layout features can be divided into various categories, based
on various criteria (see [1] below for more information):

a) "Show in UI": determines whether a certain feature should be somehow
exposed in an application's UI. Another way to look at this
categorization is whether a certain feature should be user-controllable
or should be controlled "behind the scenes" by the OpenType Layout
engine. Obviously, the user-controllable features should have some kind
of exposure through CSS.

b) "Script": indicates a generalization of which features are used with
which scripts (writing systems). Some OpenType Layout features are only
used with some particular scripts, while others are applicable to all
scripts.

c) "Script-specific shaping": the OpenType Layout process is divided
into three phases: before the script-specific shaping, during the
script-specific shaping and after the script-specific shaping. Some
features should be applied to all glyphs before the script-specific
shaping algorithms kick in, others are automatically applied by the
shaping engine during the script-specific shaping, and finally the last
group is applied after the script-specific shaping, and these are
typically the features that are user-controllable.

d) "Applied by default" indicates whether a feature should be applied by
default (while the user may or may not have the opportunity to turn it
off), or should be off by default (in which case the user should have
the opportunity to turn it on).

e) "Functional category": features can be coarsely split into two large
groups: one related to language (features that ensure that a certain
orthographic tradition is followed or even that the text is
orthographically correct at all) and one related to typography (features
that allow the user to select a certain typographic/stylistic treatment).

II. LOOKUP CLASSIFICATION

OpenType Layout features are realized through series of lookups that can
perform two types of actions: substitutions (replacing some glyphs with
others, stored in the OpenType GSUB table) and positioning (adjusting
the width and the x/y position of some glyphs, stored in the GPOS table).

The most important aspect of this is that while most features can
simply be turned on and off (i.e. their status is binary), other features
may need an additional feature parameter. For example, if a feature such as
"salt" (stylistic alternates) is realized through GSUB LookupType 3
(alternate substitution, one-to-one-out-of-many), then it is necessary
to specify a numerical parameter that selects the alternate from
the set of alternates.
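
To make the role of the numerical parameter concrete, here is a minimal
Python sketch of a one-to-one-out-of-many substitution. The glyph names,
the SALT_ALTERNATES table and the apply_alternate function are all
hypothetical, and the sketch assumes that variant numbers start at 1; it
is only an illustration, not how any particular layout engine works.

```python
# Hypothetical alternate sets, roughly as a GSUB LookupType 3 lookup
# might provide them: each input glyph maps to an ordered list of
# alternates. The glyph names are made up for this example.
SALT_ALTERNATES = {
    "a": ["a.alt1", "a.alt2", "a.alt3"],
    "g": ["g.alt1"],
}


def apply_alternate(glyphs, alternates, variant):
    """Replace each glyph by its variant-th alternate (assumed 1-based)."""
    if variant < 1:
        raise ValueError("variant numbers are assumed to start at 1")
    result = []
    for glyph in glyphs:
        choices = alternates.get(glyph, [])
        # Keep the original glyph when its set has no such alternate.
        if variant <= len(choices):
            result.append(choices[variant - 1])
        else:
            result.append(glyph)
    return result
```

With a binary on/off switch alone, there would be no way to say which of
the three alternates of "a" is wanted; the variant number fills that gap.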

III. LANGUAGE CLASSIFICATION

All OpenType Layout features are assigned in the context of a specific
script and language system. While the assignment of script is easy (the
engine can determine from the Unicode string which script a certain
character belongs to, and from there it can pick the appropriate
OpenType script branch to apply the features for), the language system
is trickier.

As you can see from
http://www.microsoft.com/typography/otspec/languagetags.htm
OpenType uses a list of language systems that do not have a 1:1
correspondence with any of the ISO 639 standards. In OpenType 1.6 (at
the link above), an informational mapping between OpenType language
system tags and their "best matches" in the ISO 639 standards has been
provided. It
is quite obvious that a web browser that applies OpenType Layout
features should observe the HTML "lang" attribute and, if present, apply
the appropriate features from the particular language system branch in a
font (and only if absent, apply the features from the Default language
system within a script branch). But it might be worth considering to add
a low-level CSS access mechanism to allow users to choose a specific
OpenType language system, because some ISO 639 codes can map to several
OpenType language systems, e.g.

                     (OT)  (ISO)
Chinese Hong Kong    ZHH   zho
Chinese Phonetic     ZHP   zho
Chinese Simplified   ZHS   zho
Chinese Traditional  ZHT   zho
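
The ambiguity can be illustrated with a small Python sketch. The
ISO_TO_OT dictionary contains only the four rows from the table above,
and both function names are hypothetical; a real implementation would
need the complete informational mapping from the OpenType specification.

```python
# ISO 639 code -> candidate OpenType language system tags.
# Only the rows from the table above are included here.
ISO_TO_OT = {
    "zho": ["ZHH", "ZHP", "ZHS", "ZHT"],
}


def candidate_language_systems(iso_code):
    """Return every OpenType language system tag listed for an ISO code."""
    return list(ISO_TO_OT.get(iso_code, []))


def is_ambiguous(iso_code):
    """True when the ISO code alone cannot pick one language system."""
    return len(candidate_language_systems(iso_code)) > 1
```

For "zho" the lookup returns four candidates, so the "lang" attribute
alone does not tell the engine which language system branch to use.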

IV. HUMAN-READABLE vs. LOW-LEVEL OT FEATURE ACCESS

I realize that it is of great value to have a mechanism where most
OpenType Layout features are accessed through human-readable CSS
properties. For some, such as the OpenType "smcp" feature, existing CSS
properties such as "font-variant: small-caps" should be used. For
others, new CSS properties such as those proposed by Jonathan et al.
should be introduced.

However, in addition, I think it would be very useful to have a
low-level mechanism to specifically control the OpenType Layout features
directly. See [2] below for some thoughts that I had on that subject.

Best,
Adam

==

[1] Classification of OpenType Layout features, draft by Adam Twardoch:
http://www.twardoch.com/tmp/OpenTypeFeaturesClassification.xls

In the course of the discussion about the OpenType 1.5 and 1.6
specification revisions that took place last year, I circulated a
draft categorization of OpenType Layout features based on some criteria.
The document mentioned above is that draft classification of the
OpenType Layout features that were registered in OpenType 1.5. I also
included the Microsoft-specific MATH engine features that are not
officially part of the OT spec, but I have not yet included the features
added to OpenType 1.6.

The document is an Excel spreadsheet with the following information:

"Tag" and "Friendly name" of all features found in OT 1.5 plus MATH

"Show in UI" which determines whether a certain feature should be
somehow exposed in an application's UI. "no" means that no UI is
necessary, "yes" generally means that a UI element should be exposed
that is directly related to the feature, "special" indicates a special
treatment for the UI, e.g. associating the feature activity with some
general-level application or document preferences (e.g. optical bounds
or CJK orientation).

"UI level" indicates at what level the UI should be implemented: none,
character, word, paragraph, document. Some features are sensibly applied
to just one character or a few, while others can be applied to long runs
of text.

"Script" indicates a generalization of which features are used with
which scripts. This is not 100% accurate; I think it'd be a good idea to
produce an exhaustive mapping of all registered features to all
registered script tags. Currently, the OT spec has some unclear wording,
e.g. "Indic scripts similar to Devanagari". So the
column sometimes uses script tags and sometimes generic terms like
"ALL", "INDIC", "ARABIC", "RTL". I think that it would be useful to
categorize the OpenType script tag list into such groups, so that there
is an exhaustive mapping of which script tags are classified as
"European", "Indic", "Arabetic", "CJK" etc., plus which writing direction
they may have (three columns: LTR, RTL, vert). I'd like to add that to
the 2nd phase of the project.

"Script-specific shaping" is the column that has the actual
classification of when, in relation to script-specific shaping, a
feature is being applied: before the script-specific shaping (I was able
to come up with only four definitive entries for it: ccmp, locl, rtla
and size), during the script-specific shaping, or after the
script-specific shaping. Unfortunately, Adobe follows a different
paradigm of describing their features than Microsoft. I think Adobe's
CJK layout principles would be better off if described in the form of a
shaping specification like Microsoft's, rather than spread across
the feature description list. Therefore, I have classified all of
Adobe's CJK features as "to be applied after shaping", since "shaping"
is not defined in this context -- though I think it could be.

"Applied by default" indicates whether the feature should be always on
by default, never on by default, or whether shaping (or in CJK case,
orientation) determines if the feature is applied.

"Functional category" is just a loose way of classifying the OpenType
Layout features into some categories. There is a major distinction
between "language" and "typography" (there is such distinction in the
script-specific specs already), plus additional subcategorization into
"Asian CJK", "complex scripts", "basic support", "numerals and
scientific", "letter case" and "variants".

==

[2] Notes on a low-level tagging mechanism for OT Layout features in CSS

Below are some notes that I wrote to Michael Jansson in 2006 when
he implemented his own extensions to CSS that allowed OpenType Layout
feature selection in GlyphGate (this was done through the "text-otl"
CSS property). I realize now that the particular syntax I proposed below
may not conform to CSS best practices, so the details of
the syntax might be revised, but I think it'd be worthwhile to have
this kind of mechanism.

The low-level access mechanism for OT Layout features should allow the
document designer to:

1. Explicitly turn OFF certain features (e.g. "-kern")
2. Explicitly call variant numbers in one-to-one-out-of-many
substitutions (e.g. "salt/3")
3. Explicitly specify the writing system of the text by specifying a
script tag (e.g. "latn/liga" vs. "arab/liga").
4. Explicitly specify the language system of the text by specifying a
language tag (e.g. "latn/liga" vs. "latn/TRK/liga").

I believe that all of the above would be useful. The parser should give
a higher priority to the specific OTL script and language tags and only
in their absence, infer the language from the HTML "lang" attribute, and
the script from the Unicode properties of the current text.
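
The priority rule can be sketched in a few lines of Python. The function
and parameter names are hypothetical; the inferred values are assumed to
have been computed elsewhere (the language from the HTML "lang"
attribute, the script from the Unicode properties of the text).

```python
def resolve_script_and_language(explicit_script=None, explicit_language=None,
                                inferred_script=None, inferred_language=None):
    """Prefer explicit OTL tags; fall back to inferred values otherwise."""
    # An explicit script tag wins over the one derived from Unicode properties.
    script = explicit_script if explicit_script is not None else inferred_script
    # An explicit language tag wins over the one derived from "lang".
    language = (explicit_language if explicit_language is not None
                else inferred_language)
    return script, language
```

Script and language are resolved independently here, so a declaration
that gives only a script tag would still pick up the language inferred
from "lang". That is one possible reading of the rule, not the only one.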

I believe the simple syntax that I proposed above would actually be
enough, given the specifics:

* all parts of the tagging are separated by slashes
* there are up to four parts (script/language/feature/variant)
* if there is only one part specified, it is the feature tag; it may be
prefixed with a "-" sign that signifies turning off a feature that might
be turned on by default;
* if there is more than one part, check the last part; if it is an
integer number, then it is the variant number; disregard it in the
remaining analysis;
* if one part remains, it is the feature tag;
* if two parts remain, the first is the script tag (which can be
"DFLT", all in uppercase, or otherwise a lowercase-only string of
four letters), the second the feature tag;
* if three parts remain, the first is the script tag, the second is
the language tag, the third the feature tag.

Examples:
text-otl:liga
text-otl:-ccmp
text-otl:latn/salt/4
text-otl:latn/TRK/liga
text-otl:cyrl/SRB/locl
text-otl:DFLT/ornm/6
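
As a rough check that these rules are unambiguous, here is a minimal
Python sketch of a parser for the value part of "text-otl". The names
OtlSetting and parse_text_otl are hypothetical. The sketch reads a
two-part value as script/feature (as the "latn/salt/4" example implies),
and it assumes that the "-" prefix is accepted on the feature tag even
when a script or language tag precedes it. It does not validate the tags
themselves.

```python
from typing import NamedTuple, Optional


class OtlSetting(NamedTuple):
    script: Optional[str]    # e.g. "latn" or "DFLT"; None if not given
    language: Optional[str]  # e.g. "TRK"; None if not given
    feature: str             # e.g. "liga"
    variant: Optional[int]   # alternate number; None if not given
    enabled: bool            # False when the feature tag had a "-" prefix


def parse_text_otl(value):
    """Parse one value such as 'latn/TRK/liga' or 'salt/3'."""
    parts = value.strip().split("/")
    if not 1 <= len(parts) <= 4 or any(part == "" for part in parts):
        raise ValueError("expected one to four non-empty parts: %r" % value)

    # With more than one part, a trailing integer is the variant number.
    variant = None
    if len(parts) > 1 and parts[-1].isascii() and parts[-1].isdigit():
        variant = int(parts.pop())
    if len(parts) > 3:
        raise ValueError("too many tags: %r" % value)

    # The last remaining part is always the feature tag.
    feature = parts[-1]
    enabled = True
    if feature.startswith("-"):
        enabled = False
        feature = feature[1:]
    if feature == "":
        raise ValueError("missing feature tag: %r" % value)

    # Two parts: script/feature. Three parts: script/language/feature.
    script = parts[0] if len(parts) >= 2 else None
    language = parts[1] if len(parts) == 3 else None
    return OtlSetting(script, language, feature, variant, enabled)
```

For instance, parse_text_otl("DFLT/ornm/6") yields the script "DFLT", no
language, the feature "ornm" and the variant 6, while
parse_text_otl("-ccmp") yields the feature "ccmp" with enabled set to
False.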

==

Received on Friday, 30 October 2009 07:13:37 UTC