[css3 writing modes] text orientation discussion from John Daggett on 2011-07-24 (www-style@w3.org from July 2011)

From: John Daggett <jdaggett@mozilla.com>
Date: Sun, 24 Jul 2011 00:15:19 -0700 (PDT)
To: "www-style@w3.org Style" <www-style@w3.org>
Message-ID: <1786036602.623001.1311491719781.JavaMail.root@zimbra1.shared.sjc1.mozilla.com>

Eric Muller from Adobe and I met Thursday to discuss text orientation in
the context of vertical text layout.  Below is the summary from that
discussion.

I started by talking about the current spec wording for text orientation
which says that if a font contains 'vrt2' feature info, it's assumed
to be correct, i.e. the distinction between which characters are rotated
and which are shown upright is effectively determined by the font [1]. 
If a font doesn't support this feature then the information is
"synthesized" using a non-normative appendix.  I suggested that we
really need a normative definition of this that isn't tied to
information in fonts and Eric agreed.

One obvious question is whether we're assuming that authors will
continue to use the JIS model of separate codepoints for standard and
full-width versions of the Latin codepoints, where the full-width
codepoints are upright and the standard ones rotate by default.  I noted
that the JLREQ seems to address this obliquely by not referring at all
to the issue, as if fullwidth codepoints were a thing of the past.  But
Eric seemed to think that while not entirely desirable, it was fair to
assume that the use of both sets of codepoints would continue: authors
are already accustomed to and understand the differences, input methods
support it, etc.  Neither one of us was sure whether the same model
exists for China/Taiwan/Hong Kong (i.e. whether separate codepoints with
different orientation properties were used).
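
As a rough illustration of the JIS-style model described above, here is
a minimal Python sketch.  It assumes, purely for illustration, that the
Unicode East Asian Width value "F" (fullwidth) can stand in for "upright
by default" for Latin letters; the function name is hypothetical and
this is not a rule from the spec.

```python
import unicodedata


def default_orientation_jis_model(ch: str) -> str:
    """Hypothetical rule: fullwidth forms stay upright, everything else rotates.

    Assumption: East Asian Width == "F" approximates the fullwidth
    codepoints of the JIS model. Only meant for Latin letters here.
    """
    return "upright" if unicodedata.east_asian_width(ch) == "F" else "rotated"


# Compare the standard and fullwidth versions of the same Latin letter.
for ch in ("A", "\uFF21"):
    print(f"U+{ord(ch):04X}", unicodedata.east_asian_width(ch),
          default_orientation_jis_model(ch))

# The two codepoints are compatibility-equivalent: NFKC folds the
# fullwidth letter back to the standard one.
print(unicodedata.normalize("NFKC", "\uFF21") == "A")
```

The point of the sketch is only that the author's choice of codepoint,
not the font, carries the orientation in this model.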

Eric noted the work on formalizing layout requirements in various
countries, China and India for example.  But Japanese is the only case
where we really understand the requirements in some detail so we should
just assume that our initial definition of vertical layout controls is
intended to handle Japanese and try to structure the definition to
allow room for change if clearer requirements come out of the efforts in
China/Taiwan/Hong Kong.

We talked briefly about the nature of the vrt2 and vert OpenType
features.  The idea behind the 'vrt2' feature was that both ideographic
and non-ideographic text could be laid out using a single codepath,
without the need for different codepaths for vertical runs and runs laid
out horizontally and then rotated.  I pointed out that this only seems
suitable for systems where all fonts have 'vrt2' defined and it's always
defined "correctly", i.e. it rotates or leaves upright the "right"
characters.  Eric confirmed this and pointed out that you need to handle
fonts that *don't* have a 'vrt2' feature defined (e.g. Minion, since
it's a Latin font and not a CJK font) so the feature actually isn't
useful for layout in practice.  Another problem with 'vrt2' is that it's
mixing two substitutions, (1) rotated glyphs and (2) vertical
alternates.  The 'vert' feature only supports (2).  An example that
illustrates the difference is the set of kumimoji characters (U+3300:3358).
As has been pointed out, InDesign only uses the 'vert' feature.

I pointed out some of the inconsistencies I was seeing in vrt2
implementations [2].  For example, while basic Latin characters are
rotated, Greek and Cyrillic are not (i.e. with 'vrt2' applied, the font
doesn't supply rotated glyphs for Greek or Cyrillic).  Meiryo seems to
have lots of inconsistencies: it lacks rotated glyphs for some of the
supplementary Latin characters and currencies (e.g. the Euro character
U+20AC rotates, but the won character U+20A9 doesn't).  The handling of
arrows seems funny: some sets of arrows rotate, some don't.  Eric noted
that it's often hard to determine whether an arrow means "to the right"
or serves as a pointer to what follows.

It seems to me that for CSS, rather than relying on font information
(i.e. the vrt2 feature substitutions), it would make more sense to
define explicitly which codepoints are by default considered upright and
treat all others as rotated.  For upright codepoints, the 'vert' feature
must be applied for fonts supporting it.  Eric at first seemed to think
that it might be better to require it for *all* codepoints when
displaying vertical text (so that variants needed for the rotated case
can also be provided) but then he thought maybe what is needed is
actually two properties, one to inform the font that vertical layout is
being performed and another to distinguish between the rotated and
non-rotated case.  But for now, applying 'vert' to only upright spans is
fine; we can consider additional OpenType features at a later point.

After this discussion, we both agreed that it would be good to have a
simple property defined for all Unicode codepoints that allowed us to
determine whether a character was upright by default or not.  I was
originally thinking of this as "these characters are upright, all others
rotate" but Eric suggested that we define a property with a few more
categories and then describe rules that use this property to determine
orientation.  Categories might include 'ideographic' and 'non-ideographic'
and categories to group sets of symbols (e.g. 'currency', 'unit', etc.).

This would provide more flexibility in defining differences in which
rules to apply without modifying the underlying property.  This may be
important in the future if it's determined that vertical text in Chinese
should be handled differently than the default. It also gives us
property values to use in defining contextual rules for things like
symbols (e.g. "symbols follow the rule for the script of the surrounding
text").
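
To make the separation between the property and the rules concrete,
here is a small Python sketch.  The category assignments, the rule
table, and the "contextual" rule value are all assumptions made up for
illustration; in particular, "surrounding text" is simplified to "the
previously resolved character", with rotated as the fallback at the
start of the text.

```python
# Hypothetical per-codepoint property: ranges mapped to categories.
CATEGORY_RANGES = [
    ((0x0041, 0x005A), "non-ideographic"),  # Basic Latin uppercase
    ((0x0061, 0x007A), "non-ideographic"),  # Basic Latin lowercase
    ((0x20A0, 0x20CF), "currency"),         # Currency Symbols block
    ((0x4E00, 0x9FFF), "ideographic"),      # CJK Unified Ideographs
]

# Hypothetical default rules layered on top of the property.
DEFAULT_RULES = {
    "ideographic": "upright",
    "non-ideographic": "rotated",
    "currency": "contextual",  # follow the surrounding text
}


def category(ch: str) -> str:
    """Look up the (hypothetical) orientation category of a character."""
    cp = ord(ch)
    for (lo, hi), name in CATEGORY_RANGES:
        if lo <= cp <= hi:
            return name
    return "non-ideographic"  # assumption: unlisted codepoints rotate


def resolve(text: str, rules=DEFAULT_RULES):
    """Resolve each character to 'upright' or 'rotated' using the rules."""
    resolved = []
    previous = "rotated"  # assumed fallback when there is no context yet
    for ch in text:
        rule = rules[category(ch)]
        if rule == "contextual":
            rule = previous
        resolved.append(rule)
        previous = rule
    return resolved


print(resolve("価€"))
print(resolve("a€"))
# A different rule set reuses the same property unchanged.
print(resolve("価€", {**DEFAULT_RULES, "currency": "rotated"}))
```

Swapping the rule table while leaving CATEGORY_RANGES alone is the kind
of flexibility described above, e.g. if Chinese vertical text turns out
to need different defaults.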

Eric noted that in some ways defining the properties to align with the
classes listed in Appendix A of the JLREQ document [3] would help in
doing aki processing correctly, but this isn't really a requirement at
this point, only something to consider in the future.

Eric said that there's a Unicode meeting in August and that he can
discuss a new 'orientation' property, what it's needed for and how it
would be used.  We agreed that this should not be a derived property,
i.e. that it's not something that can be directly derived from other
Unicode properties.  This is because there are going to be codepoints
for which some judgement is involved.  The key is to have a reasonable
default, not necessarily the "ideal" default.

[1] CSS3 Writing Modes text-orientation property
http://dev.w3.org/csswg/css3-writing-modes/#text-orientation

[2] vrt2/vert testpage
http://people.mozilla.org/~jdaggett/tests/textorientation.html

[3] JLREQ character classes
http://www.w3.org/TR/jlreq/#character-classes-en
Received on Sunday, 24 July 2011 07:15:47 UTC