RE: [css3-writing-modes] use case for font-dependent default orientation from Koji Ishii on 2011-09-11 (www-style@w3.org from September 2011)

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Sun, 11 Sep 2011 10:17:18 -0400
To: John Daggett <jdaggett@mozilla.com>, W3C Style <www-style@w3.org>
CC: fantasai <fantasai@inkedblade.net>
Message-ID: <A592E245B36A8949BDB0A302B375FB4E0CF6823AC2@MAILR001.mail.lan>
Thank you, John, for making the list. It is a great way to see the differences between Eric's proposal and the current spec.

The use case for font-dependent orientation is unified code points, in particular unified punctuation. I'm not insisting that the current font-dependent orientation is the only solution, and I'd be happy to hear any other proposal, but I believe the issue must be solved.

An example is U+2030 PER MILLE SIGN. U+0025 PERCENT SIGN is lucky to have a full-width counterpart, so U+0025 is sideways while U+FF05 (FULLWIDTH PERCENT SIGN) is upright.

U+2030 PER MILLE SIGN does not have a full-width counterpart, and therefore it is the code point that East Asian legacy encodings for Japanese, Korean, and Simplified Chinese map to. If you type "per mille" (in Japanese kana) on Japanese Windows, you get U+2030. Word sets it upright. InDesign sets it upright too. So setting it sideways breaks all existing documents and makes it hard for HTML/CSS to display them.
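
As a rough illustration, the missing full-width counterpart can be checked against the Unicode character database. This is only a sketch using Python's standard unicodedata module; the helper name fullwidth_counterpart is made up for this example and is not part of any spec or proposal.

```python
import unicodedata


def fullwidth_counterpart(ch):
    """Return the character in U+FF00..U+FFEF whose compatibility
    decomposition is "<wide> ch", or None if there is no such character.

    Hypothetical helper, for illustration only.
    """
    wanted = "<wide> %04X" % ord(ch)
    for cp in range(0xFF00, 0xFFF0):
        candidate = chr(cp)
        # unicodedata.decomposition() returns e.g. "<wide> 0025" for U+FF05
        # and an empty string for unassigned code points.
        if unicodedata.decomposition(candidate) == wanted:
            return candidate
    return None


for ch in ("\u0025", "\u2030"):
    wide = fullwidth_counterpart(ch)
    label = "none" if wide is None else "U+%04X" % ord(wide)
    print("U+%04X %s: full-width counterpart = %s"
          % (ord(ch), unicodedata.name(ch), label))
```

With this check, U+0025 maps to U+FF05, while U+2030 has no entry in the Halfwidth and Fullwidth Forms block.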

But U+2030 is "General" punctuation, not East Asian punctuation, so setting it upright doesn't work in non-East Asian contexts. "Use-font" is the idea fantasai and I came up with to solve this issue, but I'd be very happy if anyone has another idea.

Another example is the PUA (Private Use Area). The PUA is widely used in East Asia, so we want those code points upright. But the PUA is not only for East Asian text; it is widely used in other scripts as well. I couldn't find any alternative solution to these issues that doesn't use font information.

Eric's approach partially solves the unified punctuation issue. He labeled some of the code points that are unified, and that the current CSS spec marks S or UF, as just "U". This works for the most common East Asian documents. If we can make all such unified code points U, I can agree with that, but is it possible? Such code points include U+00B1 PLUS-MINUS SIGN and U+00B7 MIDDLE DOT (well, U+00B1 is S in the current spec, and I have to admit that I need to work on further details).

That makes me wonder whether, rather than discussing each code point, we should first agree on what the goal is.

Originally, fantasai and I were trying to create a list that works well for both multilingual documents and East Asian documents. That turned out to be extremely difficult, and we made a lot of compromises based on our own judgment. I understand that such compromises can vary from person to person, which is why this discussion has come up.

Given that we now have text-orientation: sideways-* and upright for non-East Asian text, I'm leaning toward defining upright-right to be more East Asian centric, and disregarding multilingual capability when it conflicts with compatibility. If compatibility becomes less of a problem and multilingual support becomes more important in the future, we could define another value. So, if we could agree that "upright-right" prioritizes compatibility with existing documents over multilingual capability, I believe it can solve most of the unified punctuation issues.

Even after that, I'm still not sure we can completely avoid "font-dependent orientation", though. MS Word has done this since Word 6.0, and InDesign since 1.0. You can see this by typing U+2030 and applying a Japanese font to see it upright, or a Roman font to see it sideways. If our goal for CSS Writing Modes Level 3 is something similar to what MS Word 2.0 does, maybe we can (the East Asian version of Word 2.0 always sets U+2030 upright). Vertical text flow in CSS is still at version 1.0 (or pre-1.0 :), so I can live with that if that's what everyone wants.

Regarding whether "use-font" is per code point or per font, there was a misunderstanding between fantasai and me: I thought it was per code point, while fantasai thought it was font based. We haven't had a good idea of how technically feasible "per code point" is. Recently Murakami-san told fantasai and me that "per code point" is not that difficult to do, but that FreeType doesn't have an API to read the GSUB map (it can apply the map, though). We may be able to find contributors to implement a "read map" feature for FreeType. Given that change in the situation, and if I understand correctly, both fantasai and I are leaning towards a "per code point" approach. That is, to change the current "use-font" to "use the alternate glyph if it exists, otherwise sideways". I believe this category is another, and better, solution to the unified punctuation issue, but as I said above, I'd be extremely happy to hear any other proposals.
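
To make the "per code point" idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: FontInfo, its vertical_alternates field, and the two sample fonts are hypothetical stand-ins for whatever a layout engine could read from a font's GSUB data, and the rule is meant only for code points already in the "use-font" category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontInfo:
    """Hypothetical view of a font, as a layout engine might see it."""
    name: str
    # Code points for which the font provides a vertical alternate glyph.
    # A real engine would need to read this from the font's GSUB data.
    vertical_alternates: frozenset = frozenset()


def orient_use_font_code_point(cp, font):
    """Per-code-point rule for a "use-font" code point:
    use the alternate glyph if it exists, otherwise set it sideways."""
    if cp in font.vertical_alternates:
        return "upright"
    return "sideways"


# Made-up sample data, not measurements of real fonts.
japanese_font = FontInfo("SampleMincho", frozenset({0x2030}))
roman_font = FontInfo("SampleRoman")

print(orient_use_font_code_point(0x2030, japanese_font))
print(orient_use_font_code_point(0x2030, roman_font))
```

With this sample data, U+2030 comes out upright with the Japanese font and sideways with the Roman font, which matches what Word and InDesign are described as doing above.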


Regards,
Koji

-----Original Message-----
From: John Daggett [mailto:jdaggett@mozilla.com] 
Sent: Friday, September 09, 2011 5:35 PM
To: W3C Style
Cc: Koji Ishii; fantasai
Subject: [css3-writing-modes] use case for font-dependent default orientation

During the call this week, we discussed the definition of default
orientation in the CSS3 Writing Modes spec.  The current draft has a
definition in Appendix C [1] but Eric Muller of Adobe has proposed making
an explicit Unicode property that defines categories for use in
determining the default orientation [2].

To view the differences in key Unicode blocks, I've put together a data file
that illustrates the characters affected by these differences:

http://lists.w3.org/Archives/Public/www-archive/2011Sep/att-0010/defaultorientation.pdf


The second and third columns are the classification in the proposed
Unicode property and using the Appendix C algorithm, along with
vertical alternates if they exist for three standard Japanese fonts. 
S indicates sideways orientation, U indicates upright and UF indicates
it's dependent on the font.

This last category is one of the key differences between Eric's
proposed property values and the definition in Appendix C.  The
current definition in the spec has rules that define orientation such
that for some characters the orientation is defined as dependent on
whether a font has vertical font metrics or not; if a font has
vertical metrics then these characters will be displayed upright and
vertical alternates will be used if available, otherwise the
characters are drawn sideways.

This rule adds a lot of complexity to vertical layout: it means that
an implementation needs to first do font matching, then break up text
runs into sideways and vertical runs. What is the set of use cases
for this?  And does this rule really solve those use cases, or just
fix some cases while breaking others?
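
The two-step process described above can be sketched roughly as follows, in Python. The category tables here are tiny hypothetical placeholders and NOT the Appendix C data, and the input format (pairs of "font has vertical metrics" and text, as produced by font matching) is an assumption made for this example.

```python
from itertools import groupby

# Hypothetical placeholder tables, NOT the Appendix C data.
USE_FONT = {0x2030}                      # orientation depends on the font
ALWAYS_UPRIGHT = range(0x3000, 0xA000)   # rough stand-in for CJK code points


def orientation(ch, font_has_vertical_metrics):
    """Orientation of one character once its font is known."""
    cp = ord(ch)
    if cp in USE_FONT:
        # Font-dependent category: upright only if the matched font
        # has vertical metrics, otherwise sideways.
        return "upright" if font_has_vertical_metrics else "sideways"
    return "upright" if cp in ALWAYS_UPRIGHT else "sideways"


def split_runs(font_matched_segments):
    """Break font-matched segments into (orientation, text) runs.

    font_matched_segments is a list of (font_has_vertical_metrics, text)
    pairs, i.e. font matching has to happen before this step.
    """
    runs = []
    for has_vmetrics, text in font_matched_segments:
        grouped = groupby(text, key=lambda ch: orientation(ch, has_vmetrics))
        for orient, chars in grouped:
            runs.append((orient, "".join(chars)))
    return runs


# The same "5" and U+2030 split differently depending on the matched font.
print(split_runs([(True, "5\u2030\u65e5"), (False, "5\u2030")]))
```

The point of the sketch is the ordering: the runs cannot be computed from the text alone, because the orientation of a "use font" character is unknown until font matching has finished.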

When I asked last month, Koji explained the existence of a vertical
alternate glyph for a given codepoint indicates that a font designer
thinks that codepoint should be set upright using that alternate glyph
[3].  But the Appendix C algorithm is keyed off whether vertical
metrics exist for a font, not whether there's a vertical alternate.

The majority of these "use font" codepoints appear to be brackets or
other punctuation characters in Unicode common blocks but many often
lack vertical alternates so I don't see that this is solving much for
those characters.

What are the specific use cases for special handling of these characters?
Are these primarily for Chinese?  And are there no alternative
codepoints in the ideographic punctuation or fullwidth blocks that
would actually be more natural for authors to use (i.e. the default
characters when using an input method)?

For example, the parentheses, braces and brackets tagged as "use font"
in the Basic Latin block almost never have vertical alternates.  But
their analogues in the Halfwidth and Fullwidth Forms block
(U+FF00-FFEF) usually do, at least for Japanese fonts.  Similarly, are there
punctuation marks from the General Punctuation block (U+2000-206F)
that are preferred to the marks in the CJK Symbols and Punctuation
block (U+3000-303F) or fullwidth block? This is important because
codepoints in the CJK and fullwidth blocks are naturally upright and
the existence of vertical alternates is much more consistent compared
to codepoints in the general block.

Regards,

John Daggett

[1] http://dev.w3.org/csswg/css3-writing-modes/#vertical-typesetting-details

[2] http://lists.w3.org/Archives/Public/www-style/2011Sep/0003.html

[3] http://lists.w3.org/Archives/Public/www-style/2011Aug/0353.html
Received on Sunday, 11 September 2011 14:16:54 UTC