RE: [css3-writing-modes] use case for font-dependent default orientation from Koji Ishii on 2011-09-12 (www-style@w3.org from September 2011)

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Mon, 12 Sep 2011 11:05:12 -0400
To: John Daggett <jdaggett@mozilla.com>
CC: fantasai <fantasai@inkedblade.net>, W3C Style <www-style@w3.org>
Message-ID: <A592E245B36A8949BDB0A302B375FB4E0CF6823B28@MAILR001.mail.lan>
> Koji Ishii wrote:
>
> > Regarding whether "use-font" is per code point or per font,
> > there was a misunderstanding between fantasai and me; I thought
> > it's per code point, while fantasai thought it's font based. We
> > haven't had a good idea how technically feasible "per code
> > point" is.
> 
> Just to be clear, the logic you're describing here is not what is
> in the spec, where orientation is defined for the "use font"
> codepoints (listed again below for reference) as upright or
> sideways based on whether the font has vertical metrics or not.
> 
> What you seem to be describing (please correct me if I'm wrong) is:
> 
>   For codepoints categorized as "use font", if a vertical alternate
>   exists in the font used for a given character, set the character
>   upright and use the vertical alternate, otherwise treat the character
>   as having a sideways orientation and render it rotated.
> 
> Is that closer to what you and fantasai are thinking/proposing?

Yes, and this is a 4th category I'm thinking of adding to solve a problem in the current spec, in addition to the current "upright", "sideways", and "use-font" categories. Fantasai asked me whether we could merge this with "use-font", and I was considering that possibility, but you pointed out that PER MILLE cannot fit this definition, and you're right, so I guess we need 4 categories.

U+2030 PER MILLE SIGN should be "use-font" as currently spec'ed: upright if the font is East Asian, with the decision made per font. That's what Word/InDesign do today, right? Dashes and parentheses should be "use vertical alternate if it exists, otherwise sideways".
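The four categories discussed above can be sketched roughly as follows. This is only an illustration in Python: the `Font` class, its fields, and the category names are hypothetical stand-ins, not anything defined in the spec or in a real font API, and "has vertical metrics" is used as the per-font test because that is how the current spec is described earlier in this thread.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Category(Enum):
    UPRIGHT = "upright"
    SIDEWAYS = "sideways"
    USE_FONT = "use-font"                    # decided once per font
    VERT_ALT_OR_SIDEWAYS = "vert-alt"        # proposed 4th category, per code point


@dataclass
class Font:
    """Hypothetical stand-in for a real font object."""
    name: str
    has_vertical_metrics: bool = False       # assumption: true for East Asian fonts
    vertical_alternates: Set[int] = field(default_factory=set)  # code points with a vertical alternate


def orientation(cp: int, category: Category, font: Font) -> str:
    """Return "upright" or "sideways" for one code point in one font."""
    if category is Category.UPRIGHT:
        return "upright"
    if category is Category.SIDEWAYS:
        return "sideways"
    if category is Category.USE_FONT:
        # Per-font decision: the code point itself is not consulted.
        return "upright" if font.has_vertical_metrics else "sideways"
    # 4th category: per-code-point decision based on the font's alternates.
    return "upright" if cp in font.vertical_alternates else "sideways"


# Example fonts (made up): a Japanese font with a vertical alternate for
# U+2014 EM DASH but none for U+2030 PER MILLE SIGN, and a Western font.
jp = Font("jp", has_vertical_metrics=True, vertical_alternates={0x2014})
latin = Font("latin")
```

With these made-up fonts, `orientation(0x2030, Category.USE_FONT, jp)` gives "upright", while putting U+2030 in the 4th category would give "sideways" because the font has no vertical alternate for it. That is the reason PER MILLE cannot simply be merged into the new category.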


> > Recently Murakami-san told fantasai and me that it's not that
> > difficult to do "per code point", but FreeType doesn't have an
> > API to read GSUB map (it can apply the map though.) We may be
> > able to try to find contributors to implement "read map"
> > feature to FreeType. Given that situation change--if I
> > understand correctly--both fantasai and I are leaning towards
> > a "per code point" based approach. That is, to change current
> > "use-font" to "use alternate glyph if exists, otherwise
> > sideways". This category is another, and better, solution to
> > the unified punctuation issue I believe, but as I said above,
> > I'd be extremely happy to hear any other proposals to solve the
> > issue.
> 
> I think you should be *very* clear about what you're doing here,
> you're putting a big special case in the middle of a text layout
> engine.
> 
> Normally, a run of vertical text would be separated into upright
> runs and rotated runs, broken down into scripts, matched with
> fonts, and the characters laid out by (1) looking up the default
> glyphs in the character map (cmap) and (2) running through the
> features in the font to do both glyph substitution and
> positioning.
> 
> Your special case would effectively be at the *end* of the
> substitution phase, in other words you'd have to trace back and
> figure out whether a 'vert' substitution occurred in addition to
> other substitutions for a given character. Simply dumping out
> glyph substitutions for the 'vert' feature and backmapping that
> to the underlying character would not be correct.  For these "use
> font" codepoints you'd need to double check whether a new
> sideways or upright run needed to be added. Not impossible, just
> complicated and something that we should take pains to avoid
> unless absolutely necessary.

Can I ask which one you're talking about? "Upright if the font has vertical settings" or "Alternate glyph or fallback to sideways", or both?

I'm not very familiar with browser rendering code, and I understand that font fallback makes browsers somewhat different from regular word processors. You explained to me that browsers split text into runs and try to match them with fonts. I'm guessing (sorry if I'm wrong here) that during the font-matching phase, since some code points can fall back to different fonts, a run can be split further at that phase, right? Couldn't glyph orientation be determined at that point?
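To make the guess above concrete, here is a minimal Python sketch of splitting a run at font-matching time and attaching an orientation to each sub-run. Everything in it is an assumption for illustration: the font table is a plain dict from a made-up font name to the set of code points it covers, fallback order is the dict's insertion order, and every character in the run is assumed to be in the "use-font" category. It says nothing about how real browsers order font matching and glyph substitution.

```python
from itertools import groupby
from typing import Dict, List, Optional, Set, Tuple

FontTable = Dict[str, Set[int]]  # hypothetical: font name -> covered code points


def pick_font(ch: str, fonts: FontTable) -> Optional[str]:
    """Return the first font (in fallback order) that covers ch, else None."""
    for name, coverage in fonts.items():
        if ord(ch) in coverage:
            return name
    return None


def split_run_by_font(run: str, fonts: FontTable) -> List[Tuple[Optional[str], str]]:
    """Split a run into maximal sub-runs that resolve to the same font."""
    return [
        (name, "".join(chars))
        for name, chars in groupby(run, key=lambda ch: pick_font(ch, fonts))
    ]


def orient_sub_runs(
    run: str, fonts: FontTable, vertical_fonts: Set[str]
) -> List[Tuple[Optional[str], str, str]]:
    """Attach a per-font orientation to each sub-run.

    Assumes every character in the run is a "use-font" code point, so the
    orientation depends only on whether the matched font is a vertical one.
    """
    return [
        (name, text, "upright" if name in vertical_fonts else "sideways")
        for name, text in split_run_by_font(run, fonts)
    ]
```

For example, with a font table where the made-up "jp" font lacks U+2030 and "latin" covers it, the PER MILLE SIGN falls back to "latin" and its sub-run comes out "sideways", which is the fallback effect discussed elsewhere in this thread.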

You're far better at designing browser code than I am, so I don't think I can be of much help here. But I guess browsers already have code to split runs based on font information, so there should be a way to do that.

"How necessary is it" is a difficult question. The former is required to solve the unified punctuation issue in a similar way to how Word/InDesign have solved it (unless we can come up with another idea to solve the issue). If CSS Text Level 3 can't solve the issue, I guess people might say it's still better than nothing. That was normal behavior for word processors in the early '90s.

But I believe today's expectation is at a level similar to what Word 6.0 or later and InDesign provide. If you can agree with that, we need a solution for unified punctuation code points and PUA.

The latter category has different use cases and therefore a different priority. I think we can get rid of it if we can trust fonts to have a correct GSUB table, but I haven't finished the investigation yet. Sorry for the slow work, but can I have a bit more time?


> > The use case of font-dependent orientation is to solve unified
> > code points, in particular, unified punctuations. I'm not
> > insisting on the current font-dependent orientation is the only
> > solution, I'd be happy to hear any other proposal to solve the
> > issue, but I believe the issue must be solved.
> 
> The basic problem I see with making a codepoint's orientation
> font-dependent is that you make it a property of whatever the
> font designer assumed was the default context (e.g. a Japanese
> font would assume it's upright, a Western font wouldn't) and via
> font fallback you may end up using a font for which those
> assumptions don't match the content.
>
> For example, if a designer uses a vertical writing mode to get
> rotated table headings in a non-CJK language they would be
> confused as to why their double quotes appeared upright (due to
> the fact that the double quote character was pulled from a
> Japanese font via font fallback).

The non-CJK case should be treated separately because we have separate values in text-orientation. "upright-right" should be a good value for East Asian scripts, while "sideways-*" and "upright" should be designed for other scripts as well, and we're discussing "upright-right" here, right?

I do understand that font fallback can cause trouble, though. I had a discussion with a few folks in Japan about this. Murakami-san pointed out that, if we go with the current spec and a Japanese font is missing U+2030 PER MILLE SIGN, for instance, the glyph orientation can change. The conclusion from our discussion was that it's not an issue, for two reasons. One is that it's very unlikely that a Japanese font is missing such an important code point for Japanese text. The other is that, technically speaking, font vendors can put sideways glyphs into vertical alternates, so the UA can control how it renders, but the final visual orientation is really up to the fonts. That said, we have to trust fonts to behave somewhat consistently and do the right thing anyway. Given that, a Japanese font missing U+2030 is a much smaller concern. I do agree that fonts should improve the situation, though, and I'm hoping Eric's proposal can make it better.

 
> One alternative here is to use some form of context marker (e.g.
> the language applied, the surrounding script) to infer whether
> something is handled as upright/sideways.  So for some codepoints
> in the General Punctuation range, they would be contextually
> upright (and vertical alternates applied) for runs of Japanese
> text and sideways otherwise.

Yeah, while I understand it works well in some cases, I'm worried it may bring more confusion than benefit, especially in editing experiences. I'm also worried that developing such contextual logic is very difficult and delicate. I'd be more comfortable if such contextual automatic behavior were part of editing applications, just as Word applies formatting as you type. Users can then choose their favorite applications to edit HTML/CSS, and re-apply different formatting and/or undo the behavior if they want.


> I'm also really puzzled by why parentheses, braces and brackets
> in the Basic Latin range are included in your "use font"
> designation.  That doesn't match any current implementation
> (webkit/ie9) and all of those code points have alternatives that
> are more natural to use within Japanese text.  Nor have I seen a
> Japanese font that has vertical alternates for these codepoints.

I agree that it's a bug in the current spec. Fantasai and I discussed last week adding a rule saying "if a code point has a full-width counterpart, make it sideways," and I believe it can resolve the bug. I haven't updated my table with this rule yet, but will do so very soon.
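One possible way to test the "has a full-width counterpart" rule is to look for characters in the Halfwidth and Fullwidth Forms block whose Unicode compatibility decomposition is tagged `<wide>`. The Python sketch below does that with the standard `unicodedata` module. The function names are made up for illustration, and deriving the rule from `<wide>` decompositions is an assumption about how the rule might be defined, not something fixed in the spec or in my table.

```python
import unicodedata
from typing import Dict


def fullwidth_counterparts() -> Dict[int, int]:
    """Map each base code point to its full-width form.

    Scans the Halfwidth and Fullwidth Forms block (U+FF00..U+FFEF) and keeps
    characters whose compatibility decomposition is tagged <wide>,
    e.g. U+FF08 decomposes to "<wide> 0028".
    """
    mapping: Dict[int, int] = {}
    for cp in range(0xFF00, 0xFFF0):
        decomp = unicodedata.decomposition(chr(cp))
        if decomp.startswith("<wide>"):
            base = int(decomp.split()[1], 16)
            mapping[base] = cp
    return mapping


_WIDE = fullwidth_counterparts()


def has_fullwidth_counterpart(ch: str) -> bool:
    """True if ch has a full-width form, so the proposed rule would make it sideways."""
    return ord(ch) in _WIDE
```

Under this reading, the Basic Latin parentheses, brackets, and braces all have full-width counterparts and would become sideways, while U+2030 PER MILLE SIGN has none and is unaffected by the rule.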


> I think if we can figure out a way to define orientation that is
> not font dependent as it is in the current spec, then we'll be
> very close to having a proposal that's good enough for a first
> version of this spec.

It's true that the current spec has some issues and we need to revise it. But at least the current spec proposes one solution to the unified punctuation and PUA issues. If we're going to go with orientation without font dependency, I would like to hear how it can solve the issue. Or, concluding that CSS cannot solve the issue is also an option, but I'd like to hear what you would say about this.

One thing to add: I talked with someone at Toppan Printing Co., Ltd., one of the biggest printing companies in Japan. He told me that they have a table of "default glyph orientation" to apply when the author did not specify one. I asked him to provide it to us, so we might be able to get it soon.


Regards,
Koji
Received on Monday, 12 September 2011 15:05:40 UTC