Re: use case for font-dependent default orientation from Eric Muller on 2011-09-13 (www-style@w3.org from September 2011)

From: Eric Muller <emuller@adobe.com>
Date: Tue, 13 Sep 2011 11:05:52 -0700
To: www-style <www-style@w3.org>
Message-ID: <4E6F9B80.4090203@adobe.com>
I think there are three fundamental approaches. In all cases, there will 
be scenarios in which the determination is not what the user wants, and 
consequently, we need some kind of markup to impose an orientation; so 
we can focus on the determination in the absence of markup:

1. the character alone determines the default orientation

2. the character and the font used to display the character determine 
the default orientation

3. the character and its context (neighboring characters) determine the 
default orientation

In my experience, the character context is extremely difficult to use. 
Consider the quotes (U+201C “ LEFT DOUBLE QUOTATION MARK and friends); 
if they are used to bracket sideways text, they should probably go 
sideways, while if they bracket upright text, they should probably go 
upright. The problem is that it is difficult to reliably determine 
mechanically what is bracketed, because the same character can be used 
to start bracketing in some cases and to end bracketing in others; and 
there are also cases where this same character is used for other 
purposes than bracketing. Layout is just too low-level (i.e. not enough 
is known about the text) to make the proper analysis of the text.

The font context is also difficult to use. The problem here is that we 
have to use circumstantial evidence of the font content; there is no 
data in the font that specifically answers our question. I have seen a 
variety of circumstantial evidence being used (for this and other 
problems): which cmap subtables are present, which characters have 
glyphs (other than .notdef), which OS/2 ulRange bits are set, whether 
there are vertical metrics, whether a glyph width is 1em, whether the 
GSUB 'vert' feature is present, whether GSUB 'vert' changes the glyph, 
the CID of the glyph for CFF/CID-Keyed/well-known ROS fonts, etc. At the 
end of the day, all those methods have proven to be fragile (15 years 
later, we are still tweaking the heuristics in our products, and we 
still get complaints), and they are not surviving the web world very 
nicely (e.g. runtime font fallback, font subsetting). They also 
reflect the primary use that the font designer had in mind, rather than 
what the document author has in mind.  And in addition, we have the 
considerations that John mentioned, i.e. the complexity of layout engines.

That's why I think we should go with 1: the character alone determines 
the default orientation. It is simple, it is robust. I also think that 
it is adequate, i.e. we will need little markup if any in the vast 
majority of documents. I agree with Koji that if "[upright] has 
priority on compatibility with existing documents rather than 
multi-lingual capability, I believe it can solve most of unified 
punctuation issues." Don't be too concerned by the values I put in 
my proposal, in particular for the punctuation; they were just to get a 
starting point. By the way, the logic I used was broadly aligned with 
what seems agreeable to you: only characters which are definitely not 
part of the Japanese writing system, in a broad sense, are sideways.
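
To make approach 1 concrete, here is a minimal sketch of a lookup that depends on the character alone. The name default_orientation and the ranges in the table are assumptions for illustration only; they are not the values from my proposal, just a few Unicode blocks that clearly belong to the Japanese writing system.

```python
# Hypothetical table for illustration; NOT the values from the proposal.
# Each entry is an inclusive range of code points treated as upright.
UPRIGHT_RANGES = [
    (0x3000, 0x303F),  # CJK Symbols and Punctuation
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xFF00, 0xFFEF),  # Halfwidth and Fullwidth Forms
]


def default_orientation(ch: str) -> str:
    """Return "upright" or "sideways" using only the character itself.

    No font data and no neighboring characters are consulted, so the
    answer is the same under font fallback or subsetting.
    """
    cp = ord(ch)
    for lo, hi in UPRIGHT_RANGES:
        if lo <= cp <= hi:
            return "upright"
    # Anything not listed is assumed not to be part of the Japanese
    # writing system and is set sideways.
    return "sideways"
```

A real table would cover far more of Unicode, but the shape stays the same: one static mapping from character to orientation.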

A fourth alternative has been mentioned: the character and its locale. 
This does not have the problem of 3, as there is no analysis of the 
text. The big question in my mind is whether that buys enough to warrant 
the complexity. I think we need very specific scenarios before we can 
decide that. I also share some of the concerns expressed by Koji, in 
particular the overload of functionality (layout, spell checking, speech 
synthesis) on a single thing.

I am not worried about a mismatch with existing authoring applications, 
such as InDesign or Word. They can do whatever they want to determine 
the orientation, and at the time they generate HTML, they can compare 
the orientation they determined with the orientation mandated by CSS, 
and insert markup as needed. In fact, the simpler the CSS determination, 
the more robust this is.
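
As a rough sketch of that export step, assuming the application has already decided an orientation for every character: compare each decision with the CSS default and wrap only the runs that disagree. The names css_default and export_html are hypothetical, css_default is a crude stand-in for whatever CSS ends up mandating, and the use of a span with text-orientation is just one possible form of markup.

```python
from html import escape
from itertools import groupby


def css_default(ch: str) -> str:
    # Hypothetical stand-in for the per-character default mandated by CSS.
    return "upright" if 0x3000 <= ord(ch) <= 0x9FFF else "sideways"


def export_html(chars_with_orientation):
    """Emit HTML for a list of (character, orientation chosen by the app).

    Markup is inserted only where the application's choice differs from
    the CSS default; everything else is emitted as plain text.
    """
    def override(pair):
        ch, wanted = pair
        # None means "CSS already does what the app wants".
        return None if wanted == css_default(ch) else wanted

    out = []
    for wanted, group in groupby(chars_with_orientation, key=override):
        text = escape("".join(ch for ch, _ in group))
        if wanted is None:
            out.append(text)
        else:
            out.append(
                f'<span style="text-orientation: {wanted}">{text}</span>'
            )
    return "".join(out)
```

For example, if the application set a left double quotation mark upright between a Latin letter and an ideograph, only the quotation mark would be wrapped. Because css_default looks at nothing but the character, the exporter can reproduce it exactly.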

Eric.
Received on Tuesday, 13 September 2011 18:07:13 UTC