- From: Eric Muller <emuller@adobe.com>
- Date: Tue, 13 Sep 2011 11:05:52 -0700
- To: www-style <www-style@w3.org>
- Message-ID: <4E6F9B80.4090203@adobe.com>
I think there are three fundamental approaches. In all cases, there will be scenarios in which the determination is not what the user wants, and consequently, we need some kind of markup to impose an orientation; so we can focus on the determination in the absence of markup:

1. the character alone determines the default orientation
2. the character and the font used to display the character determine the default orientation
3. the character and its context (neighboring characters) determine the default orientation

In my experience, the character context is extremely difficult to use. Consider the quotes (U+201C “ LEFT DOUBLE QUOTATION MARK and friends); if they are used to bracket sideways text, they should probably go sideways, while if they bracket upright text, they should probably go upright. The problem is that it is difficult to reliably determine mechanically what is bracketed, because the same character can be used to start bracketing in some cases and to end bracketing in others; and there are also cases where this same character is used for purposes other than bracketing. Layout is just too low-level (i.e. not enough is known about the text) to make the proper analysis of the text.

The font context is also difficult to use. The problem here is that we have to use circumstantial evidence of the font content; there is no data in the font that specifically answers our question. I have seen a variety of circumstantial evidence being used (for this and other problems): which cmap subtables are present, which characters have glyphs (other than .notdef), which OS/2 ulRange bits are set, whether there are vertical metrics, whether a glyph width is 1em, whether the GSUB 'vert' feature is present, whether GSUB 'vert' changes the glyph, the CID of the glyph for CFF/CID-Keyed/well-known ROS fonts, etc.
At the end of the day, all those methods have proven to be fragile (15 years later, we are still tweaking the heuristics in our products, and we still get complaints), and they are not surviving the web world very nicely (e.g. runtime font fallback, font subsetting, etc). They also reflect the primary use that the font designer had in mind, rather than what the document author has in mind.

And in addition, we have the considerations that John mentioned, i.e. the complexity of layout engines.

That's why I think we should go with 1: the character alone determines the default orientation. It is simple, it is robust. I also think that it is adequate, i.e. we will need little markup, if any, in the vast majority of documents. I agree with Koji that if "[upright] has priority on compatibility with existing documents rather than multi-lingual capability, I believe it can solve most of unified punctuation issues."

Don't be too concerned by the values I put in my proposal, in particular for the punctuation; they were just to get a starting point. By the way, the logic I used was broadly aligned with what seems agreeable to you: only characters which are definitely not part of the Japanese writing system, in a broad sense, are sideways.

A fourth alternative has been mentioned: the character and its locale. This does not have the problem of 3, as there is no analysis of the text. The big question in my mind is whether that buys enough to warrant the complexity. I think we need very specific scenarios before we can decide that. I also share some of the concerns expressed by Koji, in particular the overload of functionality (layout, spell checking, speech synthesis) on a single thing.

I am not worried about a mismatch with existing authoring applications, such as InDesign or Word.
They can do whatever they want to determine the orientation, and at the time they generate HTML, they can compare the orientation they determined with the orientation mandated by CSS, and insert markup as needed. In fact, the simpler the CSS determination, the more robust this is.

Eric.
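
The export step described in the last paragraph can be sketched roughly as follows. This is only an illustration under stated assumptions, not the values from the proposal: `default_orientation` is a hypothetical stand-in for approach 1 that treats East Asian wide and fullwidth characters as upright and everything else as sideways, `export_html` is a hypothetical helper name, and the `text-orientation` inline style is assumed as the overriding markup.

```python
import unicodedata
from html import escape
from typing import Sequence


def default_orientation(ch: str) -> str:
    """Hypothetical approach-1 rule: the character alone decides."""
    # Stand-in table (an assumption, not the proposal's data):
    # wide/fullwidth characters are upright, all others sideways.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return "upright"
    return "sideways"


def export_html(text: str, app_orientations: Sequence[str]) -> str:
    """Emit markup only where the application's choice differs from the default."""
    out = []
    for ch, wanted in zip(text, app_orientations):
        if wanted == default_orientation(ch):
            # The default already gives the desired result: no markup needed.
            out.append(escape(ch))
        else:
            # Mismatch: impose the application's orientation explicitly.
            out.append(
                f'<span style="text-orientation: {wanted}">{escape(ch)}</span>'
            )
    return "".join(out)
```

Because the default depends on nothing but the character, the exporter can predict it without knowing which font the browser will end up using, which is the robustness argument made above.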
Received on Tuesday, 13 September 2011 18:07:13 UTC