Re: [css3-writing-modes] bidi-style resolution of punctuation orientation from fantasai on 2011-07-15 (www-style@w3.org from July 2011)

From: fantasai <fantasai.lists@inkedblade.net>
Date: Thu, 14 Jul 2011 17:35:47 -0700
To: www-style@w3.org
Message-ID: <4E1F8B63.2040406@inkedblade.net>

On 07/12/2011 09:25 PM, Stephen Zilles wrote:
>
> A different way of looking at this problem is to use the rules for script determination
> that are expressed in Unicode TR24, particularly sections 2.2 and 2.3. The main idea is
> that most scripts (not all) have a natural orientation (horizontal or vertical). [As you
> note elsewhere, some scripts like East Asian and Mongolian based can occur in either
> orientation.] Therefore, by using the suggested rules by which neutral (or inherited)
> characters are given a script, the script-classified characters can then be given the
> same treatment as the script to which they are assigned. [Note there are some important
> provisos related to how successful this script assignment can be.]

The rules in UAX24 for resolving the script of a series of combining characters are certainly
something we should adopt, but the problem of neutral punctuation that doesn't combine
remains.

The suggestion in UAX24 2.2 is to use the script context of the preceding character. Is
that what you meant?
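
To make the question concrete, here is a minimal sketch of the "inherit the script of the preceding character" reading of UAX24 2.2. The `toy_script` table is a hypothetical stand-in for the real Scripts.txt data (it only knows a few ranges), and `resolve_scripts` is a name made up for illustration, not anything from the spec or UAX24.

```python
# Sketch only: toy_script is a hypothetical, heavily truncated stand-in
# for the Unicode Script property (the real data lives in Scripts.txt).
def toy_script(ch):
    cp = ord(ch)
    if 0x0041 <= cp <= 0x005A or 0x0061 <= cp <= 0x007A:
        return "Latin"
    if 0x3040 <= cp <= 0x309F:
        return "Hiragana"
    if 0x4E00 <= cp <= 0x9FFF:
        return "Han"
    if 0x0300 <= cp <= 0x036F:
        return "Inherited"  # combining diacritical marks
    return "Common"         # everything else, including punctuation


def resolve_scripts(text, script_of=toy_script):
    """Give each Common/Inherited character the script of the nearest
    preceding character that has an explicit script.

    Characters with no such predecessor resolve to None.
    """
    resolved = []
    previous = None  # last explicit script seen so far
    for ch in text:
        script = script_of(ch)
        if script in ("Common", "Inherited"):
            script = previous
        else:
            previous = script
        resolved.append((ch, script))
    return resolved


print(resolve_scripts("日本。"))
print(resolve_scripts("!a"))
```

Under this rule the ideographic full stop picks up Han from the character before it, while a neutral at the start of a run has nothing to inherit from and stays unresolved, which is one place the non-combining punctuation problem shows up.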

> I am not at all sure why requiring additional markup when the wrong thing happens by default
> is a bad thing. Unicode Bidi recognized that the default algorithm will sometimes do the
> wrong thing and added special (non-displaying) characters to change the interpretation.
> Markup is not so different than this. Why is it bad?

Where did I say it was bad? I only said that we need a way to override the automagic. There are
better and worse ways to override the automagic: ways that result in better markup
(markup that matches the linguistic structure more closely) and ways that result in worse
markup (arbitrarily presentational markup).

> Perhaps, you have too many cases in the description that follows this comment.

You can't say I have too many cases. I'm writing a spec here, right? If it's not exhaustive,
never mind the interop problems, jdaggett will chew me out for the next three telecons for
incompetence. :P

I can't throw all the characters in the So category into one bucket (I wish I could!)
because they can't all be treated the same way. So I needed to break it down. I did it by
category rather than by codepoint, as there are rather a lot of them (over 3000) and I needed
some way of thinking about the list abstractly.
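
For anyone who wants to look at the list themselves, a rough sketch using Python's standard `unicodedata` module follows. The exact count depends on the Unicode version the Python build ships with, and grouping by 256-codepoint "page" is only a crude proxy I'm assuming for illustration; it is not the categorization used in the spec.

```python
import unicodedata
from collections import Counter


def so_code_points():
    """Return every code point whose General_Category is So (Symbol, other)."""
    return [cp for cp in range(0x110000)
            if unicodedata.category(chr(cp)) == "So"]


def count_by_page(code_points):
    """Count code points per 256-codepoint page (cp >> 8).

    This is an arbitrary grouping chosen for the sketch, not a Unicode block.
    """
    return Counter(cp >> 8 for cp in code_points)


points = so_code_points()
pages = count_by_page(points)
print(len(points), "So characters spread over", len(pages), "pages")

# Show the five most crowded pages, with the name of one sample character each.
for page, count in pages.most_common(5):
    sample = next(cp for cp in points if cp >> 8 == page)
    print(f"U+{page:02X}xx: {count} characters, e.g. "
          f"{unicodedata.name(chr(sample), '<unnamed>')}")
```

Even this crude grouping shows how unevenly the So characters are distributed, which is why some abstract way of slicing the list is needed.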

> I tend to agree with Florian that picture based characters ought to always appear as a picture.

I'm perfectly happy with this conclusion. (It's what I have in the spec right now, actually.)

> [And, yes, I did read your example of a snowman in a winter phrase, but it was not a very
> convincing case, IMO.]

It's not supposed to be a convincing case; it's supposed to be an example, so that as you're
thinking about the question of whether this class of characters should be upright or sideways
or context-dependent, you have a concrete example to imagine.

~fantasai

Received on Friday, 15 July 2011 00:36:27 UTC