Re: per-paragraph auto-direction, a.k.a. dir=uba

On 09/14/2010 08:39 AM, Aharon (Vladimir) Lanin wrote:
> I second Mati's proposal. As Ehsan has pointed out, <textarea readonly>
> can be used to display plain text (as opposed to editing it), albeit
> somewhat awkwardly. And if we work out a bigger proposal, we can tweak
> the bugs filed on HTML5 and CSS3. Right now it is imperative to produce
> a new draft of the proposal and file the bugs (October 1 deadline for
> HTML5..., apparently).
>
> I do want to put dir=uba under the dir=auto umbrella, via
> autodirmethod=first-strong|any-rtl|uba. autodirmethod inherits; dir does
> not. The default would probably be first-strong.
>
> Is this agreeable to everyone? Please respond.
>
> Here is a quick spec for dir=auto:
>
>     * Using dir=auto with autodirmethod=uba would (by default) set
>       unicode-bidi to "uba" and direction according to first-strong.
>       (Note that this includes leaving direction at the inherited value
>       if the content is neutral.)
>     * For elements other than <textarea>, unicode-bidi:uba is treated as
>       unicode-bidi:isolate.
>     * On <textarea>, unicode-bidi:uba means that:
>           o The UBA on the textarea content is invoked specifying only a
>             default paragraph level (in icu4j terminology, either
>             LEVEL_DEFAULT_LTR or LEVEL_DEFAULT_RTL), based on the
>             element's own direction value as calculated above. (This
>             makes the all-neutral paragraphs use the same direction as
>             the first paragraph that is not all-neutral.)
>           o Each UBA paragraph’s lines’ alignment is determined by the
>             paragraph’s resolved base level when the element's
>             text-align is start or end.

This is too complicated. If uba cannot in fact be triggered on anything
other than a <textarea>, then it should not be allowed on anything other
than a <textarea>. I suggest having

   dir=ltr|rtl|auto|plaintext
   autodirmethod=first-strong|any-rtl

I suggest "plaintext" instead of "uba" because it makes the behavior
and the intended use case clearer. (Since we're only allowing uba
on <textarea> and using dir=auto for first-strong, we don't need the
name to be so cryptically short.)
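
To make the per-paragraph behavior in the quoted spec concrete, here is
a minimal Python sketch. It is an illustration under assumptions, not
the proposal's wording: the function names are hypothetical, paragraphs
are split with str.splitlines() as a stand-in for UBA paragraph
separators, and only the bidi classes L, R and AL are treated as strong.
The element's own direction comes from first-strong over the whole
content (falling back to the inherited value), and each all-neutral
paragraph then takes the element's direction.

```python
import unicodedata

def first_strong(text):
    """Return 'ltr' or 'rtl' for the first strong character, else None."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            return "ltr"
        if bidi in ("R", "AL"):
            return "rtl"
    return None  # all-neutral (or empty) text

def plaintext_directions(text, inherited="ltr"):
    """Hypothetical helper: one resolved direction per paragraph."""
    # Element direction: first-strong over everything, else inherited.
    element_dir = first_strong(text) or inherited
    # Each paragraph: its own first-strong, else the element direction.
    return [first_strong(p) or element_dir for p in text.splitlines()]
```

With this sketch, a leading all-digit paragraph followed by a Hebrew
paragraph and then an English one would resolve to rtl, rtl, ltr: the
neutral paragraph follows the first paragraph that is not all-neutral.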

> # The part of the text after the first X characters (where the text in
> nodes excluded above are not part of the count). Do we need this? If
> so, what's a good X value? 100?

I think that for any-rtl, having an X value is both better for
performance and more likely to give good results. If the first X
characters are LTR, where X is longer than most LTR phrases commonly
imported into RTL text, chances are that any RTL characters after that
do not indicate the paragraph's main direction.
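
A rough Python sketch of any-rtl with a cutoff follows. The details are
assumptions for illustration only: the function and parameter names are
hypothetical, the cutoff of 100 is just the value floated in the quoted
question, and the sketch works on plain text, so it does not model the
excluded nodes mentioned there. Only the first `limit` characters are
scanned; any strong RTL character among them yields rtl.

```python
import unicodedata

def any_rtl(text, limit=100, inherited="ltr"):
    """Hypothetical any-rtl estimate over the first `limit` characters."""
    saw_ltr = False
    for ch in text[:limit]:  # characters past the cutoff are ignored
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):
            return "rtl"  # one strong RTL character is enough
        if bidi == "L":
            saw_ltr = True
    # No RTL within the cutoff: ltr if any strong LTR was seen,
    # otherwise keep the inherited direction for neutral content.
    return "ltr" if saw_ltr else inherited
```

The cutoff bounds the scan at `limit` characters per element, and an RTL
word appearing only after a long LTR run no longer flips the direction.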

~fantasai

Received on Tuesday, 14 September 2010 18:56:45 UTC