Re: per-paragraph auto-direction, a.k.a. dir=uba

Here is an updated spec for dir=auto and for
autodirmethod=first-strong|any-rtl|plaintext, attempting to reflect
discussion that has taken place since the previous attempt. Please respond!

   - The autodirmethod attribute value inherits if not explicitly specified.
   For the root element, it's first-strong by default.
   - Re dir attribute value inheritance, it is unchanged from current
   browser behavior (which does not exactly follow the HTML 4 spec). The
   current behavior can be specified in two alternative forms:
      - (In terms of CSS) The dir attribute value does not inherit. For the
      root element, it is "ltr" by default. In the absence of the dir attribute
      and an explicit CSS direction property, the CSS direction property
      inherits, but this only affects elements that place their contents in a
      separate UBA paragraph or embedding level, e.g. elements with CSS display
      other than "inline", elements with CSS unicode-bidi other than "normal",
      <textarea>, and <input type="text">. Setting the dir attribute not only
      updates the CSS direction property but also sets the CSS unicode-bidi
      property to a value other than "normal", thus making sure that the
      direction property has an effect.
      - (More independent of CSS) For elements that place their contents in
      a separate UBA paragraph or embedding level, e.g. elements with CSS
      display other than "inline", elements with CSS unicode-bidi other than
      "normal", <textarea>, and <input type="text">, the dir attribute inherits
      from the closest ancestor with a value (or "ltr" if none). For elements
      that do not place their contents in a separate UBA paragraph or embedding
      level, the dir attribute is not inherited and has no default value.
   - Using dir=auto on an element with autodirmethod=first-strong|any-rtl
   would:
      - Make the default value for the ubi attribute ubi (i.e. on), as
      described elsewhere.
      - Set the CSS direction to ltr or rtl according to the indicated
      algorithm.
      - Invoke the indicated algorithm on the in-order traversal of the
      descendant text nodes, with the following exceptions:
         - Text nodes under a descendant element with an explicit dir
         attribute (including dir=auto).
         - The part of the text after the first 100 characters (where the
         text in nodes excluded above is not part of the count). *Still no
         agreement on the value to use here...*
         - Parts of the text between an LRE, RLE, LRO, or RLO and its
         matching PDF.
      - The first-strong algorithm returns the direction of the first strong
      (L, AL, or R) character it encounters. If it does not encounter any, it
      returns ltr if it encounters any weak ltr characters (EN or AN). If it
      does not encounter any of those either, it returns the inherited
      direction.
      - The any-rtl algorithm returns rtl if it encounters any strong RTL
      character, or ltr otherwise.
   - For <textarea> and <pre>, dir=auto with autodirmethod=plaintext would
   set unicode-bidi to "plaintext" and direction according to first-strong.
   *Alternatively, we can just leave the direction at the inherited value.
   The advantage of doing that is simplicity. The advantage of setting
   direction according to first-strong is that this makes the all-neutral
   paragraphs in the element use the same direction as the first paragraph
   that is not all-neutral. Please indicate which you prefer.*
   - For elements other than <textarea> and <pre>, dir=auto with
   autodirmethod=plaintext is treated the same as dir=auto
   autodirmethod=first-strong.
   - unicode-bidi:plaintext definition (updated from current spec!):
      - For display:inline elements, the element is directionally isolated
      in a separate UBA paragraph, just like unicode-bidi:isolate.
      - For all UBA paragraphs that get their paragraph level (i.e. base
      direction) from this element, their paragraph level is determined not by
      the element's computed 'direction' property as usual, but by following
      rules P1, P2, and P3 of the Unicode bidirectional algorithm, i.e. their
      content. However, if no direction-determining character is found in step
      P2, then the value of the 'direction' property is used instead. Thus,
      different UBA paragraphs within the element may have different paragraph
      levels.
      - If the element's text-align or text-align-last is start or end, for
      every line box contained completely within the element, the alignment is
      determined by the paragraph level of the containing UBA paragraph that
      got its paragraph level from this element. *Is this a reasonable
      formulation? If we get rid of this bullet, the direction can be
      per-paragraph, but alignment can't.*
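
To make the first-strong, any-rtl, and plaintext rules above easier to check
against examples, here is a rough Python sketch. It is an illustration under
stated assumptions, not a proposed implementation. The function names are
made up for this sketch. The input is assumed to be one flat string from
which text under descendants with an explicit dir attribute has already been
removed. The limit of 100 is only the placeholder value from the list above,
and characters inside LRE/RLE/LRO/RLO ... PDF spans are assumed to count
toward that limit even though they are skipped. For plaintext, paragraphs are
assumed to be separated by "\n" only. Bidi classes come from Python's standard
unicodedata module.

```python
import unicodedata

STRONG_RTL = {"R", "AL"}          # strong right-to-left Bidi classes
WEAK_LTR = {"EN", "AN"}           # the "weak ltr" classes named in the list above
EMBED_OPEN = {"\u202A", "\u202B", "\u202D", "\u202E"}  # LRE, RLE, LRO, RLO
PDF = "\u202C"
MAX_CHARS = 100                   # placeholder; no agreement on this value yet


def scanned_chars(text, limit=MAX_CHARS):
    """Yield the characters the algorithms look at: the first `limit`
    characters, minus anything between an embedding/override and its PDF."""
    depth = 0
    for ch in text[:limit]:
        if ch in EMBED_OPEN:
            depth += 1
        elif ch == PDF:
            depth = max(0, depth - 1)
        elif depth == 0:
            yield ch


def first_strong(text, inherited="ltr", limit=MAX_CHARS):
    """Direction of the first strong character; else ltr if any EN/AN was
    seen; else the inherited direction."""
    saw_weak = False
    for ch in scanned_chars(text, limit):
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in STRONG_RTL:
            return "rtl"
        if cls in WEAK_LTR:
            saw_weak = True
    return "ltr" if saw_weak else inherited


def any_rtl(text, limit=MAX_CHARS):
    """rtl if any strong RTL character is seen, ltr otherwise."""
    for ch in scanned_chars(text, limit):
        if unicodedata.bidirectional(ch) in STRONG_RTL:
            return "rtl"
    return "ltr"


def plaintext_paragraph_dirs(text, direction="ltr"):
    """Per-paragraph base direction for unicode-bidi:plaintext: first strong
    character of each paragraph, falling back to the 'direction' value."""
    dirs = []
    for para in text.split("\n"):
        para_dir = direction
        for ch in para:
            cls = unicodedata.bidirectional(ch)
            if cls == "L":
                para_dir = "ltr"
                break
            if cls in STRONG_RTL:
                para_dir = "rtl"
                break
        dirs.append(para_dir)
    return dirs
```

For example, under these assumptions first_strong("123", inherited="rtl")
gives ltr because of the EN digits, while
plaintext_paragraph_dirs("abc\n123", "rtl") gives ltr for the first paragraph
and rtl for the second, since the all-neutral/weak paragraph falls back to the
element's direction.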

Aharon

On Tue, Sep 14, 2010 at 5:39 PM, Aharon (Vladimir) Lanin
<aharon@google.com> wrote:

> I second Mati's proposal. As Ehsan has pointed out <textarea readonly> can
> be used to display plain text (as opposed to edit it), even though somewhat
> awkwardly. And if we work out a bigger proposal, we can tweak the bugs filed
> on HTML5 and CSS3. Right now it is imperative to produce a new draft of the
> proposal and file the bugs (October 1 deadline for HTML5..., apparently).
>
> I do want to put dir=uba under the dir=auto umbrella, via
> autodirmethod=first-strong|any-rtl|uba. autodirmethod inherits; dir does
> not. The default would probably be first-strong.
>
>  Is this agreeable to everyone? Please respond.
>
> Here is a quick spec for dir=auto:
>
>    - Using dir=auto with autodirmethod=first-strong|any-rtl would:
>       - Make the default value for the ubi attribute ubi (i.e. on), as
>       described elsewhere.
>       - Set the CSS direction to ltr or rtl according to the indicated
>       algorithm.
>       - Invoke the indicated algorithm on the in-order traversal of the
>       descendent text nodes, with the following exceptions:
>          - Text nodes under a descendant element with an explicit dir
>          attribute (including dir=auto).
>          - The part of the text after the first X characters (where the
>          text in nodes excluded above are not part of the count). *Do we
>          need this? If so, what's a good X value? 100?*
>          - Parts of the text between an LRE, RLE, LRO, RLO, and its
>          matching PDF.
>       - The first-strong algorithm returns the direction of the first
>       strong (L, AL, or R) character it encounters. If it does not encounter any,
>       it returns ltr if it encounters any weak ltr characters (EN or AN). If it
>       does not encounter any of those either, it returns the inherited direction.
>       - The any-rtl algorithm returns rtl if it encounters any strong RTL
>       character, or ltr otherwise.
>    - Using dir=auto with autodirmethod=uba would (by default) set
>    unicode-bidi to "uba" and direction according to first-strong. (Note that
>    this includes leaving direction at the inherited value if the content is
>    neutral.)
>    - For elements other than <textarea>, unicode-bidi:uba is treated as
>    unicode-bidi:isolate.
>    - On <textarea>, unicode-bidi:uba means that:
>       - The UBA on the textarea content is invoked specifying only a
>       default paragraph level (in icu4j terminology, either LEVEL_DEFAULT_LTR or
>       LEVEL_DEFAULT_RTL), based on the element's own direction value as
>       calculated above. (This makes the all-neutral paragraphs use the same
>       direction as the first paragraph that is not all-neutral.)
>       - Each UBA paragraph’s lines’ alignment is determined by the
>       paragraph’s resolved base level when the element's text-align is start or
>       end.
>
>
> Aharon
>
>
> On Tue, Sep 14, 2010 at 2:21 PM, Matitiahu Allouche <matial@il.ibm.com> wrote:
>
>> I don't feel competent enough to find the magic solution for all the
>> questions that dir=uba seems to raise.  This discussion has been going on
>> for a while, and there is real danger that the whole item be shelved if a
>> consensus is not found soon.
>>
>> However, based on the discussion on the list, I think that the following
>> points are more or less agreeable to all:
>> 1) dir=uba is mostly needed for <pre> and <textarea> elements.
>> 2) All or most of the problems are related to using dir=uba with <pre>.
>> 3) There are alternatives to using dir=uba with <pre> for multiple
>> paragraphs, like separating the text in distinct paragraphs.
>> 4) There is no problem related to using dir=uba with <textarea>.
>> 5) There is no other way than dir=uba to achieve paragraph-based direction
>> for <textarea>.
>>
>> Given the above, I am suggesting to at least allow dir=uba for <textarea>,
>> even if its use for other types of elements is postponed or abandoned
>> altogether.
>>
>>
>> Shalom (Regards),  Mati
>>           Bidi Architect
>>           Globalization Center Of Competency - Bidirectional Scripts
>>           IBM Israel
>>           Phone: +972 2 5888802    Fax: +972 2 5870333
>>           Mobile: +972 52 2554160
>>
>
>

Received on Sunday, 26 September 2010 00:02:43 UTC