W3C home > Mailing lists > Public > public-i18n-geo@w3.org > March 2003

bidi

From: Tex Texin <tex@i18nguy.com>
Date: Fri, 14 Mar 2003 15:29:19 -0500
Message-ID: <3E723B9F.46DE5867@i18nguy.com>
To: GEO <public-i18n-geo@w3.org>

I have begun working on the bidi section, and have a rough idea of what I need
to cover.

I wrote this into and would like feedback on it, while I munge the outline
further.

My goal was to identify what was relevant about setting direction at a high
level without getting bogged down in the unicode bidi algorithm nitty
grittys...

I think 3 things are affected by the setting of direction, and one item which
people might expect is independent (alignment).
Anyway, is the following understandable? Is it a reasonable 50,000 foot
summary of the bidi algorithm without doing it an injustice and without
sucking in the other 500 definitions and details?
After you guys look at it, I'll send it to the ig list.
tex


Some languages, such as Arabic and Hebrew,  are written right-to-left. Often,
these languages also incorporate text that is written left-to-right so they
are known as bidirectional. Before creating documents that include
right-to-left text you will want to consider the following:

1) Right-to-left readers scan and process documents differently from
left-to-right readers. Therefore, graphics, forms and overall document layout,
which have been designed for LTR readers, will typically need to be modified
to be correctly understood and usable by  RTL readers. Based on the preferred
reading direction of your target audience and your content, you should decide
whether the entire document should have a RTL direction, or just certain
sections or spans of text. This decision will then help you decide the
document layout and whether any graphics, forms, or sections of the document
need modifications for RTL readers. It may be helpful, when mixing sizable
portions of both LTR and RTL text to section the document into "bands" of
text, where each band is primarily of one directionality.

2) What does it mean to set direction?
Setting direction establishes the initial state of text flow. This setting
affects text rendering and in particular determines: 1) layout of block
elements, 2) the relative positioning of runs of text, where direction changes
from run to run, and 3) the placement of certain Unicode characters. Note that
text alignment is separate from direction and must be set independently in
HTML.

Layout
At the highest levels of rendering a page, the flow direction establishes
whether table and block elements are filled in from the right or the left.

Text runs
Text is rendered using the Unicode bidirectional algorithm. The algorithm
processes text paragraph by paragraph. Within each paragraph, text is
segmented into runs with the same direction. The state of direction is
established by character properties or by specific Unicode characters that
explicitly set direction. Proceeding within a paragraph from run to run, each
direction change establishes an additional "direction embedding level". The
text runs are positioned relative to each other by the algorithm. Changing the
initial state of text flow can change the position of the runs.

Neutral Unicode Characters
Unicode characters have a direction property. Most spacing characters have the
property of either "Strong LTR" or "Strong RTL" direction. Non-spacing
characters follow the direction of their base character. The bidirectional
algorithm renders these characters  according to their directionality. The
majority of Unicode characters have a strong direction and therefore are
rendered based on their property value. Other characters have weak or neutral
property values. Weak characters are rendered according to their proximity to
characters with "strong" directionality. Neutral characters are rendered by
either their proximity to characters with "strong" directionality  or the
current embedding level's direction. If the initial state of text flow is
altered, the position of neutral characters may be affected.

tex
Received on Friday, 14 March 2003 15:30:39 UTC

This archive was generated by hypermail 2.4.0 : Friday, 17 January 2020 16:03:03 UTC