- From: Tex Texin <tex@i18nguy.com>
- Date: Fri, 14 Mar 2003 15:29:19 -0500
- To: GEO <public-i18n-geo@w3.org>
I have begun working on the bidi section, and have a rough idea of what I need to cover. I wrote this into and would like feedback on it, while I munge the outline further. My goal was to identify what was relevant about setting direction at a high level without getting bogged down in the unicode bidi algorithm nitty grittys... I think 3 things are affected by the setting of direction, and one item which people might expect is independent (alignment). Anyway, is the following understandable? Is it a reasonable 50,000 foot summary of the bidi algorithm without doing it an injustice and without sucking in the other 500 definitions and details? After you guys look at it, I'll send it to the ig list. tex Some languages, such as Arabic and Hebrew, are written right-to-left. Often, these languages also incorporate text that is written left-to-right so they are known as bidirectional. Before creating documents that include right-to-left text you will want to consider the following: 1) Right-to-left readers scan and process documents differently from left-to-right readers. Therefore, graphics, forms and overall document layout, which have been designed for LTR readers, will typically need to be modified to be correctly understood and usable by RTL readers. Based on the preferred reading direction of your target audience and your content, you should decide whether the entire document should have a RTL direction, or just certain sections or spans of text. This decision will then help you decide the document layout and whether any graphics, forms, or sections of the document need modifications for RTL readers. It may be helpful, when mixing sizable portions of both LTR and RTL text to section the document into "bands" of text, where each band is primarily of one directionality. 2) What does it mean to set direction? Setting direction establishes the initial state of text flow. This setting affects text rendering and in particular determines: 1) layout of block elements, 2) the relative positioning of runs of text, where direction changes from run to run, and 3) the placement of certain Unicode characters. Note that text alignment is separate from direction and must be set independently in HTML. Layout At the highest levels of rendering a page, the flow direction establishes whether table and block elements are filled in from the right or the left. Text runs Text is rendered using the Unicode bidirectional algorithm. The algorithm processes text paragraph by paragraph. Within each paragraph, text is segmented into runs with the same direction. The state of direction is established by character properties or by specific Unicode characters that explicitly set direction. Proceeding within a paragraph from run to run, each direction change establishes an additional "direction embedding level". The text runs are positioned relative to each other by the algorithm. Changing the initial state of text flow can change the position of the runs. Neutral Unicode Characters Unicode characters have a direction property. Most spacing characters have the property of either "Strong LTR" or "Strong RTL" direction. Non-spacing characters follow the direction of their base character. The bidirectional algorithm renders these characters according to their directionality. The majority of Unicode characters have a strong direction and therefore are rendered based on their property value. Other characters have weak or neutral property values. Weak characters are rendered according to their proximity to characters with "strong" directionality. Neutral characters are rendered by either their proximity to characters with "strong" directionality or the current embedding level's direction. If the initial state of text flow is altered, the position of neutral characters may be affected. tex
Received on Friday, 14 March 2003 15:30:39 UTC