Re: determining cell text direction

Hello Ri,

    Very interesting article. Thanks for sharing. 
    Several observations with your kind permission.

    1. Identification of proper direction of text. 

    If we manage to accurately identify the language to which the text 
belongs, we can easily determine the optimal text direction. The natural 
text direction for a given language is available from many services 
(including the iOS API and ICU). Please note that identifying the 
language of a text is not the same as identifying its code page, or even 
the language of each specific word. Several languages share, to a large 
degree, the same alphabet (e.g. Arabic / Farsi, Russian / Bulgarian).
    In general I believe we all agree that even first strong (aka 
contextual or auto) is just a heuristic. In other words, it does not 
accurately identify the language to which the text (sentence) belongs. 
    Much more accurate identification is provided by language 
identification services (e.g. Cortana from Microsoft, Siri from Apple, 
Google Language tools from Google, Watson from IBM). 
    I realize that at this moment those services might be too 
computationally expensive to be leveraged at the text rendering level. 
However, it may change in the future. 
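
    To make the "first strong" heuristic concrete, here is a minimal 
sketch in Python. It is a simplification (it ignores embeddings and 
isolates, which the full Unicode Bidirectional Algorithm takes into 
account), and the function name and sample strings are only 
illustrative. It shows how a sentence in an RTL language that happens to 
open with a Latin acronym gets the "wrong" direction:

```python
import unicodedata

# Bidi classes treated as "strong" by the Unicode Bidirectional Algorithm.
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}


def first_strong_direction(text, default="ltr"):
    """Return 'ltr' or 'rtl' based on the first strongly directional character.

    Simplified illustration only: explicit embeddings and isolates are not
    handled here.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_LTR:
            return "ltr"
        if bidi_class in STRONG_RTL:
            return "rtl"
    # No strong character at all (digits, punctuation, empty string).
    return default


# A purely Hebrew phrase.
hebrew_sentence = "\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd"
# A Hebrew phrase that opens with a Latin acronym.
starts_with_latin = "IBM \u05d4\u05d9\u05d0 \u05d7\u05d1\u05e8\u05d4"

print(first_strong_direction(hebrew_sentence))    # rtl
print(first_strong_direction(starts_with_latin))  # ltr, although the sentence is Hebrew
print(first_strong_direction("123, 456"))         # ltr (falls back to the default)
```

    The second result is the kind of case where knowing the language of 
the sentence as a whole would give a better answer than the heuristic.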
 
    2. Specification of text direction at authoring time

    I think that specification of text direction at authoring time via 
metadata should be improved in general. Very good support is already 
provided in various rich text editors (e.g. http://www.tinymce.com/ or 
http://ckeditor.com/), which assume a built-in capability to store 
metadata alongside the text data. 
    In contrast to rich text editors, simple plain text input fields 
(HTML input fields) offer no such mechanism. Various browsers (IE, FF) 
and native platforms (including iOS, Windows) allow the end user to 
change the text direction on the fly. 
    However, there is no way for consuming code to retrieve this 
information from the input field (let alone a way to store it with the 
text). 
    Ideally, I believe the JS string object should support 
getTextDirection and setTextDirection methods (alongside the methods it 
currently supports: http://www.w3schools.com/jsref/jsref_obj_string.asp). 
    At authoring time in an HTML input field, when the end user 
interactively changes the direction of the text, the setTextDirection 
method could be used to store this information.
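
    As a rough illustration of the idea (not an existing API), the 
sketch below keeps the direction as metadata next to the text rather than 
inside it. The DirectedText class, its method names and the JSON field 
names are all hypothetical; Python is used only to keep the example 
short:

```python
import json
from dataclasses import dataclass

VALID_DIRECTIONS = ("ltr", "rtl", "auto")


@dataclass
class DirectedText:
    """Hypothetical container: plain text plus its direction as metadata."""

    text: str
    direction: str = "auto"

    def set_text_direction(self, direction):
        # Called by the authoring side when the user changes the direction.
        if direction not in VALID_DIRECTIONS:
            raise ValueError(f"unsupported direction: {direction!r}")
        self.direction = direction

    def get_text_direction(self):
        # Called by the consuming side (rendering, storage, export).
        return self.direction

    def to_json(self):
        # The text itself is stored unmodified; direction travels beside it.
        return json.dumps({"text": self.text, "dir": self.direction})

    @classmethod
    def from_json(cls, payload):
        data = json.loads(payload)
        return cls(text=data["text"], direction=data["dir"])


field = DirectedText("\u05e9\u05dc\u05d5\u05dd")
field.set_text_direction("rtl")           # user toggled direction in the input field
restored = DirectedText.from_json(field.to_json())
print(restored.get_text_direction())      # rtl
print(restored.text == field.text)        # True: the string data is untouched
```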
 
    3. Usage of UCC as metadata for specification of text direction

    In general, I don't think that using Unicode control characters (UCC) 
is an optimal way to specify the direction of text. It may be appropriate 
only if it is done "on the glass", so to speak, for display purposes only. 

    However, if UCC are injected into the string itself, this has 
far-reaching consequences, because every consuming side (storage, string 
manipulation, search etc.) must be aware that the UCC were added on the 
authoring side as metadata, not as part of the string data. If one 
controls both the authoring and the consuming side, this is probably 
doable. Otherwise it is not.
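
    A small Python sketch of what goes wrong once UCC end up in the 
stored string: equality, length and prefix search all change, and the 
consuming side has to know it must strip them. The helper name 
strip_bidi_controls is hypothetical, and stripping is itself lossy if 
some of these characters were genuinely part of the data:

```python
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Directional formatting characters: LRM, RLM, ALM, LRE, RLE, PDF, LRO, RLO,
# LRI, RLI, FSI, PDI.
BIDI_CONTROLS = (
    "\u200e\u200f\u061c"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)
_STRIP_TABLE = {ord(c): None for c in BIDI_CONTROLS}


def strip_bidi_controls(text):
    """Remove directional formatting characters (a consumer-side workaround)."""
    return text.translate(_STRIP_TABLE)


original = "\u05e9\u05dc\u05d5\u05dd"   # what the user typed
stored = RLE + original + PDF           # what the authoring side saved

print(stored == original)                        # False
print(len(original), len(stored))                # 4 6
print(stored.startswith(original))               # False: prefix search breaks
print(strip_bidi_controls(stored) == original)   # True, but only if the consumer knows to strip
```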
 

    4. Proper display of a comma-separated token sequence in a CSV file

    This case belongs to the so-called "structured text" category (the 
same category includes file paths, URLs, breadcrumbs, comma-separated 
lists of tags etc.). 
    These are described in more detail in this design document: 
https://docs.google.com/document/d/1y9LhT7rbGGVHjh2uqTAYHzN5PfbAkPxO5sMJygOPc3I/edit?usp=sharing

    The suggestion is to address the proper display of such patterns by a 
higher-level protocol. More details are in the document above. 
    I also attach it below, just in case: 



PS. I apologize for extending this discussion in directions that go 
beyond the "cell" context. Nevertheless, I would highly appreciate your 
thoughts / feedback. 

Best Regards,

Tomer Mahlin
Bidi Champion, Globalization Manager, GCoC
Bidi Development Lab


Phone: +972-2-6491784 | Mobile: +972-54-3368122 E-mail: tomerm@il.ibm.com

IBM R&D Labs
Malcha Technology Park
Jerusalem 96951
Israel




From:   Richard Ishida <ishida@w3.org>
To:     www International <www-international@w3.org>
Date:   19/06/2015 16:32
Subject:        determining cell text direction



to help resolve this github issue https://github.com/w3c/csvw/issues/620 
i have begun making notes about handling bidi in plain text environments 
on the Web – particularly, so far, CSV data.

it's just a start.  Comments welcome, as i go.

ri

Received on Sunday, 21 June 2015 19:30:19 UTC