- From: Tomer Mahlin <TOMERM@il.ibm.com>
- Date: Sun, 21 Jun 2015 22:29:30 +0300
- To: Richard Ishida <ishida@w3.org>
- Cc: www International <www-international@w3.org>
- Message-ID: <OFC5E0EEDD.35611948-ONC2257E6B.0065BEE2-C2257E6B.006B12F9@il.ibm.com>
Hello Ri,

Very interesting article. Thanks for sharing. Several observations, with your kind permission.

1. Identification of proper direction of text

If we manage to accurately identify the language of a text, we can easily determine its optimal text direction. The natural text direction for a given language is available from many services (including the iOS API and ICU). Please observe that identifying the language of a text is not the same as identifying its code page, or even the language of each specific word: several languages share, to a large degree, the same alphabet (e.g. Arabic / Farsi, Russian / Bulgarian).

In general, I believe we all agree that even first strong (aka contextual or auto) is just a heuristic. In other words, it does not accurately identify the language to which the text (sentence) belongs. Much more accurate identification is provided by language identification services (e.g. Cortana from Microsoft, Siri from Apple, Google Language tools from Google, Watson from IBM). I realize that at this moment those services might be too computationally expensive to be leveraged at the text rendering level. However, this may change in the future.

2. Specification of text direction at authoring time

I think that specification of text direction at authoring time via metadata should be improved in general. Very good support is already provided in various rich text editors (e.g. http://www.tinymce.com/ or http://ckeditor.com/), which assume a built-in capability to store metadata alongside the text data. However, as opposed to rich text editors, simple plain text input fields (HTML input fields) don't allow any such mechanism. Various browsers (IE, FF) and native platforms (including iOS, Windows) allow changing text direction on the fly. However, there is no way for consuming code to retrieve this information from the input field (even less a way to store it with the text).
I believe that, ideally, the JS string object should support getTextDirection and setTextDirection methods (just like the other methods it currently supports: http://www.w3schools.com/jsref/jsref_obj_string.asp). At authoring time in an HTML input field, when the end user interactively changes the direction of the text, the setTextDirection method could be used to store this information.

3. Usage of UCC as metadata for specification of text direction

In general, I don't think that using Unicode Control Characters (UCC) is an optimal way to specify the direction of text. It may be appropriate only if it is done "on the glass", so to speak, for display purposes only. However, if UCC are injected into the string itself, this will have far-reaching consequences, simply because all consuming sides (storage, string manipulation, search etc.) must be aware that on the authoring side UCC were added as metadata (not as part of the string data). If one controls both the authoring and the consuming side, it is probably doable. Otherwise it is not.

4. Proper display of a token sequence separated by commas in a CSV file

This case belongs to the so-called "structured text" category (the same category includes file paths, URLs, breadcrumbs, comma-separated lists of tags etc.). These patterns are described in more detail in this design document:
https://docs.google.com/document/d/1y9LhT7rbGGVHjh2uqTAYHzN5PfbAkPxO5sMJygOPc3I/edit?usp=sharing
The document suggests addressing proper display of such patterns with a higher-level protocol. I also attach it to this message, just in case.

PS. I apologize for extending this discussion in directions which go beyond the "cell" context. Nevertheless, I would highly appreciate your thoughts / feedback.
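
To make points 1 and 3 a bit more concrete, here is a minimal Python sketch. It is an illustration only, not a reference implementation: the function names first_strong_direction and wrap_with_ucc are hypothetical, and the sample strings are made up for the example. It shows that first strong looks only at the first strongly directional character, and that injecting UCC (here RLE ... PDF) into the string changes the data that every consumer sees.

```python
import unicodedata

# Unicode bidi classes of strongly directional characters.
STRONG_LTR = {"L"}
STRONG_RTL = {"R", "AL"}

LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def first_strong_direction(text, default="ltr"):
    """Guess the base direction from the first strong character (a heuristic)."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_LTR:
            return "ltr"
        if bidi_class in STRONG_RTL:
            return "rtl"
    return default  # no strong character found (e.g. digits only)


def wrap_with_ucc(text, direction):
    """Hypothetical helper: store the direction in-band by injecting UCC."""
    opener = RLE if direction == "rtl" else LRE
    return opener + text + PDF


hebrew = "\u05E9\u05DC\u05D5\u05DD"   # a Hebrew word
mixed = "CSV " + hebrew               # Hebrew text that starts with a Latin acronym

print(first_strong_direction(hebrew))  # rtl
print(first_strong_direction(mixed))   # ltr, although the text may be meant as RTL

stored = wrap_with_ucc(hebrew, "rtl")
print(stored == hebrew)                # False: comparison now fails
print(len(stored) - len(hebrew))       # 2: length changed by the injected UCC
print(stored.startswith(hebrew))       # False: prefix search is affected too
```

Keeping the direction out-of-band, for example as a separate field stored next to the text, avoids these side effects, which is the motivation for the getTextDirection / setTextDirection idea in point 2.
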
Best Regards,

Tomer Mahlin
Bidi Champion, Globalization Manager, GCoC Bidi Development Lab
Phone: +972-2-6491784 | Mobile: +972-54-3368122
E-mail: tomerm@il.ibm.com
IBM R&D Labs
Malcha Technology Park
Jerusalem 96951
Israel

From: Richard Ishida <ishida@w3.org>
To: www International <www-international@w3.org>
Date: 19/06/2015 16:32
Subject: determining cell text direction

to help resolve this github issue https://github.com/w3c/csvw/issues/620 i have begun making notes about handling bidi in plain text environments on the Web – particularly, so far, CSV data. it's just a start. Comments welcome, as i go.

ri
Attachments
- image/gif attachment: 01-part
- image/gif attachment: 02-part
- application/octet-stream attachment: Properdisplayofbidirectionalstructuredtext.pdf
Received on Sunday, 21 June 2015 19:30:19 UTC