Re: Arabic or Hebrew languages (Right to Left Languages) and SKOS, XML,RDF,etc.

From: diego ferreyra <tematres@r020.com.ar>
Date: Thu, 26 May 2011 12:45:24 -0300
Message-ID: <BANLkTinGHJxWp4v6et4xutF5JQZRxvf0RA@mail.gmail.com>
To: Christophe Dupriez <christophe.dupriez@destin.be>
Cc: public-esw-thes@w3.org, ishida@w3.org
Hi, in TemaTres [1], for a multilingual vocabulary with a federated
management workflow, it was necessary to add the label indication for
the special case of the Hebrew version [2], but not in the translated
views of other languages [3].

best regards


diego ferreyra

[1]: http://www.vocabularyserver.com
[2]: http://www.vocabularyserver.com/lre/iw/index.php?tema=683
[3]: http://vocabularyserver.com/lre/en/index.php?tema=683


2011/5/26 Christophe Dupriez <christophe.dupriez@destin.be>:
> Hi again to all of you: thank you for the hints!
>
> What exactly happens:
> 1) The xml:lang attribute declares the user language targeted by a given
> literal (in XML, RDF or SKOS).
> 2) Unicode characters carry, by definition, their "script" and the
> direction in which they must be written.
> 3) Latin words (written left to right) can appear in Arabic texts (or
> Chinese, Hebrew or Thai texts...) and vice versa.
> 4) The practical issue I have: browsers (and text editors such as
> Notepad) do not choose the right direction unless they are told to change
> direction.
>
> I consider (4) a browser "bug": sooner or later, browsers will adapt the
> default direction and default alignment (left or right) by themselves,
> depending on the Unicode characters encountered in the text written inside a
> block.
>
> The short-term solution ("browser adaptation") may be to check all
> characters (the first characters may have only "weak" directionality, and
> Arabic words can be hidden in a Latin text) to see whether any of them is
> Arabic or Hebrew, then to add Unicode markup signalling RTL text within
> the literal.
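
The character sniffing described above could be sketched as follows. This is only an illustration, not anyone's actual implementation: the function name contains_rtl is hypothetical, and it assumes that "Arabic or Hebrew" can be approximated by the Unicode bidirectional classes "R" and "AL" as reported by Python's standard unicodedata module.

```python
import unicodedata

# Bidi classes of strongly right-to-left characters:
# "R" covers e.g. Hebrew letters, "AL" covers Arabic letters.
RTL_CLASSES = {"R", "AL"}


def contains_rtl(text: str) -> bool:
    """Return True if any character of text has a strong RTL bidi class.

    Every character is inspected, because leading characters (digits,
    punctuation, spaces) may be weak or neutral, and RTL words may be
    embedded in otherwise Latin text.
    """
    return any(unicodedata.bidirectional(ch) in RTL_CLASSES for ch in text)
```

For example, contains_rtl("123 (SKOS) \u0645\u0631\u062d\u0628\u0627") is true even though the literal starts with digits and Latin letters, while a purely Latin label yields false.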
>
> Left or right alignment? I am wondering if this should not be decided based
> on the target user language rather than on the characters' script.
>
> Do you agree with this approach (pure data, character sniffing before output
> to add RTL where necessary for current browsers, left/right alignment based
> on xml:lang) ?
>
> Have a nice day!
>
> Christophe
>
>
>
> Le 26/05/2011 16:05, Thad Guidry a écrit :
>
> Oops, forgot to include the good tutorial that I have used in the
> past: http://www.w3.org/International/tutorials/bidi-xhtml/
>
> On Thu, May 26, 2011 at 9:01 AM, Thad Guidry <thadguidry@gmail.com> wrote:
>>
>> Christophe,
>> I personally do not think SKOS or any other structured format should
>> concern itself with display and presentation, especially adding control
>> chars within the data itself [1].   Display and presentation of data should
>> be left to the browser application itself, and the markup handling.
>> 1. http://www.w3.org/TR/i18n-html-tech-bidi/
>>
>> On Thu, May 26, 2011 at 4:38 AM, Christophe Dupriez
>> <christophe.dupriez@destin.be> wrote:
>>>
>>> Hi!
>>>
>>> I would like to know whether any best practices have been established to
>>> support RTL (right-to-left) languages in XML, RDF or SKOS.
>>>
>>> The problem: when displaying Arabic or Hebrew, the browsers must be told
>>> to write from right to left and (ideally) the text is better displayed
>>> aligned on the right rather than the left.
>>>
>>> One may wish that applications not be obliged to make explicit tests like
>>> "if language is Arabic or Hebrew then RTL+align:right else then
>>> LTR+align:left".
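
For reference, the kind of explicit test mentioned above might look like the sketch below. The function name layout_for and the set of language codes are assumptions made for illustration; the set is not exhaustive, and it includes "iw", the deprecated code for Hebrew that still appears in some URLs.

```python
from typing import Tuple

# Primary language subtags assumed here to be written right to left.
# Illustrative only, not a complete list.
RTL_LANGUAGES = {"ar", "he", "iw", "fa", "ur"}


def layout_for(lang_tag: str) -> Tuple[str, str]:
    """Map an xml:lang value to a (direction, alignment) pair."""
    # Keep only the primary subtag, e.g. "he" from "he-IL" or "he_IL".
    primary = lang_tag.replace("_", "-").split("-")[0].lower()
    if primary in RTL_LANGUAGES:
        return ("rtl", "right")
    return ("ltr", "left")
```

A call such as layout_for("he-IL") gives ("rtl", "right"), and any tag whose primary subtag is not in the set falls back to ("ltr", "left").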
>>>
>>> What has been done about this? What does the community think should be
>>> done?
>>>
>>> I made a test by hand to prepare addition of Arabic to JITA:
>>> http://www.askosi.org/JITA-ar.htm
>>>
>>> Other languages of the JITA thesaurus, as used to access E-LIS (click on
>>> concepts in schemas):
>>> http://www.askosi.org/jita
>>>
>>> For now, my "feeling" is to add Unicode character U+202B (Right-to-Left
>>> Embedding) before Arabic and Hebrew labels and U+202C at the end (i.e.
>>> within the data). U+202C is Pop Directional Formatting: it returns to the
>>> direction (LTR or RTL) in use when U+202B (switch to RTL) was encountered.
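
A minimal sketch of that wrapping, with an inverse so that the stored label can be recovered unchanged. The helper names wrap_rtl and unwrap_rtl are hypothetical, and whether such control characters belong in the data at all is exactly the question under discussion.

```python
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def wrap_rtl(label: str) -> str:
    """Surround a label with RLE ... PDF so it is laid out right to left."""
    return RLE + label + PDF


def unwrap_rtl(label: str) -> str:
    """Remove one enclosing RLE ... PDF pair, if present."""
    if len(label) >= 2 and label.startswith(RLE) and label.endswith(PDF):
        return label[1:-1]
    return label
```

Wrapping adds exactly two code points to the literal, which matters for anything that compares, sorts or measures labels; unwrap_rtl(wrap_rtl(s)) returns s.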
>>>
>>> But what do others do?
>>>
>>> I will be happy to learn your thoughts on this topic!
>>>
>>> Christophe
>>>
>>
>>
>>
>> --
>> -Thad
>> http://www.freebase.com/view/en/thad_guidry
>
>
>
> --
> -Thad
> http://www.freebase.com/view/en/thad_guidry
>
>
Received on Thursday, 26 May 2011 15:45:54 GMT
