Re: Arabic XML question

From: Matitiahu Allouche <matial@il.ibm.com>
Date: Sun, 22 Oct 2006 10:29:05 +0200
To: "Sandra Bostian" <sbos@loc.gov>
Cc: www-international@w3.org, www-international-request@w3.org
Message-ID: <OFF723718D.C983BD28-ONC225720F.002D4F78-C225720F.002E5EA9@il.ibm.com>

XML processors are not supposed to handle Arabic text (in content or in 
names) differently from LTR text, so the stored data should still have the 
start tag first and the end tag last (upper-case letters represent Arabic):

<ARABICNAME>ARABIC CONTENT</ARABICNAME>

If you look at such a data stream with an XML viewer or editor, results 
may vary depending on whether that viewer/editor has special handling for 
RTL text.  Since an XML file with Arabic text is likely to contain a 
mixture of LTR and RTL text (both in content and in names), the display 
will often be difficult to interpret visually and to edit, but this does 
not mean that the XML will not be processed correctly by whatever 
application it is meant for.
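
To make the point concrete, here is a minimal sketch in Python using the 
standard xml.etree.ElementTree parser. The Arabic element name and content 
are sample values of my own choosing (not taken from any real data), and 
they are written as \u escapes so that the stored order cannot be confused 
by the way your mail reader displays RTL text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample values, written as escapes so the logical (stored)
# order is unambiguous regardless of how a viewer renders RTL text.
NAME = "\u0627\u0633\u0645"                  # an Arabic element name
CONTENT = "\u0645\u062d\u062a\u0648\u0649"   # some Arabic content

# Stored order is start tag, content, end tag, exactly as for LTR text.
doc = "<" + NAME + ">" + CONTENT + "</" + NAME + ">"
root = ET.fromstring(doc)
parsed_ok = (root.tag == NAME and root.text == CONTENT)

# The "reversed" order (end tag stored first) is not well-formed XML,
# so the parser is expected to reject it.
reversed_doc = "</" + NAME + ">" + CONTENT + "<" + NAME + ">"
try:
    ET.fromstring(reversed_doc)
    reversed_rejected = False
except ET.ParseError:
    reversed_rejected = True
```

Only the first document parses. How either string looks on screen depends 
on the bidi support of whatever tool displays it, not on the parser.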

Shalom (Regards),  Mati
           Bidi Architect
           Globalization Center Of Competency - Bidirectional Scripts
           IBM Israel
           Phone: +972 2 5888802    Fax: +972 2 5870333
           Mobile: +972 52 2554160




"Sandra Bostian" <sbos@loc.gov> 
Sent by: www-international-request@w3.org
20/10/2006 21:41

To: <www-international@w3.org>
Subject: Arabic XML question

I'm working on some training materials and I have a question about Arabic 
usage in XML elements and the order of tags in a bidi environment. 
Normally, in an LTR environment you would get this:

<name>content</name>

I'm assuming the order of start and end tags stays the same in a bidi 
environment, even with Arabic content and Arabic element names, because 
the processor expects a particular syntax. But I couldn't find anything 
confirming or disputing this. Can anyone confirm, or point me to something 
that says, that it should not be:

</eman>tnetnoc<eman> or <eman/>tnetnoc<eman>

and that it should be:

<eman>tnetnoc</eman>

Thanks,
Sandy



Sandy Bostian
Digital Conversion Specialist
Library of Congress
Meeting of Frontiers: http://frontiers.loc.gov
202-707-2342
sbos@loc.gov
Received on Sunday, 22 October 2006 08:26:34 GMT
