
RE: Arabic UNicode in XML

From: Chris Pratley <chrispr@microsoft.com>
Date: Thu, 21 Dec 2000 18:56:01 -0800
Message-ID: <76EE387EEA6B5F4CA1CFD5A4BA5229F2B69AA1@RED-MSG-13.redmond.corp.microsoft.com>
To: "'Chris Lilley'" <chris@w3.org>, Saeed Darya <saeed.darya@techno-soft.com>
Cc: Matthias Wassermann <m.wassermann@mai-kg.de>, www-international@w3.org
When I access it using IE 5.5, the page is marked as UTF-8 and the Arabic
text is real text, although, ironically, despite the UTF-8 encoding it uses
numeric character references (NCRs) for the Arabic:
&#1576;&#1607; &#1606;&#1592;&#1585; &#1605;&#1609; &#1585;&#1587;&#1583;
&#1587;&#1575;&#1586;&#1605;&#1575;&#1606;
&#1602;&#1590;&#1575;&#1610;&#1609; &#1575;&#1587;&#1604;&#1575;&#1605;
&#1602;&#1589;&#1583; &#1583;&#1575;&#1585;&#1583; &#1576;&#1575;
&#1575;&#1610;&#1606; &#1578;&#1605;&#1607;&#1610;&#1583; - &#1576;&#1607;
&#1593;&#1604;&#1578; &#1593;&#1583;&#1605; &#1581;&#1590;&#1608;&#1585;
&#1588;&#1575;&#1705;&#1609; &#1582;&#1589;&#1608;&#1589;&#1609; -
&#1605;&#1578;&#1607;&#1605;&#1575;&#1606; &#1576;&#1607;

Perhaps the server serves up images if it does not detect IE5.x as the
browser?
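
For reference, here is a minimal Python sketch of what those NCRs stand for.
It is only an illustration: the variable names are mine, and the sample is
just the first two words of the text quoted above. It decodes the references
to real characters and shows the same text as UTF-8 bytes.

```python
import html
import unicodedata

# First two words of the NCR-encoded text quoted above.
ncr_sample = "&#1576;&#1607; &#1606;&#1592;&#1585;"

# html.unescape turns numeric character references into the characters
# they name.
decoded = html.unescape(ncr_sample)

# List the code point and Unicode name of each non-space character.
for ch in decoded:
    if not ch.isspace():
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The same text written directly in UTF-8 rather than as ASCII-only NCRs.
utf8_bytes = decoded.encode("utf-8")
print(len(ncr_sample), "characters as NCRs;", len(utf8_bytes), "bytes as UTF-8")
```

Either form gives the browser the same characters. The NCR form is simply
longer, and it survives even when the declared charset is wrong.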

Chris


-----Original Message-----
From: Chris Lilley [mailto:chris@w3.org]
Sent: Tuesday, December 19, 2000 11:08 PM
To: Saeed Darya
Cc: Matthias Wassermann; www-international@w3.org
Subject: Re: Arabic UNicode in XML


Saeed Darya wrote:
>
> We developed a site that uses unicode for Arabic/Persian in XML format
> (http://www.pardaily.com). The only problem we had was with the supporting
> platforms. For all practical reasons, IE5+ is the only platform that we
> could find displaying the R2L portion properly.  Our client uses Word2000
> as an editor for Arabic/Persian news files.

I tried that site, but all of the Arabic text seemed to be images. Only the
Latin text was actually text. For example:


<IMG SRC=f03c110.gif BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle><IMG
SRC=f03c129.gif BORDER=0 WIDTH=8 HEIGHT=20 ALIGN=middle>  <IMG
SRC=f03c66.gif BORDER=0 WIDTH=3 HEIGHT=20 ALIGN=middle><IMG SRC=f03c110.gif
BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle><IMG SRC=f03c192.gif BORDER=0
WIDTH=6 HEIGHT=20 ALIGN=middle><IMG SRC=f03c76.gif BORDER=0 WIDTH=6
HEIGHT=20 ALIGN=middle>  <IMG SRC=f03c189.gif BORDER=0 WIDTH=6 HEIGHT=20
ALIGN=middle><IMG SRC=f03c150.gif BORDER=0 WIDTH=8 HEIGHT=20
ALIGN=middle><IMG SRC=f03c192.gif BORDER=0 WIDTH=6 HEIGHT=20
ALIGN=middle><IMG SRC=f03c129.gif BORDER=0 WIDTH=8 HEIGHT=20 ALIGN=middle>
<IMG SRC=f03c110.gif BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle><IMG
SRC=f03c188.gif BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle><IMG SRC=f03c81.gif
BORDER=0 WIDTH=6 HEIGHT=20 ALIGN=middle><IMG SRC=f03c89.gif BORDER=0
WIDTH=8 HEIGHT=20 ALIGN=middle><IMG SRC=f03c177.gif BORDER=0 WIDTH=6
HEIGHT=20 ALIGN=middle> 

and so forth. This is not XML either: the attributes are unquoted, so it's
not well-formed.
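
A quick way to check the well-formedness point is to hand one of the tags to
an XML parser. The sketch below uses Python's standard library. The helper
name is_well_formed and the quoted, self-closed variant are my own
illustration and do not come from the site.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a well-formed XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# One tag as served by the site: unquoted attribute values, no closing slash.
unquoted = "<IMG SRC=f03c110.gif BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle>"

# Hypothetical XML-style equivalent: quoted values, self-closed element.
quoted = '<IMG SRC="f03c110.gif" BORDER="0" WIDTH="7" HEIGHT="20" ALIGN="middle"/>'

print(is_well_formed(unquoted))
print(is_well_formed(quoted))
```

The first fragment is rejected by the parser and the second is accepted.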

Also, it didn't seem to be using Unicode, but ISO Latin-1:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

Am I missing something here?
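
To illustrate why the charset matters: ISO 8859-1 has no Arabic letters, so
in a page declared that way Arabic text can only appear indirectly, for
example as NCRs or as images. Here is a small Python sketch. The two sample
letters are my own choice, not taken from the site.

```python
# Two Arabic letters (BEH, HEH), chosen only for illustration.
arabic = "\u0628\u0647"

# Encoding them directly as ISO 8859-1 fails: the charset has no such letters.
try:
    arabic.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# With the xmlcharrefreplace handler, each unencodable character is written
# as a decimal numeric character reference, which is plain ASCII.
as_ncrs = arabic.encode("iso-8859-1", errors="xmlcharrefreplace")

print(latin1_ok)
print(as_ncrs)
```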

--
Chris
Received on Thursday, 21 December 2000 22:33:28 GMT
