RE: Arabic UNicode in XML

When I access it using IE 5.5, the page is marked as UTF-8 and the Arabic
text is real text, although ironically, despite the UTF-8 encoding, it uses
NCRs (numeric character references) for the Arabic:
به نظر مى رسد
سازمان
قضايى اسلام
قصد دارد با
اين تمهيد - به
علت عدم حضور
شاکى خصوصى -
متهمان به
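
To illustrate what "NCRs in a UTF-8 page" means, here is a minimal Python
sketch. It assumes the first word above was sent as the decimal references
for U+0628 and U+0647; I have not copied that from the page source, so the
exact form (decimal vs. hex) is a guess.

```python
import html

# Assumed NCR form of the first word above: beh (U+0628) + heh (U+0647).
ncr_text = "&#1576;&#1607;"

# A browser resolves the references to the real characters.
decoded = html.unescape(ncr_text)

# The NCR form is pure ASCII (14 bytes); direct UTF-8 needs only 4 bytes.
ncr_bytes = ncr_text.encode("ascii")
utf8_bytes = decoded.encode("utf-8")
print(decoded, len(ncr_bytes), len(utf8_bytes))
```

Either form displays the same text; the NCR form is simply larger and does
not depend on the declared encoding.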

Perhaps the server serves up images if it does not detect IE5.x as the
browser?
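
One way to test that guess is to request the page with two different
User-Agent headers and compare the results. This is only a sketch: the
User-Agent strings are my assumptions, and I do not know how the server
actually decides what to send.

```python
import urllib.request

# Assumed User-Agent strings, for illustration only.
IE55_UA = "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
OTHER_UA = "ExampleBrowser/1.0"  # hypothetical non-IE browser


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a GET request with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_body(url: str, user_agent: str) -> bytes:
    """Send the request and return the raw response body."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


# Usage (requires network access):
# ie_body = fetch_body("http://www.pardaily.com", IE55_UA)
# other_body = fetch_body("http://www.pardaily.com", OTHER_UA)
# print(b"<IMG" in ie_body, b"<IMG" in other_body)
```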

Chris


-----Original Message-----
From: Chris Lilley [mailto:chris@w3.org]
Sent: Tuesday, December 19, 2000 11:08 PM
To: Saeed Darya
Cc: Matthias Wassermann; www-international@w3.org
Subject: Re: Arabic UNicode in XML


Saeed Darya wrote:
>
> We developed a site that uses unicode for Arabic/Persian in XML format
> (http://www.pardaily.com). The only problem we had was with the supporting
> platforms. For all practical reasons, IE5+ is the only platform that we
> could find displaying the R2L portion properly.  Our client uses Word2000
> as an editor for Arabic/Persian news files.

I tried that site, but all of the Arabic text seemed to be images. Only the
Latin text was actually text. For example:


<IMG SRC=f03c110.gif BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle><IMG
SRC=f03c129.gif BORDER=0 WIDTH=8 HEIGHT=20 ALIGN=middle>  <IMG
SRC=f03c66.gif BORDER=0 WIDTH=3 HEIGHT=20 ALIGN=middle><IMG SRC=f03c110.gif
BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle><IMG SRC=f03c192.gif BORDER=0
WIDTH=6 HEIGHT=20 ALIGN=middle><IMG SRC=f03c76.gif BORDER=0 WIDTH=6
HEIGHT=20 ALIGN=middle>  <IMG SRC=f03c189.gif BORDER=0 WIDTH=6 HEIGHT=20
ALIGN=middle><IMG SRC=f03c150.gif BORDER=0 WIDTH=8 HEIGHT=20
ALIGN=middle><IMG SRC=f03c192.gif BORDER=0 WIDTH=6 HEIGHT=20
ALIGN=middle><IMG SRC=f03c129.gif BORDER=0 WIDTH=8 HEIGHT=20 ALIGN=middle>
<IMG SRC=f03c110.gif BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle><IMG
SRC=f03c188.gif BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle><IMG SRC=f03c81.gif
BORDER=0 WIDTH=6 HEIGHT=20 ALIGN=middle><IMG SRC=f03c89.gif BORDER=0
WIDTH=8 HEIGHT=20 ALIGN=middle><IMG SRC=f03c177.gif BORDER=0 WIDTH=6
HEIGHT=20 ALIGN=middle> 

and so forth. This is not XML either: the attributes are unquoted, so it's
not well formed.
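
The well-formedness point can be checked with any XML parser. Here is a
minimal Python sketch using the standard library; the helper name
is_well_formed is mine, and the quoted variant is my own rewrite of the
first tag, not something taken from the site.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as a well-formed XML document."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# As served: unquoted attribute values and no closing of the element.
unquoted = "<IMG SRC=f03c110.gif BORDER=0 WIDTH=7 HEIGHT=20 ALIGN=middle>"

# Hypothetical XML-conformant version: quoted values, self-closed element.
quoted = '<IMG SRC="f03c110.gif" BORDER="0" WIDTH="7" HEIGHT="20" ALIGN="middle"/>'

print(is_well_formed(unquoted), is_well_formed(quoted))
```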

Also, it didn't seem to be using Unicode, but ISO Latin-1:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
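
For what it's worth, ISO 8859-1 has no Arabic letters at all, so a page
served with that charset can only carry Arabic as NCRs or as images. A
small Python sketch of that constraint follows; the helper name can_encode
and the sample strings are my own.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in the charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


arabic = "\u0628\u0647"          # beh + heh as real characters
arabic_ncr = "&#1576;&#1607;"    # the same letters as ASCII-only NCRs

print(can_encode(arabic, "iso-8859-1"))      # direct characters
print(can_encode(arabic_ncr, "iso-8859-1"))  # NCR form
print(can_encode(arabic, "utf-8"))           # direct characters in UTF-8
```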

Am I missing something here?

--
Chris

Received on Thursday, 21 December 2000 22:33:28 UTC