- From: Jony Rosenne <rosennej@qsm.co.il>
- Date: Wed, 24 Aug 2005 15:02:11 +0200
- To: <www-international@w3.org>
Where the text is long enough, a separate document linked to from the main document is in order.

For Hebrew, the situation is a little simpler: in the general case it is not possible to convert visual to logical automatically.

Jony

> -----Original Message-----
> From: Tex Texin [mailto:tex@xencraft.com]
> Sent: Wednesday, August 24, 2005 1:58 PM
> To: Frank Yung-Fong Tang
> Cc: Jony Rosenne; www-international@w3.org
> Subject: Re: New article for REVIEW: Upgrading from
> language-specific legacy encoding to Unicode encoding
>
> I was going to make more or less the same comment, which is that
> conversion from legacy encodings to Unicode is a difficult but
> necessary subject. It is large so should be a separate FAQ or FAQs,
> and should cover many encodings, not just bidi.
>
> Any minute now, Richard is going to pipe up suggesting Jony submit a
> FAQ for Hebrew and Frank one for double-byte encoding conversions, so
> I'll preempt him and suggest that as well. ;-)
>
> Although we could use a treatise on these issues, I wonder if it
> would be better to identify libraries or tools that do the job right
> and give users appropriate choices. I muck around with iconv, ICU,
> Perl, etc. and it is very hard to know which tools will do the entire
> job correctly, and which do the minimum, or are several versions
> behind.
>
> For example, a converter written for Unicode 2.0 would not take
> advantage of the characters in Unicode 4.x. It is correct in some
> sense and incorrect in other ways. Also, a pure encoding converter
> would not take into account the needs of the Web, and perhaps issues
> of conversion to the bidi markup.
>
> And which tools offer a choice when it comes to converting backslash
> to yen, won, etc. when used as currency?
>
> Many users are confused by which conversions to use, e.g. when to use
> Windows-1252 instead of ISO 8859-1, or when to use Big5-HKSCS instead
> of Big5, since often data is mislabeled.
>
> I think the tools view or roadmap may be more important than the
> character encoding details.
>
> But yes, it is a topic definitely needing expansion.
> --
> -------------------------------------------------------------
> Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
> Xen Master                          http://www.i18nGuy.com
>
> XenCraft                            http://www.XenCraft.com
> Making e-Business Work Around the World
> -------------------------------------------------------------
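
The Windows-1252 versus ISO 8859-1 confusion raised in the quoted message can be illustrated with a short Python sketch. It is not taken from the thread. The sample bytes and the helper `looks_like_cp1252` are hypothetical, and the heuristic is an assumption for illustration rather than a standard detection method. It relies only on the fact that bytes 0x80-0x9F are C1 control codes in ISO 8859-1 but mostly printable characters in Windows-1252.

```python
# Sample bytes as they might arrive from a page labeled ISO 8859-1
# (hypothetical data): curly quotes and a euro sign in Windows-1252.
data = b"\x93quoted\x94 \x80 5"

# Decoding with the declared label succeeds but yields invisible
# C1 control characters instead of punctuation.
as_latin1 = data.decode("iso-8859-1")

# Decoding as Windows-1252 yields the characters the author intended.
as_cp1252 = data.decode("windows-1252")


def looks_like_cp1252(raw: bytes) -> bool:
    """Heuristic (an assumption, not a standard): any byte in the
    0x80-0x9F range suggests the data is really Windows-1252."""
    return any(0x80 <= b <= 0x9F for b in raw)


if __name__ == "__main__":
    print(repr(as_latin1))
    print(repr(as_cp1252))
    print(looks_like_cp1252(data))
```

Both decodes succeed without error, which is why mislabeled data of this kind tends to go unnoticed until the text is displayed. A real converter would need more than this one check, for example handling the few byte values that Windows-1252 leaves undefined.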
Received on Wednesday, 24 August 2005 12:03:28 UTC