Re: Updated Working Draft "Best Practices for XML Internationalization" from CE Whitehead on 2007-07-04 (www-international@w3.org from July to September 2007)

From: CE Whitehead <cewcathar@hotmail.com>
Date: Tue, 03 Jul 2007 20:11:55 -0400
To: www-international@w3.org
Message-ID: <BAY114-F3748B66C6187D87B10C717B3030@phx.gbl>

CE Whitehead <cewcathar@hotmail.com>
Date: Mon, 02 Jul 2007 11:36:31 -0400 wrote:

>BP2

>(http://www.w3.org/TR/2007/WD-xml-i18n-bp-20070628/#DevDir)
>HOW TO DO THIS

>the issue for me would be for images:
>when they are 'transcribed' to text by image-to-text software,
>if the images contain right to left text,
>what will happen?  The characters are backwards in the image, everything
>(so what will the image-to-text transcriber do with Hebrew and Arabic
>characters in texts?)

>This is an issue I still want touched on.

I was a bit lost when I wrote the above.

What exactly does image-to-text software do?

Does the software wait for a human to tell it which language it is reading?
It cannot, since languages may be mixed together in documents
(for example, form translations of background checks that one is sending 
from one country to another:

these always have the original text and the translation next to it;
the two languages are thus mixed together for the length of the form!!  At 
least that was so in Kuwait, in the case of the Indian translator--who 
worked for a Kuwaiti of course--who translated my Kuwaiti background check 
from Arabic to English!).

So all text must be read in from an image in exactly the order in which it 
occurs in the image.
On the other hand, I assume that the software has to recognize the various 
characters in order to read them in!

(but the software sure does not take mirror images of characters!  My 
mistake!)
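
To make the reading-order question concrete, here is a rough sketch in Python of what I imagine could happen after the characters have been recognized. It assumes (my assumption, not something I know about any real image-to-text product) that the software hands back the characters of a line left to right as they sit on the page, and that the line's base direction is left-to-right. The function name visual_to_logical is made up for illustration, and this is a simplification, not the full Unicode Bidirectional Algorithm: it only reverses runs of Hebrew-type and Arabic-type letters, together with any spaces enclosed between them.

```python
import unicodedata

# Bidi classes for strong right-to-left letters (Hebrew-type and Arabic-type).
RTL_CLASSES = {"R", "AL"}


def is_rtl(ch):
    """True if the character is a strong right-to-left letter."""
    return unicodedata.bidirectional(ch) in RTL_CLASSES


def visual_to_logical(visual_line):
    """Turn page order (left to right) into reading order.

    Hypothetical helper: reverses each run of RTL letters, treating
    whitespace enclosed between RTL letters as part of the run.
    Everything else is kept in place.
    """
    out, run, pending = [], [], []
    for ch in visual_line:
        if is_rtl(ch):
            run.extend(pending)  # spaces between RTL letters join the run
            pending = []
            run.append(ch)
        elif run and unicodedata.bidirectional(ch) == "WS":
            pending.append(ch)   # may or may not be enclosed; decide later
        else:
            out.extend(reversed(run))
            out.extend(pending)
            run, pending = [], []
            out.append(ch)
    out.extend(reversed(run))
    out.extend(pending)
    return "".join(out)


# Hebrew "shalom" in reading order: shin, lamed, vav, final mem.
shalom = "\u05e9\u05dc\u05d5\u05dd"
# The same word as it appears on the page, scanned left to right.
on_page = "Name: " + shalom[::-1]
stored = visual_to_logical(on_page)
```

Here the letters are recognized as ordinary (not mirrored) characters; only their order is adjusted, so that `stored` holds the Hebrew word in the order a reader would read it.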

--C. E. Whitehead
cewcathar@hotmail.com


Received on Wednesday, 4 July 2007 00:12:04 UTC