- From: Shawn Medero <smedero@uw.edu>
- Date: Fri, 21 Aug 2009 12:07:06 -0700
- To: Maciej Stachowiak <mjs@apple.com>
- Cc: "public-html@w3.org WG" <public-html@w3.org>, Matt May <mattmay@adobe.com>
On Thu, Aug 20, 2009 at 10:45 PM, Maciej Stachowiak <mjs@apple.com> wrote:
> Further, I believe the premise of the objection is false. The objection
> categorically says that state-of-the-art image analysis heuristics cannot
> recover useful information from an image, "not even close". There exist
> optical character recognition algorithms that could recover text from an
> image of text with high probability of success.

Wearing my "former employee of a Linguistics research lab" hat, I'm going to point out that OCR of a digital image containing Arabic/Bengali text is nowhere near ready for prime time. Read some of the system evaluations found in the academic literature via Citeseer or Google Scholar.

Just to be clear, I'm not even talking about handwritten Arabic... even OCR of digital images containing text written in popular Arabic fonts is not stable enough for commercial use. The products claiming to do it only work with one or two fonts and require very clean image sources. In the US, only defense contractors have access to complex systems that can perform reasonably well (in terms of accuracy and speed) on this task.

There's a lot of interesting research in this field... but, getting to the heart of Matt's point, it is not ready for a spec like HTML 5.

----

One question I have is what "image analysis heuristics" is really saying in this section:

"User agents may also apply image analysis heuristics to help the user make sense of the image when the user is unable to make direct use of the image, e.g. due to a visual disability or because they are using a text terminal with no graphics capabilities."

Was it really referring to OCR-type tasks or something else? I can't imagine it was referring to something like OCR or any type of machine transcription from an image source ... so I'd rather not waste a permathread on what similar technologies can and can't do.

It would also be helpful to know if that text was added based on behavior found in deployed implementations.

-s
Received on Friday, 21 August 2009 19:07:44 UTC