- From: Christophe Strobbe <christophe.strobbe@esat.kuleuven.be>
- Date: Mon, 12 Mar 2007 16:13:45 +0100
- To: w3c-wai-ig@w3.org
Hi,

A clarification on my previous mail, based on a response that was sent off-list:

At 12:13 12/03/2007, Christophe Strobbe wrote:
>Hi David,
>
>At 20:01 10/03/2007, David Woolley wrote:
>>(...)
>> > Option - TIFF Format". The PDF contains the text of the article in
>> > the form of scanned images. There are no plain text or HTML-versions
>>
>>I believe the proper Adobe tools can produce an OCRed underlay for the
>>scans. Can you confirm that none has been included? (Note that
>>modern PDFs can be flagged as allowing access to the text for
>>accessibility, but not for cut and paste.) Actually, most
>>vaguely recently published journals are available as proper PDFs, so,
>>if they are using scans, rather than PDF rendered to a bitmap, they
>>may have very nobbled access to the originals.
>
>I checked the document properties, which tell me that the "PDF producer"
>is not Adobe Acrobat but iText 1.3 (a free PDF library in Java;
>see <http://www.lowagie.com/iText/>).
>The security tab in document properties says that printing, changing the
>document, content copying or extraction, and content extraction for
>accessibility are allowed.
>I ran two such PDF files through the accessibility checker in Adobe Acrobat
>Professional 7.0. For each page, it says: "1 image(s) with no alternate text".
>The accessibility report also says that the document is not tagged and that
>there are 7 text blocks with no language specified.

The 7 text blocks with no language specified are on a cover page that does
contain text (namely title, author, journal, etc., and the conditions of use).
After taking out this cover page, the accessibility checker no longer complains
about text blocks with no language specified (i.e. no text is left).
After running OCR, the original images still don't have alt text.
After running "Add tags to document" and setting the language in the document
properties, the accessibility checker no longer reports problems.
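For anyone who wants a rough first look at such files without Acrobat, the checks above (producer, tagging, language, image-only pages) can be approximated with a few byte-level searches. This is only a minimal sketch, not what Acrobat's accessibility checker does: the function name `quick_pdf_checks` and the result keys are made up for illustration, and it assumes the relevant dictionaries are stored uncompressed in the file, which may not hold for PDFs that use object streams.

```python
import re


def quick_pdf_checks(data: bytes) -> dict:
    """Crude heuristics on raw PDF bytes (hypothetical helper).

    Assumes the catalog, info dictionary, and resource dictionaries are
    not hidden inside compressed object streams.
    """
    # The producer is read up to the first ")", so a producer string that
    # itself contains parentheses will be cut short.
    producer = re.search(rb"/Producer\s*\(([^)]*)\)", data)
    return {
        "producer": producer.group(1).decode("latin-1") if producer else None,
        # Tagged PDFs reference a structure tree from the catalog.
        "tagged": b"/StructTreeRoot" in data,
        # A /Lang entry holds the language as a literal or hex string.
        "has_lang": re.search(rb"/Lang\s*[(<]", data) is not None,
        # Image XObjects, e.g. scanned pages.
        "has_images": re.search(rb"/Subtype\s*/Image\b", data) is not None,
        # No font objects at all suggests there is no real text layer.
        "has_fonts": re.search(rb"/Type\s*/Font\b", data) is not None,
    }


# Example use (the file name is a placeholder):
#     with open("article.pdf", "rb") as f:
#         print(quick_pdf_checks(f.read()))
```

A result with `has_images` true and `has_fonts`, `tagged` and `has_lang` all false would be consistent with a scan-only document like the ones described above, but a real checker is still needed to confirm it.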
Best regards,

Christophe

--
Christophe Strobbe
K.U.Leuven - Departement of Electrical Engineering - Research Group on Document Architectures
Kasteelpark Arenberg 10 - 3001 Leuven-Heverlee - BELGIUM
tel: +32 16 32 85 51
http://www.docarch.be/

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
Received on Monday, 12 March 2007 15:13:21 UTC