
Re: Removing PDFs and accessibility

From: Ginger Claassen <ginger.claassen@gmx.de>
Date: Tue, 27 Mar 2012 07:41:54 +0200
Message-ID: <4F715322.3090601@gmx.de>
To: accessys@smart.net
CC: Andrew Kirkpatrick <akirkpat@adobe.com>, "wed@csulb.edu" <wed@csulb.edu>, David Woolley <forums@david-woolley.me.uk>, "w3c-wai-ig@w3.org" <w3c-wai-ig@w3.org>
Dear all,

While we are on the subject of PDF files, maybe you can help me with a
small but annoying problem. Sometimes I have PDF files containing pictures
that OmniPage cannot process because it does not recognize the picture
format. Does anyone here have an idea which picture formats OmniPage can
or cannot recognize in PDF files?

Thanks a lot for your help!

So long

     Ginger


On 26.03.2012 18:48, accessys@smart.net wrote:
>
>
> And if one is not using Windows or Mac, what then? Adobe has minimal
> support at best for non-proprietary operating systems or systems that
> are "different". In the USA the document has to be readable by a
> non-proprietary reader or, if the reader is proprietary, it must be
> freely available at no cost. The basic Adobe Reader available for other
> operating systems is crude at best.
>
> screen readers such as LSR-Gnome, Speakup, ORCA, Emacspeak, etc.
>
>
>
> Bob
>
>
>
>
> On Mon, 26 Mar 2012, Andrew Kirkpatrick wrote:
>
>> Date: Mon, 26 Mar 2012 08:49:26 -0700
>> From: Andrew Kirkpatrick <akirkpat@adobe.com>
>> To: "wed@csulb.edu" <wed@csulb.edu>,
>> David Woolley <forums@david-woolley.me.uk>
>> Cc: "w3c-wai-ig@w3.org" <w3c-wai-ig@w3.org>
>> Subject: RE: Removing PDFs and accessibility
>> Resent-Date: Mon, 26 Mar 2012 15:50:36 +0000
>> Resent-From: w3c-wai-ig@w3.org
>>
>> Unfortunately the original post doesn't allow comments. My gripe with
>> this post is that it makes many false claims and uses the false claims
>> as evidence to support a conclusion which may be true, but there is no
>> actual data or scientific rigor offered, which makes this interesting
>> as anecdotal data, but nothing more. I'd like to see more information
>> on the study performed, and offer the following questions to consider.
>>
>> From the article, with comments:
>> Mark said major disadvantages of PDFs include:
>> * not showing up in search results
>> PDF documents do show up in search results. Google and Bing both index
>> and include PDF documents in search results.
>>
>> * failing Australian Human Rights Commission requirements for being
>> accessible to people with a disability, such as compatibility with
>> screen readers
>> Differences do exist, to be sure, but NVDA, a free screen reader on
>> Windows, provides nearly the same level of support as JAWS (support for
>> headings is one of the main issues remaining and I expect we'll see
>> that addressed soon). VoiceOver with PDF documents on the Mac is not
>> as good as the Windows options but the document content can be read
>> and used. The level of support is better than what is provided by a
>> text only or RTF document which the AHRC does suggest is sufficient.
>> I realize that this department is in the state government, but it is
>> worth noting that AGIMO in the federal government agrees that
>> well-authored PDF documents can meet WCAG 2.0 and can be used within
>> the government to comply with the National Transition Strategy:
>>
>> (http://agimo.govspace.gov.au/2012/01/12/release-of-wcag-2-0-techniques-for-pdf/comment-page-1/#comment-5632)
>> "As stated, the PDF Sufficient Techniques are now available, so
>> technically an agency can rely on PDF by using the WCAG 2.0 PDF
>> Sufficient Techniques and all applicable General Techniques, and will
>> be considered to be complying with the NTS. This addresses one of the
>> findings of our PDF study by ensuring the design of the PDF file is
>> optimised for accessibility."
>>
>> More on this in a bit...
>>
>> * penalising people who have slow internet connections
>> * often extremely large document sizes.
>> These are really the same point, so I'll address them together. Some
>> PDF documents do get rather large, some outrageously so. However, PDF
>> documents can and should be authored to be as light as possible, so
>> while it may be that a 300 page report is large no matter what an
>> author does, PDF documents in general need not be bloated in size and
>> authors who are tending to their work can easily avoid this. Adobe
>> Acrobat also offers a batch process which can watch a specific folder
>> and when PDF documents are added there it can take the steps to reduce
>> the file size automatically if desired. Others have commented on the
>> convenience of PDF documents for users also, so at a minimum offering
>> a PDF document for some documents can be viewed as helping some users.
>>
>> Back to the main question: Does replacing PDF documents with HTML
>> documents increase web traffic? I don't know the answer, but I am
>> certain that the answer is not as simple as a quick look at the server
>> log data. There are complicated questions to be asked:
>>
>> 1) were the PDF documents that were replaced built as tagged PDF
>> documents to maximize their accessibility?
>> 2) How much of the additional traffic was bots? Given a recent study on
>> the amount of internet traffic that is non-human
>> (http://www.itproportal.com/2012/03/14/51-internet-traffic-non-human/#ixzz1p7FFrR84)
>> and the broad introduction of new pages and links, I wonder whether the
>> percentage is greater than the 51% cited in the Incapsula report,
>> because spiders and other bots may be exploring the new pages
>> (disclaimer: I haven't read the Incapsula report in any depth and
>> can't say whether it is accurate or whether there are reasons that it
>> may not be similar in the Victoria DPI case).
>> 3) What methodology for measuring the results was used? If it is just
>> hits on a page, it might make sense that going from 6000 pages and
>> 9000 PDF files (15K URI) to 22000 HTML pages would result in a larger
>> number of hits. Some quick "back of the envelope" math shows that
>> there are now 1.47 times as many indexable pages (22000 / 15000) and
>> the number of hits has risen by a factor of 1.38.
>> 4) Is it possible to review a collection of 10-20 representative PDF
>> documents and the HTML analogs for them and see how the stats for
>> those specific documents break down? That would be interesting.
>>
>> I'm sure that there are other interesting questions, but that's a start.
>>
>> To the question of whether you should take this approach and replace
>> your PDF documents with HTML files - maybe you should, but I'm not
>> convinced that the hit count is a reason that you can depend on. If
>> you are hearing from your users that they prefer HTML files over PDF,
>> then offer HTML. If you are finding that maintenance is easier with
>> another format, use that other format. There are many reasons why you
>> may want to offer HTML documents, but you should also recognize that
>> there are valid reasons for using PDF documents, and if you find that
>> these reasons make sense for you, use PDF. But, when you do use PDF,
>> follow best practices for making sure the PDF documents meet WCAG 2.0.
>>
>> Thanks,
>> AWK
>>
>> Andrew Kirkpatrick
>> Group Product Manager, Accessibility
>> Adobe Systems
>>
>> akirkpat@adobe.com
>> http://twitter.com/awkawk
>> http://blogs.adobe.com/accessibility
>>
>>
>> -----Original Message-----
>> From: Wayne Dick [mailto:wayneedick@gmail.com]
>> Sent: Sunday, March 25, 2012 2:54 PM
>> To: David Woolley
>> Cc: w3c-wai-ig@w3.org
>> Subject: Re: Removing PDFs and accessibility
>>
>> Just making an attempt to move away from PDF as a system to view web
>> content is a great move forward. It recognizes the issue that PDF is a
>> poor online reading medium for many people with visual impairments.
>> Thank you Cosmic Muffin.
>>
>> The primary application will be in the area of content meant for
>> reading. When an article is written in PDF it generally increases the
>> workload for reading online, especially for a person with low vision.
>> This generally involves a significant change in workload. Since most
>> sighted people just print PDF articles, this introduces a major
>> inequality of work for people with full sight vs. people with partial
>> sight.
>>
>> The ability to obtain a high-quality conversion will be the trick. The
>> tag spaces are not isomorphic, and tagged PDF enables meaningful text
>> styling to be embedded in blocks of untagged data. As such I do not see
>> a programmatically determined method of translation existing. However a
>> good heuristic will probably suffice.
>>
>> Thanks for the article, good luck Victoria.
>>
>> Wayne Dick
>>
>> On 3/25/12, David Woolley <forums@david-woolley.me.uk> wrote:
>>> David Woolley wrote:
>>>
>>>>
>>>> Incidentally, I have often sought out PDFs because they are not
>>>> fragmented into pages,
>>>
>>> The big problem I often find with lots of small hyperlinked pages, on
>>> sites (typically governmental, or software support) that should be
>>> information rich, is that one ends up going round in circles, never
>>> actually getting to the detail you want. I suspect that is often
>>> because that level of detail just does not exist, but unless one maps
>>> out the whole site and proves that you have seen all the pages, one
>>> can never be sure of that.
>>>
>>> A single, linearised, document makes it much easier for the reader to
>>> be sure that information is not present and makes it much harder for
>>> the author to avoid answering difficult questions by just hyperlinking
>>> you backwards and forwards.
>>>
>>>
>>>
>>
>>
>>
>
>
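
The "back of the envelope" comparison in point 3 of Andrew's message can be
checked with a few lines of Python. This is only a sketch using the figures
quoted in the thread (6000 pages, 9000 PDF files, 22000 HTML pages, hits up
by a factor of 1.38). The variable names and the "hits per URI" ratio are
illustrative assumptions and are not part of the original analysis.

```python
# Figures quoted in the thread (point 3 of Andrew Kirkpatrick's message).
html_pages_before = 6000
pdf_files_before = 9000
html_pages_after = 22000
reported_hit_growth = 1.38  # factor by which hits rose, as quoted


def growth_factor(before: int, after: int) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


# Total addressable URIs before the conversion (HTML pages plus PDF files).
uris_before = html_pages_before + pdf_files_before

# Growth in the number of indexable URIs.
page_growth = growth_factor(uris_before, html_pages_after)

# Assumption: dividing hit growth by page growth gives a rough
# "hits per URI" change. A value below 1 means hits grew more
# slowly than the number of pages did.
hits_per_uri_change = reported_hit_growth / page_growth

print(f"URIs before: {uris_before}")
print(f"Page growth factor: {page_growth:.2f}")
print(f"Hits-per-URI change: {hits_per_uri_change:.2f}")
```

On these figures the ratio comes out a little below 1, which would be
consistent with the suggestion that the rise in hits may simply track the
larger number of pages. It says nothing about bot traffic or about the
measurement methodology, which the other questions in the list address.
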
Received on Tuesday, 27 March 2012 05:42:33 GMT
