RE: PDF - Text Extraction File from chagnon@pubcom.com on 2025-10-17 (w3c-wai-ig@w3.org from October to December 2025)

From: <chagnon@pubcom.com>
Date: Fri, 17 Oct 2025 13:50:40 -0400
To: "'Richter,Susan'" <Susan.Richter@nscc.ca>, <w3c-wai-ig@w3.org>
Message-ID: <025801dc3f8e$8de19770$a9a4c650$@pubcom.com>
Ah, I thought that might be the problem.
You're reading the PDF with a browser rather than a full-featured PDF
reader program.
 
At this time, I don't know of any browser that will display/process PDFs
seamlessly with any assistive technology. That's because the browser
manufacturers haven't implemented the full PDF and PDF/UA standards. In my
opinion, their PDF support is half-baked. And I don't know if any browser
will ever get PDFs fully baked, but time will tell.
 
Even Adobe's browser plug-in, Acrobat Extension, doesn't have
accessibility features, just the regular tools used by standard users. See
https://www.adobe.com/acrobat/pdf-viewer-extension.html
 
At this time, we recommend that websites advise those using assistive
technologies to download the PDF to their computer and open it with their
preferred PDF viewer/reader program and assistive technology.
 
FYI, I reviewed the PDF map for accessibility, and it totally fails. The PDF
file isn't tagged, and the image doesn't have Alt Text or an alternative
description of the campus map. It's a useless dead graphic for a screen
reader user.
 
We advise our academic clients to develop a better, more functional way to
make their maps accessible. Since the campus is small, a simple HTML webpage
with a narrative description of the campus layout, main features, and a list
of what's in each building would probably suffice. 
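
As a rough illustration of that kind of page, here is a minimal Python sketch
that builds a plain HTML description from a short overview and a
building-by-building list. Everything in it is an assumption made for the
example: the function name build_map_page, the campus name, the building
names, and the room lists are placeholders, not details of the actual campus.

```python
from html import escape


def build_map_page(campus: str, overview: str, buildings: dict) -> str:
    """Return a simple HTML page: overview paragraph, then one heading
    and one list per building. All text is escaped before insertion."""
    parts = [
        "<!DOCTYPE html>",
        '<html lang="en">',
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{escape(campus)} campus map description</title>",
        "</head>",
        "<body>",
        f"<h1>{escape(campus)} campus map</h1>",
        f"<p>{escape(overview)}</p>",
    ]
    for name, contents in buildings.items():
        # One heading per building so screen reader users can jump between them.
        parts.append(f"<h2>{escape(name)}</h2>")
        parts.append("<ul>")
        parts.extend(f"<li>{escape(item)}</li>" for item in contents)
        parts.append("</ul>")
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


# Placeholder data for illustration only.
page = build_map_page(
    "Example",
    "Two buildings face a central courtyard; the main entrance is on the north side.",
    {
        "Building A": ["Library", "Cafe & lounge"],
        "Building B": ["Registration office", "Classrooms 101 to 120"],
    },
)
print(page)
```

Real headings and lists, rather than an image, give a screen reader user
something to navigate, which the untagged PDF map does not.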
 
-Bevi
Bevi Chagnon | bevi.chagnon@PubCom.com
Member, ISO Committees for PDF & PDF/UA Standards
Adobe Community Expert
Media Designer, Author, Trainer, and Consultant
 
PubCom.com
Technologists for Accessible Design + Publishing
MS Office - Adobe InDesign & Acrobat - Editorial & Design - A11y Publishing
Workflow
 
 
From: Richter,Susan <Susan.Richter@nscc.ca> 
Sent: Friday, October 17, 2025 1:21 PM
To: w3c-wai-ig@w3.org
Subject: RE: PDF - Text Extraction File
 
Thank you all for your responses. 
 
I can share the file with you, since it is live on our site:
https://www.nscc.ca/docs/campuses/ivany/ivany-map.pdf
 
I am viewing it in Chrome on Windows 11, using the JAWS screen reader
software.
 
Thanks
Susan
 
Susan Richter
Senior Web Interface Developer
Digital Products & Experience
Nova Scotia Community College
Institute of Technology Campus
Web:
<http://www.nscc.ca/?utm_source=email-sig&utm_medium=email&utm_campaign=email%20signature%20link> nscc.ca

 
From: chagnon@pubcom.com <chagnon@pubcom.com>
Sent: Friday, October 17, 2025 2:15 PM
To: Richter,Susan <Susan.Richter@nscc.ca>; w3c-wai-ig@w3.org
Subject: RE: PDF - Text Extraction File
 

 

Hi Susan,
I've been a PDF developer since the first beta of Acrobat & the PDF file
format. I've never seen the exact error message you quoted, and have no idea
what a "text extraction file" is.
 
Generally, screen readers pick up the live text in the PDF file itself. But
since you mentioned that these are campus maps, they could be just images
without any live text such as titles, headers, footers, etc. Usually in that
case, other messages appear that basically prompt the user to make the file
accessible with AI tools.
 
The PDF standards, ISO 32000 (PDF) and ISO 14289 (PDF/UA), do require that
text be extractable by technologies, including screen readers. But they do
not use the term "text extraction file." And there isn't an additional
file: everything should be inside the one PDF file.
 
Questions:
1. What specific screen reader or readers are you using?
2. What operating system?
3. What software are you trying to open and read the PDF with? Adobe
   Reader, Adobe Acrobat, Foxit, or any of the hundreds of other programs
   that can now open and read PDFs? Or is the PDF being opened by the web
   browser?
 
-Bevi
 
From: Richter,Susan <Susan.Richter@nscc.ca>
Sent: Wednesday, October 15, 2025 7:16 AM
To: w3c-wai-ig@w3.org
Subject: PDF - Text Extraction File
 
Hi All,
 
This is my first time posting here so please advise if this isn't the best
group for this kind of assistance.
 
I've discovered that some PDFs on our site, when accessed via a screen
reader, cause this message to be read out: "This PDF is inaccessible.
Couldn't download text extraction files". The PDF(s) in question are large
campus maps.
 
I'm not super familiar with creating accessible PDFs, but I'm trying to
understand how a text extraction file is created and then attached to a PDF
so when a user opens it on the page using a screen reader the text
extraction file is available. Does it have to be a separate link on the page
or is there some way to embed/tie it to the PDF itself that just triggers it
when opened via a screen reader?
 
Thanks in advance.
 
Susan Richter

 





Attachments

image/png attachment: image001.png
Received on Friday, 17 October 2025 17:50:48 UTC