RE: Re: The murky intersection of accessibility and internationalization from Andrew Cunningham on 2017-01-10 (w3c-wai-ig@w3.org from January to March 2017)

From: Andrew Cunningham <andj.cunningham@gmail.com>
Date: Tue, 10 Jan 2017 12:43:06 +1100
To: "Sean Murphy (seanmmur)" <seanmmur@cisco.com>
Cc: WAI Interest Group <w3c-wai-ig@w3.org>
Message-ID: <CAOUP6Kn6p7A=gZT2qXUpAPyDS8VSo75SfDWGxDw7K7uG=3r+6w@mail.gmail.com>
On 10 Jan 2017 11:53 AM, "Sean Murphy (seanmmur)" <seanmmur@cisco.com>
wrote:

Would this not fall within the vendor of the browser, PDF viewer or product
that creates the content or even the vendor OS?




Since WCAG 2.0 isn't more specific, it would devolve to vendor decisions
and choices. But this would mean that what makes a document accessible
becomes more fluid, and potentially means that

1) accessibility conformance is software dependent rather than software
independent,  or
2) it may not be technically possible to create an accessible document, if
the language or character encoding is unsupported by vendors

As I said previously, HTML is an easier format to deal with. Other formats
may be either difficult or impossible to make accessible, depending on
languages and encodings.

For instance, I would consider a large proportion of PDF files in languages
other than English on state and federal government websites in Australia
as not meeting WCAG 2.0 requirements.

Andrew




Sean Murphy

Accessibility Software engineer

seanmmur@cisco.com

Tel: +61 2 8446 7751       Cisco Systems, Inc.

The Forum 201 Pacific Highway

ST LEONARDS

2065

Australia

cisco.com




*From:* Andrew Cunningham [mailto:andj.cunningham@gmail.com]
*Sent:* Tuesday, 10 January 2017 11:40 AM
*To:* WAI Interest Group <w3c-wai-ig@w3.org>
*Subject:* Fwd: Re: The murky intersection of accessibility and
internationalization



Forgot to reply to the list.



---------- Forwarded message ----------
From: "Andrew Cunningham" <andj.cunningham@gmail.com>
Date: 10 Jan 2017 11:10 AM
Subject: Re: The murky intersection of accessibility and
internationalization
To: <chaals@yandex-team.ru>
Cc:

Hi





On 9 Jan 2017 17:10, <chaals@yandex-team.ru> wrote:

Hi Andrew,

I suggest you look at the "understanding 3.1.1" section -
https://www.w3.org/TR/UNDERSTANDING-WCAG20/meaning-doc-lang-id.html

It says, right at the top,

"The intent of this Success Criterion is to ensure that content developers
provide information in the Web page that user agents need to present text
and other linguistic content correctly. Both assistive technologies and
conventional user agents can render text more accurately when the language
of the Web page is identified. Screen readers can load the correct
pronunciation rules. Visual browsers can display characters and scripts
correctly. Media players can show captions correctly. As a result, users
with disabilities will be better able to understand the content."



It isn't a language identification issue, rather it is a character encoding
issue. Even if language is correctly tagged, the problem remains.




If the text itself doesn't match the language, then fails to meet the
intent - i.e. it is not fit for purpose.



This is more likely a failing in PDF and other file formats where ability
to select correct language at the authoring stage is much more limited than
in HTML. And would be an argument for why HTML should be used in preference
to other rich text file formats.



Likewise, in Understanding 1.1.1 -
https://www.w3.org/TR/UNDERSTANDING-WCAG20/text-equiv.html

it says

"The purpose of this guideline is to ensure that all non-text content is
also available in text. "Text" refers to electronic text, not an image of
text. Electronic text has the unique advantage that it is presentation
neutral. That is, it can be rendered visually, auditorily, tactilely, or by
any combination. As a result, information rendered in electronic text can
be presented in whatever form best meets the needs of the user. It can also
be easily enlarged, spoken aloud so that it is easier for people with
reading disabilities to understand, or rendered in whatever tactile form
best meets the needs of a user."

So anything that is written using a visual trick to replace the underlying
characters with other glyphs isn't "text", in the meaning of WCAG, and
requires an alternative. The simplest one for the cases you describe would
of course be proper Unicode text…

This issue is also noted in the glossary definition of "non-text content":
https://www.w3.org/TR/UNDERSTANDING-WCAG20/text-equiv-all.html#non-text-contentdef

But I agree that in terms of Success Criteria this isn't immediately
obvious. Since justifying the jobs of accessibility consultants as the only
people who can understand WCAG isn't a goal, I think it would be good to
think about how we could clarify this in WCAG.



I did initially consider the text and non-text distinction in WCAG 2.0,
but thought that using it would be too radical. But since you
posit it, it is worth further thought.



I would also argue that this interpretation is obscure enough for many
accessibility specialists to stumble on.



The problem is that WCAG 2.0 does not directly address issues relating to
character encoding. There are no normative requirements for textual
content. For a document to be considered accessible, the character encoding
would need to be identified AND supported by the software in use.



So in theory you need to use a subset of encodings likely to be widely
implemented for a document to be considered accessible, unless you include
a "textual alternative". Essentially this comes down to "Use Unicode, or add
a Unicode alternative if required".
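
As a rough illustration of the "Use Unicode" half of that rule, here is a
minimal Python sketch that checks whether a byte stream is well-formed UTF-8.
The function name is_valid_utf8 and the sample text are my own assumptions,
and cp1258 (Windows Vietnamese) merely stands in for "some legacy encoding".

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Sample Vietnamese text, used only for illustration.
utf8_bytes = "đường".encode("utf-8")

# The same first letter stored in a legacy single-byte encoding.
legacy_bytes = "đ".encode("cp1258")

print(is_valid_utf8(utf8_bytes))
print(is_valid_utf8(legacy_bytes))
```

A check like this only says the bytes are UTF-8. It says nothing about
whether the characters are the right ones for the language.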



It also has interesting implications for PDF. If any glyph in a font cannot
be resolved to Unicode code points via a ToUnicode mapping, then the text layer
contains non-text content. In such cases ActualText must be added.
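
One heuristic way to spot this, sketched in Python: take whatever text a PDF
tool extracted and flag characters that are private-use, unassigned, or the
replacement character. This assumes (my assumption, not something PDF or WCAG
specifies) that unresolved glyphs tend to surface that way. The function name
unresolved_characters is hypothetical, and no PDF library is involved.

```python
import unicodedata


def unresolved_characters(extracted: str) -> list[tuple[int, str]]:
    """List (index, 'U+XXXX') for characters that do not look like real text."""
    suspicious = []
    for index, char in enumerate(extracted):
        category = unicodedata.category(char)
        # Co = private use, Cn = unassigned; U+FFFD is the replacement character.
        if category in ("Co", "Cn") or char == "\ufffd":
            suspicious.append((index, f"U+{ord(char):04X}"))
    return suspicious


# "a", a private-use code point, then "b".
print(unresolved_characters("a\ue000b"))
```

An empty result does not prove the text layer is sound. It only means none
of these particular warning signs were found.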



Even if Unicode is used, PDFs, for a wide range of Unicode blocks,
cannot resolve the code points into the correct sequence, creating malformed
Unicode sequences. This is an inherent problem of the format.
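
To make "malformed sequence" concrete, here is a small Python sketch using
Tamil as an assumed example. The vowel sign E (U+0BC6) is drawn to the left
of its consonant but must be stored after it, so text recovered in glyph
order puts a combining mark before any base character. The function name
has_orphan_mark is hypothetical, and the check is deliberately crude.

```python
import unicodedata


def has_orphan_mark(text: str) -> bool:
    """Return True if a combining mark appears with no base character before it."""
    previous = None
    for char in text:
        is_mark = unicodedata.category(char).startswith("M")
        if is_mark and (previous is None or previous.isspace()):
            return True
        previous = char
    return False


logical = "\u0b95\u0bc6"  # KA + VOWEL SIGN E: the order Unicode expects
visual = "\u0bc6\u0b95"   # VOWEL SIGN E + KA: left-to-right glyph order

print(has_orphan_mark(logical))
print(has_orphan_mark(visual))
```

This catches only one kind of reordering. Other scripts and other kinds of
glyph reordering would need their own checks.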



So for various languages, PDF files must always contain ActualText
attributes.



All of the above assumes the definitions of text and non-text content in WCAG
2.0.



An interesting aside is that a file that was once accessible can, at a
later stage, fail to be accessible because software no longer supports the
character encoding used.



For instance, Web browsers over time have supported fewer encodings,
preferring Unicode while continuing support for key legacy encodings. At
one time there were key browsers that supported a number of Tamil
and Vietnamese character encodings. Web pages of that vintage that met WCAG
2.0 requirements could be considered accessible. The same document viewed with
modern browsers would have to be considered inaccessible. This has interesting
implications for archiving.
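
The software dependence can be sketched in Python by asking one runtime
whether it knows a given encoding label. Python's codec registry is only a
stand-in for "the software in use", not a model of any browser, and the
function name is_supported and the made-up label are my own assumptions.

```python
import codecs


def is_supported(label: str) -> bool:
    """Return True if this Python runtime has a codec for the given label."""
    try:
        codecs.lookup(label)
    except LookupError:
        return False
    return True


for label in ("utf-8", "cp1258", "x-hypothetical-legacy-encoding"):
    print(label, is_supported(label))
```

The same archived bytes would give a different answer on software with a
different codec list, which is the archiving problem in miniature.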



aside …
I worked on an example last century, where a group of aboriginal languages
were written using a font so that various punctuation characters would be
visually represented as the right glyph - but since the underlying word
would have punctuation marks in place of some letters, they could not be
presented by a screen reader or represented accurately in a font designed
for e.g. simplifying reading for people with dyslexia. If I recall
correctly, an added problem was not having a language code.
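
A minimal Python sketch of the repair for that kind of font hack, assuming
the font's substitutions are known. The mapping below is purely hypothetical
(it is not the actual font's table), and the sample string is made up rather
than a real word.

```python
# Hypothetical mapping: the punctuation a hack font stored -> the letter it drew.
HACK_FONT_TO_UNICODE = str.maketrans({
    "@": "\u014b",  # stored "@", displayed as LATIN SMALL LETTER ENG
    "#": "\u0272",  # stored "#", displayed as LATIN SMALL LETTER N WITH LEFT HOOK
})


def to_unicode(stored: str) -> str:
    """Replace hack-font placeholder characters with the intended letters."""
    return stored.translate(HACK_FONT_TO_UNICODE)


# Made-up string: what was stored versus what the reader was meant to see.
print(to_unicode("ya@u"))
```

A blanket replacement like this would also rewrite any genuine "@" or "#" in
the text, so a real conversion would need to know which runs used the hack
font.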



I remember discussing this with you, way back in the past at meetings at
RMIT, if I remember correctly.



Andrew
Received on Tuesday, 10 January 2017 01:43:40 UTC