Re: Schema.org - identifying accessible documents from Madeleine Rothberg on 2015-06-22 (w3c-wai-ig@w3.org from April to June 2015)

From: Madeleine Rothberg <madeleine_rothberg@wgbh.org>
Date: Mon, 22 Jun 2015 21:20:20 +0000
To: WAI Interest Group <w3c-wai-ig@w3.org>
Message-ID: <D1AD8362.10DF9%madeleine_rothberg@wgbh.org>

Chaals and Jutta are aware of the existing standard for metadata that
identifies the content of documents so that users (or their search tools)
can find accessible materials, but others may not be, so I want to
document it here. The Access for All model was proposed to schema.org but
has not been accepted. Chaals, I understand that you are concerned that
this approach is not compatible with the way search engines process
materials, but I don't yet understand how it is incompatible. I directed a
project that implemented it (admittedly on a smaller scale than the whole
web), which I describe below.

The first IMS Global Access for All model for accessibility metadata was
published in 2004.[1] The basic approach, which has not changed since then,
is to label the basic "access modes" of the resource. Is it text? Is it
visual? Does it include sound? Does it include tactile content? This
allows users to seek resources they can use even if the resource is not
"100% accessible."

Combined with knowing what kinds of accessible alternatives are provided
with the resource, this gives a complete picture. For example, a video has
access modes visual and auditory. If captions are available, that does not
make this a textual resource; it makes it a resource that has a text
alternative to the auditory material. This is important because if a user
prefers resources without text, a video is a good choice. We always advise
that the access modes of alternatives (like captions) not be added to the
list of access modes of the base resource.
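
As a rough illustration of that separation, here is a minimal Python
sketch. It is not the IMS or schema.org syntax: the class names, field
names, and the is_usable() helper are all made up for this example. It
only shows base access modes being kept apart from the access modes of
alternatives such as captions.

```python
from dataclasses import dataclass, field


@dataclass
class Alternative:
    replaces: str     # access mode of the base resource that this stands in for
    access_mode: str  # access mode of the alternative itself


@dataclass
class Resource:
    name: str
    access_modes: set  # modes of the base resource only
    alternatives: list = field(default_factory=list)


def is_usable(resource, usable_modes):
    """Return True if each base access mode is usable directly
    or is covered by an alternative in a mode the user can use."""
    for mode in resource.access_modes:
        if mode in usable_modes:
            continue
        covered = any(
            alt.replaces == mode and alt.access_mode in usable_modes
            for alt in resource.alternatives
        )
        if not covered:
            return False
    return True


# A captioned video: captions are an alternative, not a base access mode.
video = Resource(
    name="captioned video",
    access_modes={"visual", "auditory"},
    alternatives=[Alternative(replaces="auditory", access_mode="textual")],
)

# A user who relies on sight and text can use the video via its captions.
print(is_usable(video, {"visual", "textual"}))
# The base resource itself is still not labelled as textual.
print("textual" in video.access_modes)
```

Because "textual" never enters the base list, a search for resources
without text would still find this video.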

The basic access modes are very simple, but refinements have been added
(and more are proposed for schema.org)[2] that give more nuance. For
example, the issue of images of text has long been recognized. We need to
ensure that metadata authors code that properly as an image, not as text,
because tools that make text accessible to people with disabilities
generally won't work directly on images of text. So we have an access mode as a
refinement of "visual" which is "textOnVisual." In the schema.org project
group, we considered additional refinements including chartOnVisual,
mathOnVisual, and musicOnVisual, which can be important to people
searching for those kinds of materials. Using additional metadata to
indicate reading level of a text or other specific details can permit the
complex use cases Chaals described.
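
A small Python sketch of how such refinements might behave in a search
follows. The lookup table and both helper functions are hypothetical,
and treating chartOnVisual, mathOnVisual, and musicOnVisual as
refinements of "visual" is my assumption from their names; only
textOnVisual is stated above as a refinement of "visual."

```python
# Hypothetical lookup: refinement -> the basic access mode it refines.
REFINES = {
    "textOnVisual": "visual",
    "chartOnVisual": "visual",   # assumed from the name
    "mathOnVisual": "visual",    # assumed from the name
    "musicOnVisual": "visual",   # assumed from the name
}


def basic_mode(mode):
    """Map a refinement to its basic access mode; basic modes map to themselves."""
    return REFINES.get(mode, mode)


def matches(resource_modes, wanted_mode):
    """True if any labelled mode equals wanted_mode or refines it."""
    return any(
        m == wanted_mode or basic_mode(m) == wanted_mode
        for m in resource_modes
    )


scanned_page = {"textOnVisual"}  # an image of text

print(matches(scanned_page, "textual"))       # searching for text
print(matches(scanned_page, "visual"))        # searching for visual content
print(matches(scanned_page, "textOnVisual"))  # searching for images of text
```

The point of the sketch is that an image of text is never returned for
a "textual" search, but is still found by someone looking for visual
material or specifically for images of text.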

At WGBH, we implemented Access for All metadata and preferences in the
Teachers' Domain digital library, which is sadly no longer available for
reasons that are not related to accessibility. I documented the use of
metadata and preferences for producing useful search results in a video
available from the Accessibility Metadata project page. [3]

I welcome more discussion of this topic so we can find a solution that
works for all stakeholders and move it forward in Schema.org and
elsewhere. 

-Madeleine

[1] http://imsglobal.org/accessibility/accmdv1p0/imsaccmd_infov1p0.html
[2] http://www.a11ymetadata.org/the-specification/ (scroll down to "The
following properties are under consideration for v. 1.1.")
[3] http://www.a11ymetadata.org/accessibility-metadata-in-action-at-teachers-domain/

On 6/21/15 7:23 PM, "chaals@yandex-team.ru" <chaals@yandex-team.ru> wrote:

>- lwatson@, jutta.trevira@
>
>21.06.2015, 22:15, "Jutta Treviranus" <jutta.trevira@gmail.com>:
>>>  On Jun 20, 2015, at 7:19 PM, Léonie Watson
>>><lwatson@paciellogroup.com> wrote:
>>>
>>>>  From: Jutta Treviranus [mailto:jutta.treviranus@utoronto.ca]
>>>>  Sent: 18 June 2015 14:38
>>>>
>>>>  As to the specific example of markup you have asked us to consider:
>>>>>  "you need to be able to understand english-language text and to
>>>>>hear, OR
>>>>>  to be able to understand english language text and see, in order to
>>>>>  effectively use this site". (The underlying use case is a video
>>>>>which has both
>>>>>  audio descriptions, and captions, available as an option in the
>>>>>player, but
>>>>>  making these things up is easy and there are lots of variations).
>>
>>>  Léonie wrote:
>>>  There is usually a way to make things up where metadata is concerned.
>
>I meant that making up examples is easy... and we need to think through a
>lot of them if we're going to get this more or less right.
>
>>> There is often a fair amount of flexibility in the way metadata is
>>>interpreted,
>>> and history tells us this has happened already.
>
>We are trying to provide metadata terms that developers, who have to pick
>which ones to use as they mark up their content, can interpret
>consistently. Whenever they don't do that (which is often), the resulting
>data is far less useful.
>
>One of the nice things with the way that schema.org works is that if we
>get it wrong, revising it is a very lightweight process. But we would
>still rather not get it wrong to begin with - with adoption already on
>millions of websites, the price of a mistake will be that a lot of stuff
>will be published following the mistake, and not updated.
>
>>>  Is there something in this proposal that makes it more prone to
>>>mis-use do you think?
>>
>> Jutta's response:
>> Yes, please see my second note. This is proposing that you label a
>>resource with the capabilities you need to use the resource, to in a
>>sense warn off anyone that doesn't have those capabilities.
>
>That is a fair assessment.
>
>> As I mentioned, if you are someone that is creating resources that are
>>not accessible
>> then you are probably less likely to take the time to add metadata
>>regarding the capabilities required.
>
>That isn't the big risk that I see, since it is perfectly possible for
>third parties to provide the data. There is a motive to get this right,
>which is that it works in the most important sites on the Web today -
>Google, Yandex, Bing, and others. There is also a cost to getting it
>wrong - spamming search engines is a well-known way to drop significantly
>in where you get ranked.
>
>The big risk I see is that people who are subject to various legal
>requirements might not want to provide honest data. For example, is there
>a risk that a national library which uses schema.org will not publish
>data on the accessibility of their material, for fear that it will lead
>to litigation? Certainly there are libraries using schema.org data today
>with the expertise to accurately assess the material they catalogue, and
>enhance its availability by adding high-quality metadata.
>
>> Worse yet, you may think you have done everything you need to do to
>>address accessibility by warning away people that can't use the resource.
>
>That is a risk. 
>
>I think it is mitigated by the message that search results will depend on
>what you provide. Like mobile-friendly content, which has for years been
>considered important, and has recently been made even more important in
>search algorithms, you may decide that search optimisation is important
>even if you don't actually care about accessibility for its own sake.
>
>I would expect a mid-level education provider to realise pretty fast that
>if they want to compete - and many do - with their larger peers in a
>particular field, the "edge" of extra discoverability that enhancing
>accessibility provides is valuable. Instead of saying "oh, we can't make
>this universally accessible" and giving up, they have a motivation to
>enhance whichever aspects of accessibility they can.
>
>>>>  What does your description add that simply stating that this
>>>>resource or
>>>>  document "has captions," "has descriptions", and that the language
>>>>"is
>>>>  english" doesn't achieve? In fact your description is much more
>>>>complex and
>>>>  also confusing because you would not need to be able to understand
>>>>english
>>>>  language text if you can hear, you could listen to the speech audio
>>>>and
>>>>  understand spoken english. (I have left your message below my reply
>>>>in its
>>>>  entirety for easier reference.)
>>>
>>>  Léonie wrote:
>>>  If a resource had no specific accessibility feature, yet was
>>>nonetheless accessible to someone, how would it be classified? As
>>>Chaals notes, a search for a resource with text descriptions for images
>>>would exclude any resource that consisted only of text, yet such
>>>resources would likely be accessible to a person requiring text
>>>descriptions for images.
>>
>> Jutta's response:
>> That is already covered by identifying that it is text. You do not need
>>any further identification unless there are images as well.
>
>Sure. The devils are of course in the details. Is an Asterix comic
>readable as text? Are the pictures critical, or just enhancements? If a
>document has an animated image, without any text equivalent, does that
>actually mean you can't understand the document if you can't see the
>image? A story, made for the web, that incorporates sound effects and
>pictures may be nonsensical without those effects - or may make perfect
>sense if you can EITHER read the text, OR see the pictures, with the
>sound effects being more or less an irrelevant gimmick. (My mental
>example is actually a recent very successful book for kids, but people
>have been making multimedia things directly for a few decades now, and
>while schema.org is widely used to find physical objects it is ideal for
>finding things that are actually online...)
>
>Asterix generally *can* be read as text, but is funnier with description
>of some of the images. On the other hand, having to wade through a
>description of every frame would not be an improvement for most readers
>who need descriptions of images.
>
>So the concern is how to develop a vocabulary that deals with these
>situations, and can be extended in granularity.
>
>Another concern is that when we provide vocabulary to describe
>accessibility of libraries, restaurants, public toilets, train stations,
>and so on, it would be helpful if we don't have a completely different
>model.
>
>[... stuff about how ISO works is irrelevant to schema.org - we have an
>open process, and this thread occurred because we can ask for input
>wherever we think we will find it]
>
>cheers
>
>Chaals
>
>--
>Charles McCathie Nevile - web standards - CTO Office, Yandex
>chaals@yandex-team.ru - - - Find more at http://yandex.com
>
Received on Monday, 22 June 2015 21:20:56 UTC