Re: Schema.org - identifying accessible documents from chaals@yandex-team.ru on 2015-06-21 (w3c-wai-ig@w3.org from April to June 2015)

From: <chaals@yandex-team.ru>
Date: Mon, 22 Jun 2015 01:23:01 +0200
To: WAI Interest Group <w3c-wai-ig@w3.org>
Message-Id: <77421434928981@webcorp01e.yandex-team.ru>
- lwatson@, jutta.trevira@

21.06.2015, 22:15, "Jutta Treviranus" <jutta.trevira@gmail.com>:
>>  On Jun 20, 2015, at 7:19 PM, Léonie Watson <lwatson@paciellogroup.com> wrote:
>>
>>>  From: Jutta Treviranus [mailto:jutta.treviranus@utoronto.ca]
>>>  Sent: 18 June 2015 14:38
>>>
>>>  As to the specific example of markup you have asked us to consider:
>>>>  "you need to be able to understand english-language text and to hear, OR
>>>>  to be able to understand english language text and see, in order to
>>>>  effectively use this site". (The underlying use case is a video which has both
>>>>  audio descriptions, and captions, available as an option in the player, but
>>>>  making these things up is easy and there are lots of variations).
>
>>  Léonie wrote:
>>  There is usually a way to make things up where metadata is concerned.

I meant that making up examples is easy… and we need to think through a lot of them if we're going to get this more or less right.

>> There is often a fair amount of flexibility in the way metadata is interpreted,
>> and history tells us this has happened already.

We are trying to provide metadata terms that developers, who have to pick which ones to use as they mark up their content, can interpret consistently. Whenever they don't do that (which is often), the resulting data is far less useful.

One of the nice things with the way that schema.org works is that if we get it wrong, revising it is a very lightweight process. But we would still rather not get it wrong to begin with - with adoption already on millions of websites, the price of a mistake will be that a lot of stuff will be published following the mistake, and not updated.

>>  Is there something in this proposal that makes it more prone to mis-use do you think?
>
> Jutta’s response:
> Yes, please see my second note. This is proposing that you label a resource with the capabilities you need to use the resource, to in a sense warn off anyone that doesn’t have those capabilities.

That is a fair assessment.

> As I mentioned, if you are someone that is creating resources that are not accessible
> then you are probably less likely to take the time to add metadata regarding the capabilities required.

That isn't the big risk that I see, since it is perfectly possible for third parties to provide the data. There is also a motive to get this right: the data is used by the most important sites on the Web today - Google, Yandex, Bing, and others - and spamming search engines is a well-known way to drop significantly in the rankings.

The big risk I see is that people who are subject to various legal requirements might not want to provide honest data. For example, is there a risk that a national library which uses schema.org will not publish data on the accessibility of their material, for fear that it will lead to litigation? Certainly there are libraries using schema.org data today with the expertise to accurately assess the material they catalogue, and enhance its availability by adding high-quality metadata.

> Worse yet, you may think you have done everything you need to do to address accessibility by warning away people that can’t use the resource.

That is a risk. 

I think it is mitigated by the message that search results will depend on what you provide. As with mobile-friendly content, which has been considered important for years and has recently been given even more weight in search algorithms, you may decide that search optimisation is important even if you don't actually care about accessibility for its own sake.

I would expect a mid-level education provider to realise pretty fast that if they want to compete - and many do - with their larger peers in a particular field, the "edge" of extra discoverability that enhancing accessibility provides is valuable. Instead of saying "oh, we can't make this universally accessible" and giving up, they have a motivation to enhance whichever aspects of accessibility they can.

>>>  What does your description add that simply stating that this resource or
>>>  document "has captions," "has descriptions", and that the language "is
>>>  english" doesn’t achieve? In fact your description is much more complex and
>>>  also confusing because you would not need to be able to understand english
>>>  language text if you can hear, you could listen to the speech audio and
>>>  understand spoken english. (I have left your message below my reply in its
>>>  entirety for easier reference.)
>>
>>  Léonie wrote:
>>  If a resource had no specific accessibility feature, yet was nonetheless accessible to someone, how would it be classified? As Chaals notes, a search for a resource with text descriptions for images would exclude any resource that consisted only of text, yet such resources would likely be accessible to a person requiring text descriptions for images.
>
> Jutta’s response:
> That is already covered by identifying that it is text. You do not need any further identification unless there are images as well.

Sure. The devils are of course in the details. Is an Asterix comic readable as text? Are the pictures critical, or just enhancements? If a document has an animated image, without any text equivalent, does that actually mean you can't understand the document if you can't see the image? A story, made for the web, that incorporates sound effects and pictures may be nonsensical without those effects - or may make perfect sense if you can EITHER read the text, OR see the pictures, with the sound effects being more or less an irrelevant gimmick. (My mental example is actually a recent very successful book for kids, but people have been making multimedia things directly for a few decades now, and while schema.org is widely used to find physical objects it is ideal for finding things that are actually online…)

Asterix generally *can* be read as text, but is funnier with descriptions of some of the images. On the other hand, having to wade through a description of every frame would not be an improvement for most readers who need descriptions of images.

So the concern is how to develop a vocabulary that deals with these situations, and can be extended in granularity.
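
To make the "this AND that, OR this AND the other" shape of the video example concrete, here is a minimal sketch in Python. It is only an illustration: the capability names and the can_use() helper are made up for this message and are not schema.org terms, and the text-only entry assumes that such a document needs nothing beyond understanding the text.

```python
# Illustrative sketch only: the capability names and can_use() are
# hypothetical, not schema.org vocabulary.

def can_use(user_capabilities, sufficient_combinations):
    """Return True if the user has every capability in at least one combination."""
    return any(combo <= user_capabilities for combo in sufficient_combinations)

# The video example from the thread:
# (understand English text AND hear) OR (understand English text AND see).
video = [
    frozenset({"understand-english-text", "hear"}),
    frozenset({"understand-english-text", "see"}),
]

# Assumed for illustration: a text-only document needs nothing beyond
# understanding the text itself.
text_only = [frozenset({"understand-english-text"})]

reader_who_cannot_see = {"understand-english-text", "hear"}
reader_who_cannot_hear = {"understand-english-text", "see"}
```

With these definitions, can_use(reader_who_cannot_see, video) and can_use(reader_who_cannot_hear, video) both hold, and the text-only document matches either reader even though it carries no specific accessibility feature. Finer granularity would mean adding or splitting capability names, not changing the matching rule.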

Another concern is that when we provide vocabulary to describe the accessibility of libraries, restaurants, public toilets, train stations, and so on, it would be helpful not to end up with a completely different model.

[... stuff about how ISO works is irrelevant to schema.org - we have an open process, and this thread occurred because we can ask for input wherever we think we will find it]

cheers

Chaals

--
Charles McCathie Nevile - web standards - CTO Office, Yandex
chaals@yandex-team.ru - - - Find more at http://yandex.com
Received on Sunday, 21 June 2015 23:23:33 UTC