Schema.org - identifying accessible documents from chaals@yandex-team.ru on 2015-06-17 (w3c-wai-ig@w3.org from April to June 2015)

From: <chaals@yandex-team.ru>
Date: Wed, 17 Jun 2015 23:53:51 +0200
To: WAI Interest Group <w3c-wai-ig@w3.org>
Message-Id: <36321434578031@webcorp02d.yandex-team.ru>

Hi folks,

TL;DR: I am looking for opinions on how to identify resources so it is easier for a given person to find content accessible to them.

Details…

You may know http://schema.org - it is now used on about 1/3 of the web that search engines index. Its purpose is to provide information that lets search engines better understand what is in a page or site. Yes, we are implementing proposals from the mid-1990s.

One of the things we have started at schema.org is adding information to content so people can find stuff that is accessible to *them*. This is a complementary approach to getting people to do a better job of producing content that is more generally accessible, not a replacement. But the point still stands that a game totally unusable for a blind person may be fine for a given person with autism spectrum issues, and a site that cannot be used by a deaf person may nevertheless be quite useful for someone with a certain level of visual disability.

Right now in schema.org we have a few ways to talk about some characteristics of web resources. http://schema.org/accessibilityHazard and http://schema.org/accessibilityControl are fairly straightforward, although I will work on better documentation for them, and probably explain them further.

accessibilityHazard lets a web page state whether it has an accessibility hazard, such as flashing - so people who are sensitive to flashing content, such as those with photosensitive epilepsy, some people with autism spectrum issues, and some people who just find it hard to concentrate, can avoid that content.

accessibilityControl is to make it clear that a site can be used e.g. with keyboard-only, or with voice control systems.

We also have http://schema.org/accessibilityFeature which can be used to note that a page has some feature useful for improving accessibility, such as captions for audio or video, descriptions for images, simple text, etc. There are also features of schema.org that can be used to describe the language of a resource, relationships between various parts, and so on.
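
As a rough illustration of the three properties above, here is a minimal Python sketch that builds a JSON-LD description of a video page. The page name and URL are made up, and the property values ("captions", "audioDescription", "fullKeyboardControl", "noFlashingHazard") are ones I believe are in the documented vocabulary; check the property pages before relying on them. The has_hazard helper is hypothetical, just to show how a consumer might filter.

```python
import json

# Hypothetical description of a video page; the name and URL are invented.
video = {
    "@context": "http://schema.org",
    "@type": "VideoObject",
    "name": "Example lecture",
    "url": "http://example.org/lecture",
    "inLanguage": "en",
    # Features that improve accessibility of the content.
    "accessibilityFeature": ["captions", "audioDescription"],
    # Ways the player can be fully operated.
    "accessibilityControl": ["fullKeyboardControl"],
    # Explicit statement that there is no flashing content.
    "accessibilityHazard": ["noFlashingHazard"],
}


def has_hazard(resource, hazard):
    """Return True if the resource declares the given hazard (hypothetical helper)."""
    return hazard in resource.get("accessibilityHazard", [])


# Serialise to the JSON-LD text that would be embedded in the page.
json_ld = json.dumps(video, indent=2)
print(json_ld)
```

A search system could use something like has_hazard(video, "flashing") to leave this page out of results for someone who asked to avoid flashing content.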

The problem is we don't have a good mechanism for describing how you can interact with a resource. Asking for content where the images have descriptions is fine, but irrelevant where there are no images in the first place. And asking developers to explicitly state all the cases that are irrelevant to their content strikes me as unscalable (and not very bright in the first place).

My initial thinking is that we should describe "accessModes" for content, such as "you need to be able to understand English-language text and to hear, OR to be able to understand English-language text and see, in order to effectively use this site". (The underlying use case is a video which has both audio descriptions and captions available as options in the player, but making these things up is easy and there are lots of variations.)
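
To make the "A and B, OR A and C" shape concrete, here is a minimal Python sketch. It is only an assumption about how such statements could be modelled, not anything schema.org defines: the labels "text:en", "auditory" and "visual" and the function name is_usable are all invented for illustration.

```python
# Each inner set is one sufficient combination of abilities (AND);
# the outer list offers alternatives (OR). Labels are hypothetical.
video_access_modes = [
    frozenset({"text:en", "auditory"}),
    frozenset({"text:en", "visual"}),
]


def is_usable(access_modes, capabilities):
    """True if the person's capabilities cover at least one combination."""
    return any(required <= capabilities for required in access_modes)


# A person who reads English and can see, but does not hear.
reader = {"text:en", "visual"}
print(is_usable(video_access_modes, reader))
```

The point of the structure is that a resource only lists the combinations that are sufficient to use it, so nobody has to state the cases that are irrelevant to their content.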

This would enable us to later refine statements, such as "you need to understand written English to a senior high school level and be able to read 24pt yellow text with black outlines on a changing background at about 80 words per minute, OR you need to understand spoken English at a junior high school level and be able to discern it against a low level of background noise" - but would still make it possible for people to meaningfully work with the statements at the first level of detail, and not fail because most developers didn't provide sufficient detail about their content. At the same time, there is an incentive for being more accurate.
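
One way to picture refinement that still works at the first level of detail is the following Python sketch. Everything in it is an assumption for illustration: the mode labels, the detail names "reading_grade" and "listening_grade", and the rule that a detail is only compared when the person has stated a value for it.

```python
def meets(requirement, abilities):
    """Check one requirement against a person's stated abilities.

    requirement: {"mode": str, plus optional numeric minimums}
    abilities:   {mode: {detail: level}}
    A detail is only compared when the person stated it; otherwise
    the match falls back to the first level (the mode alone).
    """
    mode = requirement["mode"]
    if mode not in abilities:
        return False
    stated = abilities[mode]
    for detail, needed in requirement.items():
        if detail == "mode":
            continue
        if detail in stated and stated[detail] < needed:
            return False
    return True


def can_use(alternatives, abilities):
    """alternatives is an OR-list of AND-lists of requirements."""
    return any(all(meets(r, abilities) for r in combo) for combo in alternatives)


# Refined statement, using hypothetical labels and grade levels.
refined = [
    [{"mode": "text:en", "reading_grade": 12}, {"mode": "visual"}],
    [{"mode": "speech:en", "listening_grade": 8}, {"mode": "auditory"}],
]

coarse_person = {"text:en": {}, "visual": {}}
detailed_person = {"text:en": {"reading_grade": 9}, "visual": {}}
print(can_use(refined, coarse_person), can_use(refined, detailed_person))
```

Here the person who gave no detail still gets a first-level match, while the person who stated a lower reading level is filtered out. Note that this treats each detail as a single number, which is exactly the kind of linear scale that will not fit every capability.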

Two arguments have been raised against this approach.

The first is that it is enforcing a "medical model" of disability, rather than allowing people to state their own preferences and needs. As far as I can see this logic is false. The model here allows people to state, in as much or little detail as they want for a given situation, what capabilities they have, and enables search systems to match resources against the particular capabilities or preferences of a particular individual in real time.

The second is that not all capabilities can easily be expressed linearly. This I believe may be true. But as an initial approach it seems that we can describe a lot of things usefully in this way. That will at least enable us to learn a lot, and provide a lot of incentives for a minimal amount of description to be added to resources, which will in turn enable real people to find real resources that meet their real needs, rather than letting perfect be the enemy of good.

I am wondering what I have missed, what people think about these approaches, whether from the perspective of a user, a content producer, or a toolmaker (or all of the above, as some are), and what people would like to know more about regarding this aspect of the work done in schema.org.

We are aware that we are not very advanced yet. While some schema properties are used on millions or tens of millions of domains, the accessibility properties we have are used on tens or hundreds of domains. Improving awareness and improving what we do might help to raise that substantially, in turn making it easier to find material that works for *you*…

so comments are welcome. Please feel free to forward this message. If replies are not going to come back here (at least in summary), feel free to cc me directly. I'll also make some more space on the relevant wiki pages for further discussion of these issues.

cheers

Chaals

--
Charles McCathie Nevile - web standards - CTO Office, Yandex
chaals@yandex-team.ru - - - Find more at http://yandex.com

Received on Wednesday, 17 June 2015 21:54:23 UTC