Re: Marking up links to alternative versions of content (was: Re: conflation of issues or convergence of interests?) from Sander Tekelenburg on 2007-08-01 (public-html@w3.org from August 2007)

From: Sander Tekelenburg <st@isoc.nl>
Date: Wed, 1 Aug 2007 23:09:00 +0200
To: public-html@w3.org
Message-Id: <p06240643c2d6870fc339@[192.168.0.101]>

At 10:30 +0300 UTC, on 2007-07-31, Henri Sivonen wrote:

> On Jul 31, 2007, at 06:07, Sander Tekelenburg wrote:

[... <http://lachy.id.au/dev/presentation/future-of-html/>]

>> {frown} The point of the example was that the audio and text are
>> equivalents.
>> If there'd be a need to explain what one contains that another does
>> not, then
>> they are not equivalents.
>
> The premises of my thinking aloud were:
>   * Different versions of a work are rarely *truly* equivalent unless
> the difference is only about video or audio codec.
>   * Prima facie I don't trust a simple declaration on the markup
> level can provide me with the kind of information I need to choose
> from different versions that I am able to consume at least partially.

Reality dictates that not everybody in every situation *can* consume all
equivalents. I'm trying to get us to deal with that reality.

For users in a situation in which they *can* consume all equivalents, UAs
should indeed allow them access to all. For example because they might not
trust the author's claim.

That aside, I don't see how "letting authors suggest a relationship in the
prose" could possibly give you any more certainty that what is claimed to be
an alternative in fact is one.

>   * Prima facie, I trust software performing the selection for me
> based on said labeling even less.

Do you not trust your anti-spam filter? Do you not trust your browser to
block pop-ups?

> Overall, I'm very suspicious of the whole concept of truly
> "equivalent" content appearing in practice in such a pure form that
> removing the human choice of the user were a good idea

Exactly. That's why it is being argued that UAs should provide access to all
equivalents.

[...]

> My distrust of the markup labeling and automatic choice comes from
> translations. I become annoyed when a site offers me a translation
> automatically when I know (or can guess) that the original is written
> in a language that I can also read, because from experience I expect
> to be better off reading the original if it is in a language that I
> can read.

Exactly. So it should be possible for you to [1] request a specific default
language and [2] override that when you want another language.

But making people make that choice manually every time is hardly useful when
they cannot read all languages.

> When I've offered alternative file formats
> myself, I have found that explaining the purpose of each alternative
> briefly in natural language is the only way that allows the user at
> the other end make an informed choice instead of having software pick
> potentially an inappropriate version at semi-random.

HTML should allow authors to provide such prose when they feel it is needed.
But as I pointed out before, very often such prose will distract from the
actual content. So we should not *require* authors to provide such prose --
it would only discourage them from providing equivalents.

> Now, all the above concerns me making choices about alternatives that
> I could consume *to some degree*. In the accessibility case, the
> usual premise is that the user absolutely cannot consume one version
> *at all* and, therefore, needs an "equivalent" alternative. However,
> it isn't as clear-cut as this. I think it was Gregory who pointed out
> that sometimes it is useful to know what was there even if part of it
> is something one cannot consume.

I think we need a more elaborate explanation of that need, because I have the
feeling that that need is a perceived one, based on the limitations of
current screen readers.

I don't see exactly what of an image a blind user would need, when the
textual alternative is a good equivalent. Possibly it is useful to know that
the text is the equivalent for an image (although that would seem distracting
to me). But UAs can indicate that, provided equivalents are marked up
explicitly as equivalents.

> Moreover, there have been
> suggestions that in some cases pieces of content designed to be
> mutually exclusive alternatives should be presented side-by-side
> nonetheless.

Yeah, it can for instance be useful to consume audio and its transcript
simultaneously. For that particular case, the user could load the audio file
into an external player. Or even have it played by the browser itself, but in
a background window. I do that myself sometimes: load the audio into a
background window and read the transcript along with it.

We'd need to index other specific cases of needing two or more equivalents to
be consumable simultaneously to decide what that would take. I can imagine
that some cases need nothing extra, while others just might not be solvable
within HTML. I really think that HTML should make it possible for authors to
provide universality first, and as much accessibility as possible second.
This particular "side by side" issue might be too much to be solved by HTML.
(But let's review the use cases before we decide on that.)

Another approach might be to completely change things. We could add an equiv
boolean and allow @for on section, article, and maybe more: to indicate that
that section is the equivalent for an <img>, <object>, <video>, <audio>,
whatever.

<video id="blah"></video>
<figure equiv for="blah">aural equivalent</figure>
<section equiv for="blah">textual equivalent</section>

pros:
- allows providing equivalents to be presented alongside each other
- allows equivalents to be visible to all users
- allows equivalents to be identified as such (and yet still easily allow
distrusting users to double-check ;))
- no 'invisible meta data'
- equivalents can be placed 'anywhere' within the document and still clearly
indicate their relation

cons:
- requires authors to ensure correct ids (but HTML5 allowing any character
helps)
- needs more to allow for defaulting to one type of equivalent and not
presenting others

That last one would probably require another attribute, to mark the type of
the equivalent:

<video id="blah"></video>
<figure equiv for="blah" type="audio">aural equivalent</figure>
<section equiv for="blah" type="text">textual equivalent</section>

But @type is for MIME types, so not really an option. (I don't see how you'd
indicate a captioned video through a MIME type.)

Without being able to indicate an equivalent's type, all equivalents would
always have to be presented to all users. If the type could be defined,
authors could set all but what they consider the 'main' equivalent to
display:none. UAs would then need to be configurable to override that.
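
To illustrate what such UA-side configuration could look like, here is a
minimal sketch in Python. It assumes the hypothetical `equiv`, `for` and
`type` attributes from the example above (none of which exist in any spec),
and the names `EquivCollector` and `choose` are invented for this sketch. It
collects the equivalents marked up for a given id and presents the ones
matching the user's preferred type, falling back to all of them when none
matches.

```python
from html.parser import HTMLParser


class EquivCollector(HTMLParser):
    """Collect elements carrying the hypothetical 'equiv' boolean attribute."""

    def __init__(self):
        super().__init__()
        # Maps the id named in 'for' to a list of (tag, type) pairs.
        self.equivalents = {}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)  # a boolean attribute shows up with value None
        if "equiv" in attr_map and attr_map.get("for"):
            entry = (tag, attr_map.get("type"))
            self.equivalents.setdefault(attr_map["for"], []).append(entry)


def choose(equivalents, preferred_type):
    """Return equivalents of the preferred type, or all of them if none match."""
    matches = [e for e in equivalents if e[1] == preferred_type]
    return matches or list(equivalents)


# Markup taken from the example above; 'type' is the hypothetical attribute.
MARKUP = """
<video id="blah"></video>
<figure equiv for="blah" type="audio">aural equivalent</figure>
<section equiv for="blah" type="text">textual equivalent</section>
"""

collector = EquivCollector()
collector.feed(MARKUP)
collector.close()
found = collector.equivalents["blah"]

print(choose(found, "text"))     # only the textual equivalent
print(choose(found, "braille"))  # no match, so every equivalent is presented
```

The fallback is the point: when the UA cannot honour the preference, the user
still gets access to every equivalent rather than to none.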

All this could be considered backwards compatible, in that all pre-HTML5 UAs
would present all equivalents. Not sure if it could be authored such that the
relation between equivalents would be detectable by pre-HTML5 UAs.

[...]

>> [1] No program could make use of that.
>> - Not an indexing bot.
>
> Making sense of content without explicit associations is the core
> competence of successful search companies.

Well, let's keep things in perspective. Even Google tells authors that
well-structured semantic mark-up will give them better scores, which authors
can see happen in practice.

Of course there are ways to make some sense out of non-structure, but it'll
always be easier to make sense out of structure. The result will be better,
the work will be easier, the cost lower, etc.

And again, this sounds similar to the argument to look at existing code
without looking at the reason for that code. A lack of structure is no doubt
the only reason that search engines bother to try to make sense of that. (It
might be a reason for a search engine to argue against better structure, but
only when their 'making sense' technology is making the money.)

>> - Not a tool that helps authors judge the universality/
>> accessibility of their
>> document.
>
> I'm a developer of a checking tool (not specifically for
> accessibility) myself, but I think that features should be designed
> to suit the communication of authors and users first and machine-
> checkability should come second instead of being a design constraint.

Agreed. Authoring should not be made harder just to aid tools that help
authoring :) But such tools *are* to be considered. When we can choose
between two solutions that are equally hard/easy for authors, the most
machine-checkable one should be favoured.
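
As a rough sketch of what machine-checkability could buy, assuming the
hypothetical `equiv` and `for` attributes floated earlier in this message
(the names `EquivChecker` and `dangling_equivalents` are invented here), a
checking tool could flag equivalents whose `for` does not point at any id in
the document:

```python
from html.parser import HTMLParser


class EquivChecker(HTMLParser):
    """Record every id and every element marked with the hypothetical 'equiv'."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.targets = []  # (tag, value of 'for'), in document order

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if attr_map.get("id"):
            self.ids.add(attr_map["id"])
        if "equiv" in attr_map:
            self.targets.append((tag, attr_map.get("for")))


def dangling_equivalents(markup):
    """Return (tag, target) pairs whose target id is missing from the markup."""
    checker = EquivChecker()
    checker.feed(markup)
    checker.close()
    # Compare only after the whole document is parsed, so that an equivalent
    # may legitimately appear before the element it refers to.
    return [(tag, target) for tag, target in checker.targets
            if target not in checker.ids]


SAMPLE = """
<section equiv for="blah">textual equivalent</section>
<video id="blah"></video>
<section equiv for="bla">mistyped reference</section>
"""

print(dangling_equivalents(SAMPLE))
```

With prose and plain links alone there is nothing comparable for a tool to
check.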

[...]

>> - Not an authoring tool that needs to help the author to not mess
>> up what a
>> previous author carefully added to try to help certain accessibility
>> situations.
>
> One of the advantages of prose and plain links is that a
> collaborating author can perceive them and not mess them up the way
> "invisible metadata" can easily be messed up.

It requires each author to be fully aware of the purpose of individual bits
of prose. That's yet again a situation in which explicit mark-up makes that
easier than requiring one author to deduce such meaning from another author's
prose.

[...]

> as a user I don't trust that they are truly equivalent if
> there's more than one thing, so I want to know a word or two about
> what and why.

Why would you have reason to trust that word or two?

[...]

> My point is that a user cannot trust that they indeed are equivalent.
> It is quite reasonable to expect that one version is primary and
> other versions are out-of-date or otherwise second-rate.

That problem applies to all mark-up. You can't be sure what a link points to
until you have followed it. You can't be sure that a video is about what the
author claims, until you've consumed it. It even applies to all content. An
introductory paragraph might claim that the article is about improving users'
Web experience, yet the article may turn out to be an argument for Flash-only
sites. Prose may say "You can also consume the transcript of this video" and
while consuming it alongside the video, you may find that it is an
incomplete transcript.


-- 
Sander Tekelenburg
The Web Repair Initiative: <http://webrepair.org/>
Received on Wednesday, 1 August 2007 21:09:30 UTC