- From: John Foliot <jfoliot@stanford.edu>
- Date: Wed, 22 Jun 2011 18:32:00 -0700 (PDT)
- To: "'HTML Accessibility Task Force'" <public-html-a11y@w3.org>
- Cc: "'Kelly Ford'" <Kelly.Ford@microsoft.com>, "'Jim Allan'" <jimallan@tsbvi.edu>, <jeanne@w3.org>, 'Léonie Watson' <lwatson@nomensa.com>, "'Richard Schwerdtfeger'" <schwer@us.ibm.com>
Thanks to Kelly Ford, Jim Allan, Jeanne Spellman, Léonie Watson and Rich Schwerdtfeger for joining the media sub-team call this week.

I'd like to lay this out from a user-requirements perspective first, to be sure that what I think we need from a user perspective is clearly understood and agreed to (and I am prepared to hear that I am wrong). Then I'll examine the history up to where we are today (I think) - all, of course, from my perspective.

The scenario: we have a web page that contains, visually, a bounding box that represents where the video will play. Above that box we have a title - Gone With The Wind - and below the box we have a paragraph of text: "American classic in which a manipulative woman and a roguish man carry on a turbulent love affair in the American south during the Civil War and Reconstruction."

In code, we have the following:

  <h1>Gone With The Wind</h1>
  <video src="movie.mp4"></video>
  <p>American classic in which a manipulative woman and a roguish man carry on a turbulent love affair in the American south during the Civil War and Reconstruction.</p>

(Yes, we should also have <track src="caption file">, but assume it's there.)

Semantically speaking, we have a short name and a longer description, both of which address the movie itself. The implied semantics exist, but for clarity the specific semantics are further defined (perhaps because there is more than one movie on the page):

  <h1 id="movieTitle">Gone With The Wind</h1>
  <video src="movie.mp4" aria-labelledby="movieTitle" aria-describedby="description"></video>
  <p id="description">American classic in which a manipulative woman and a roguish man carry on a turbulent love affair in the American south during the Civil War and Reconstruction.</p>

So far, so good. However, inside of the bounding box there is a static image.
Let's not get bogged down on the source of that imagery (either via @poster, or the first frame of the video, or whatever), but let's agree that the image is the original movie poster from Gone With The Wind, as seen here:

http://ia.media-imdb.com/images/M/MV5BMjE1MTk0MTE5NF5BMl5BanBnXkFtZTYwMTUxNzg4._V1._SY317_CR2,0,214,317_.jpg

(Note: the graphic image could just as easily be an image of the MGM Lion, or a Green Screen Parental rating guide, or an advert for toothpaste. The choice of the word "poster" has introduced some misunderstandings that need to be acknowledged as well.)

For the non-sighted users reading this, a longer textual description of the imagery would be:

"Clark Gable embraces Vivien Leigh, staring into her eyes romantically. In the background is an ominous fire-red sunset and the silhouette of trees and a couple arm-in-arm in the distance. The poster reads David O. Selznick's adaptation of Margaret Mitchell's Gone with the Wind. Winner of 10 Academy Awards."

The semantic question becomes: is this a description of the movie, or of the movie poster?

If we can agree that it describes the movie poster, and that a means of linking that descriptive text to the multi-media asset is important, then *HOW* do we do it? And importantly (as in the case of the whole @longdesc debate), how do we do it when we know that, from a visual/design perspective, most designers will likely not want that rich textual description visible on screen?

I believe that here we have introduced some new and different semantic information. It is clearly related to the multi-media experience, yet it's not *really* the movie; it's the precursor to the movie. But it is also a rich visual experience, further complicated by the fact that there is text embedded in that image.
*********

Originally, I had proposed that we deal with the uniqueness of this not-movie visual expression - the "poster" - by introducing a child element of video, like this:

  <h1 id="movieTitle">Gone With The Wind</h1>
  <video src="movie.mp4" aria-labelledby="movieTitle" aria-describedby="description">
    <poster alt="David O. Selznick's adaptation of Margaret Mitchell's Gone with the Wind. Winner of 10 Academy Awards." longdesc="file-with-the-rest-of-the-description.html">
  </video>
  <p id="description">American classic in which a manipulative woman and a roguish man carry on a turbulent love affair in the American south during the Civil War and Reconstruction.</p>

Note that in the example above, the alt text is *more* than the title in the <h1>, and there is no SRC attribute, because the imagery would be derived from the first frame of "movie.mp4". Should the imagery be an actual discrete JPG, it could then be referenced by SRC, like this:

  <poster alt="David O. Selznick's adaptation of Margaret Mitchell's Gone with the Wind. Winner of 10 Academy Awards" longdesc="file-with-the-rest-of-the-description.html" src="poster.jpg">

(Note: this proposal would have made @poster as an attribute of <video> obsolete, as the JPG file would no longer be specified by an attribute of the <video> element but by an attribute of the child element <poster> - or, in my proposal, <firstframe>.)

Here, the not-movie visual imagery has a short name (provided by @alt) and a means of associating a longer description (using @longdesc).

This proposal was rejected by the Working Group chairs (Issue 142) as they claimed that... well, I'm not really sure what their claim was, but it suggested that I was proposing a broken element (presumably because @src could sometimes be omitted, which is perfectly valid - at least that was my reading of the decision - http://lists.w3.org/Archives/Public/public-html/2011Mar/0690.html). Oh, that and their failure to read that I was not proposing actual spec text per se (I even specifically asked for assistance), which is the grounds for my current, active Formal Objection on Issue 142. (I have indicated, however, that should we solve the *problem* I would remove the FO, as results are more important to me than religion.)

*********

As we returned to this issue, Silvia (and I) re-examined the requirements and cooked up a different approach. It leverages ARIA a little more than my initial suggestion did, but on paper it looked like it could still solve the larger requirement set. Using the same example, but rewritten in this new approach, we would have the following:

  <h1 id="movieTitle">Gone With The Wind</h1>
  <video src="movie.mp4" aria-labelledby="movieTitle" aria-describedby="description poster">
    <p id="poster">David O. Selznick's adaptation of Margaret Mitchell's Gone with the Wind. Winner of 10 Academy Awards. A full description of the poster is <a href="file-with-the-rest-of-the-description.html">also available</a>.</p>
  </video>
  <p id="description">American classic in which a manipulative woman and a roguish man carry on a turbulent love affair in the American south during the Civil War and Reconstruction.</p>

With this, we have again captured what I believe to be all of the discrete semantics, and while I have some questions about user experience, I was generally satisfied that for a 'professional' authoring of this by a developer, all of the tools the author needed were there.

*********

The questions/concerns I had focus on a few specific behaviors: what, if anything, can we do, what should we do, and *who* should be doing what?
They are:

1) My concern about the concatenation of the two descriptions into a flat reading. Is this a problem? When a screen reader focuses on the <video> element, my understanding today is that what would be read aloud would be:

"David O. Selznick's adaptation of Margaret Mitchell's Gone with the Wind. Winner of 10 Academy Awards. A full description of the poster is also available. American classic in which a manipulative woman and a roguish man carry on a turbulent love affair in the American south during the Civil War and Reconstruction."

(For example, is the 'pausing' caused by the period after the words "Wind" and "available" preserved, or will the speech synthesizer just plow on through as one run-on sentence? Do we need a 'longer' pause between the description of the movie and the description of the poster? If yes, how do we do this?)

2) I have a concern that HTML-rich text being passed to the Accessibility API is apparently being "flattened" - i.e. none of the HTML richness is preserved. This would kill off the link being provided by: "A full description of the poster is also available." (This has surfaced in the @longdesc discussion as well: apparently Firefox is preserving the richness - needs to be tested/confirmed - but the other browsers are not. This might be a deal-breaker here.)

3) Order of reading: I presume that the aria-describedby texts are read/rendered in the order they are authored. In other words, if I reversed the order of the attribute values (...aria-describedby="poster description">...) then what is passed forward would be:

"American classic in which a manipulative woman and a roguish man carry on a turbulent love affair in the American south during the Civil War and Reconstruction. David O. Selznick's adaptation of Margaret Mitchell's Gone with the Wind. Winner of 10 Academy Awards. A full description of the poster is also available."

Is this a problem (I can see where it might be sometimes)? Is this addressed exclusively as authoring guidance, or is there a way we can specify rendering order regardless of authoring order? Is this worth worrying about?

4) We have 2 paragraphs of textual description, describing 2 discrete things. Yet which paragraph is describing which thing? In the examples I have used IDs of "description" and "poster" for clarity, but we already know that IDs are machine readable but carry no semantics - I could have just as easily used the IDs of "this" and "that" - they would have worked as association "hooks", but no semantics are being passed along. My thought is that we could either investigate introducing new ARIA roles (but I hear concerns of feature creep), or look to use aria-label, like this:

  <p id="that" aria-label="poster description">David O. Selznick's adaptation of Margaret Mitchell's Gone with the Wind. Winner of 10 Academy Awards. A full description of the poster is <a href="file-with-the-rest-of-the-description.html">also available</a>.</p>

Again, is this richness preserved or flattened? As an author, would writing this make any difference:

  <div id="that"><p aria-label="poster description">David O. Selznick's adaptation of Margaret Mitchell's Gone with the Wind. Winner of 10 Academy Awards. A full description of the poster is <a href="file-with-the-rest-of-the-description.html">also available</a>.</p></div>

...where the <div>'s ID provides the association, but the <p> and its aria-label are semantically preserved? Do we need this? Do we have this?

It has been discussed that some of these issues are browser-implementation issues, and that bugs need to be filed at that level (Eric reconfirmed this point on last week's call). However, to do that, it seems that the ARIA CR is not specific enough (sorry Rich/PF), so is there something we can do to address this problem?

Does either of these proposals appear to be superior to the other, or is it ToMayto versus ToMato? Is there another way forward?
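To make questions 1 to 3 a little more concrete, here is a rough Python sketch of what a "flat" reading amounts to. It is only an illustration, not how any browser or screen reader actually computes a description: it assumes the texts are joined in the order the IDs are listed in aria-describedby (the presumption in question 3) and that all markup is dropped (the concern in question 2). The names TextById and flat_description, the shortened placeholder texts, and the file name poster.html are all made up for the example.

```python
from html.parser import HTMLParser

# Elements with no end tag; they never contain text, so they are not tracked.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TextById(HTMLParser):
    """Collect the text content of every element that carries an id."""

    def __init__(self):
        super().__init__()
        self.open = []   # stack of (tag, id or None) for open elements
        self.text = {}   # id -> list of text fragments found inside it

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        elem_id = dict(attrs).get("id")
        if elem_id is not None:
            self.text.setdefault(elem_id, [])
        self.open.append((tag, elem_id))

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag name.
        for i in range(len(self.open) - 1, -1, -1):
            if self.open[i][0] == tag:
                del self.open[i:]
                break

    def handle_data(self, data):
        # Text belongs to every open ancestor that has an id.
        for _, elem_id in self.open:
            if elem_id is not None:
                self.text[elem_id].append(data)


def flat_description(markup: str, describedby: str) -> str:
    """Join the plain text of each referenced id, in attribute order."""
    parser = TextById()
    parser.feed(markup)
    parser.close()
    parts = []
    for ref in describedby.split():
        if ref in parser.text:
            # Collapse whitespace; links and other markup are already gone.
            parts.append(" ".join("".join(parser.text[ref]).split()))
    return " ".join(parts)


# Placeholder texts stand in for the two real descriptions.
MARKUP = """
<h1 id="movieTitle">Gone With The Wind</h1>
<video src="movie.mp4" aria-labelledby="movieTitle"
       aria-describedby="description poster">
  <p id="poster">Poster text. A full description is
     <a href="poster.html">also available</a>.</p>
</video>
<p id="description">Movie text.</p>
"""

if __name__ == "__main__":
    print(flat_description(MARKUP, "description poster"))
    print(flat_description(MARKUP, "poster description"))
```

In this sketch the link text survives but the link itself does not, and nothing in the joined string marks where one description ends and the next begins. Whether real implementations behave this way is exactly what needs testing.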
Friends, I truly am agnostic on *how* we solve this problem. While I continue to think that my initial proposal of introducing a new child of <video> could work, I am also convinced that if we can work out the wrinkles of this second proposal, it too would address the needs and requirements.

And so, thoughts?

JF
Received on Thursday, 23 June 2011 01:32:40 UTC