- From: Charles McCathieNevile <charles@w3.org>
- Date: Mon, 24 Nov 2003 06:27:41 -0500 (EST)
- To: WAI GL <w3c-wai-gl@w3.org>
During the face to face meeting I said I would describe some approaches to clarifying pronunciation of content. I suggested three possibilities, and here try to explain them a little more. These are not complete examples, and each one requires some work to be done before it might be useful.

1. Use Ruby

As Nakane-san pointed out at the meeting, you cannot simply use the Ruby element to specify pronunciation. But it can be used to help, if you provide appropriate presentation clues. For example:

  ...Steve <ruby class="sayAs"><rb>Waugh</rb><rt>War</rt></ruby> ...

combined with a stylesheet for audio presentation - for example

  @media aural { ruby.sayAs rb { speak: none } }
  @media screen { ruby.sayAs rt { display: none } }

and an alternative stylesheet for people who are using screen readers that only understand visual styling

  ruby.sayAs rb { display:none }

This will work moderately well in modern browsers. There are few implementations of Aural CSS (not none, as commonly believed, and a recently-released commercial product is one of them). Common screen readers do not implement Aural CSS, and attempt to present content according to the visual presentation (where pronunciation isn't usually important). So the user of such a screen reader will need to know that they should switch to the alternate stylesheet.

There are issues to iron out with spelling - where swapping stylesheets again is important...

2. Use SSML

The Speech Synthesis Markup Language is a W3C Specification designed for Voice applications. It includes markup explicitly for pronunciation. Using mixed-namespace XML, this markup could be included in HTML. The XHTML+Voice member submission at http://www.w3.org/TR/xhtml+voice/ shows one approach to doing this (also including a lot of other elements).

(The Staff Comment on the submission -- http://www-3.ibm.com/software/pervasive/multimodal/x%2Bv/11/spec.htm -- notes that there were problems in using the original specification in a royalty-free manner. An updated version is available linked from that comment, which also suggests the use of Aural CSS).

3. Annotea

The Annotea work allows user-defined, machine-readable annotations to be made on any part of a document. Because it uses XPointer, it can annotate a word, or even a part of one, as well as a paragraph. Annotations are potentially unstable across editing, so this approach should be used only after considering the implications.

4. As an extra thought, linking a glossary that contains pronunciation to a particular document is something that would follow naturally from linking one for clarity of words. One would expect it to build on SSML, and be re-usable with annotations...

So those are some ideas. Is it worth following them up?

cheers

Chaals

Charles McCathieNevile  http://www.w3.org/People/Charles  tel: +61 409 134 136
SWAD-E http://www.w3.org/2001/sw/Europe  fax(france): +33 4 92 38 78 22
Post: 21 Mitchell street, FOOTSCRAY Vic 3011, Australia or
W3C, 2004 Route des Lucioles, 06902 Sophia Antipolis Cedex, France
Received on Monday, 24 November 2003 12:12:33 UTC