- From: Charles McCathieNevile <charles@w3.org>
- Date: Fri, 14 Mar 2003 10:59:17 -0500 (EST)
- To: Mark Davis <mark.davis@jtcsv.com>
- cc: Roberto Scano - IWA/HWG <rscano@iwa-italy.org>, <ishida@w3.org>, <w3c-i18n-ig@w3.org>, <public-i18n-geo@w3.org>, Al Gilman <asgilman@iamdigex.net>, <w3c-wai-gl@w3.org>
I don't think that the 'neat text-to-speech-for-blind-folks' is the major motivation here. One approach that has been discussed is being able to annotate text with a symbolic representation. Languages based on images are used by a number of people who are restricted to communicating through a board that lets them select images. Likewise, voice applications are being generated which allow a limited range of choices to accomplish a task.

And people are interested not just in what 'Piazza San Marco' means when the words are translated into English, but what it means in a more general sense. Microsoft introduced its 'smart tags' to howls of protest, not about the functionality but about the limited range of destinations the links pointed to. Having a framework that allows anybody to offer a link, and the user to choose which offers they are interested in, may reduce the unease some people have about what is in some ways a valuable piece of functionality.

I once had a job electronically archiving stories on behalf of an aboriginal community. They had a complex set of use cases, including needing to be able to look up any word in a dictionary and find out what it was in another language (the originals were in a group of 31 cognate languages), and whether a particular word might be taboo (this applied most often to names) under certain circumstances. Without being able to satisfy these use cases they were not prepared to make the stories available to the wider community, even at the risk of seeing them lost forever.

Voice services pitch themselves on the basis of (among other things) speech quality. With a name like mine I would dearly love to be able to annotate it with a good pronunciation algorithm. As would the fictional Hyacinth Bucket (it's pronounced like 'bouquet' in French; 'boo-kay' in north-eastern American English is close), although my friend Kevin Bucket wants to have the right pronunciation for his name too (it's pronounced like 'bucket').
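[As a sketch of what such an annotation could look like, here is a fragment using the phoneme element from the W3C Speech Synthesis Markup Language (SSML), currently a working draft. The IPA strings are my own rough approximations, not authoritative transcriptions:]

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-GB">
  <!-- Hyacinth insists on the French-style reading of her surname... -->
  Mrs <phoneme alphabet="ipa" ph="buːˈkeɪ">Bucket</phoneme> is at the door,
  <!-- ...while Kevin keeps the ordinary English pronunciation. -->
  and Kevin <phoneme alphabet="ipa" ph="ˈbʌkɪt">Bucket</phoneme> is with her.
</speak>
```

[The point being that the visible text stays the same in both cases; only the annotation differs, and a synthesiser that doesn't understand the phoneme element can still fall back on the contained text.]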
Set up one VoiceXML document and let two service providers offer access to it, enhancing it as they see fit... I believe there are a multitude of scenarios.

I think you are correct that having good element structure helps in most cases. I think this is because in any case the perceived cost of implementing the solution has to match the collective benefits...

cheers

Chaals

On Fri, 14 Mar 2003, Mark Davis wrote:
>
> I wonder how realistic these scenarios are. The principal motivation for a
> fine-grained language tagging of individual words or phrases appears to be
> for text-to-speech, primarily for the blind. But the goal for
> text-to-speech, except in very rare cases, will be to read the customary,
> most-well-understood pronunciation of the phrase in the end-user's
> language. Rarely will that precisely match the exact pronunciation of the
> word in the foreign language.
>
> [That being said, I have always found the attributes that end up being
> displayed to the user, such as alt and title, very confining. It would be
> better to have elements that correspond to them, so that the text display
> can be richer, such as italicising or bolding a word within them.]
>
> Mark
Received on Friday, 14 March 2003 10:59:23 UTC