Use cases Re: Re[2]: FW: acronym in title...

I don't think that 'neat text-to-speech for blind folks' is the major
motivation here.

One approach that has been discussed is being able to annotate text with a
symbolic representation. Languages based on images are used by a number of
people who are restricted to communicating through a board that lets them
select images.

Likewise, voice applications are being built which allow a limited range
of choices to accomplish a task. And people are interested not just in what
'Piazza San Marco' means when the words are translated into English, but what
it means in a more general sense.

Microsoft introduced its 'smart tags' to howls of protest not about the
functionality, but about the limited range of destinations the links pointed
to. Having a framework that allows anybody to offer a link, and the user to
choose which offers they are interested in, may reduce the unease some people
have about what is in some ways a valuable functionality.

I once had a job electronically archiving stories on behalf of an aboriginal
community. They had a complex set of use cases, including needing to be able
to look up any word in a dictionary and find out what it was in another
language (the originals were in a group of 31 cognate languages) and whether
a particular word might be taboo (this applied most often to names) under
certain circumstances. Without being able to satisfy these use cases they
were not prepared to make the stories available to the wider community even
at the risk of seeing them lost forever.

Voice services pitch themselves on the basis of (among other things) speech
quality. With a name like mine I would dearly love to be able to annotate it
with a good pronunciation algorithm. As would the fictional Hyacinth Bucket
(it's pronounced like bouquet in French - boo-kay in north-eastern American
English is close), although my friend Kevin Bucket wants to have the right
pronunciation for his name (it's pronounced like bucket). Set up one VoiceXML
document and let two service providers offer access to it, enhancing it as
they see fit...
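
As a rough illustration of the idea, here is a minimal sketch in Python. It
assumes each provider keeps its own table of pronunciation hints keyed by
full name and overlays it on the same shared text. The function name
pronunciation_hints, the provider tables and the hint spellings are all
hypothetical; this is not VoiceXML markup or any existing API.

```python
def pronunciation_hints(text, lexicon):
    """Return (name, hint) pairs for every lexicon name that occurs in text."""
    lowered = text.lower()
    return [(name, hint) for name, hint in lexicon.items()
            if name.lower() in lowered]


# One shared document, offered unchanged by both providers.
document = "Hyacinth Bucket invited Kevin Bucket to dinner."

# Hypothetical per-provider lexicons; each provider enhances as it sees fit.
provider_a = {"Hyacinth Bucket": "boo-kay", "Kevin Bucket": "buck-it"}
provider_b = {"Kevin Bucket": "buck-it"}

hints_a = pronunciation_hints(document, provider_a)
hints_b = pronunciation_hints(document, provider_b)
```

The point of keeping the hints outside the document is that neither
provider has to change the source, and a user could pick whichever set of
annotations they prefer.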

I believe there are a multitude of scenarios. I think you are correct that
having good element structure helps in most cases. I think this is because in
any case the perceived cost of implementing the solution has to match the
collective benefits...

cheers

Chaals

On Fri, 14 Mar 2003, Mark Davis wrote:

>
>I wonder how realistic these scenarios are. The principal motivation for a
>fine-grained language tagging of individual words or phrases appears to be
>for text-to-speech, primarily for the blind. But the goal for
>text-to-speech, except in very rare cases, will be to read the customary,
>most-well-understood pronunciation of the phrase in the end-user's language.
>Rarely will that precisely match the exact pronunciation of the word in the
>foreign language.
>
>[That being said, I have always found the attributes that end up being
>displayed to the user, such as alt and title, very confining. It would be
>better to have elements that correspond to them, so that the text display
>can be richer, such as italicising or bolding a word within them.]
>
>Mark

Received on Friday, 14 March 2003 10:59:23 UTC