- From: Dan Brickley <danbri@w3.org>
- Date: Tue, 17 May 2005 11:45:42 -0400
- To: "Miles, AJ (Alistair)" <A.J.Miles@rl.ac.uk>
- Cc: public-esw-thes@w3.org, mf@w3.org
* Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk> [2005-05-17 14:47+0100]
>
>
> > Other extensions could also be interesting. While we could debate
> > which things go in core vocab and which in other namespaces, it
> > might be more fun to set that aside for now (while noting that SKOS
> > is only a Working Draft at this stage, and could change), and
> > explore possibilities for such extensions. Was there something
> > specific you had in mind? Audio I think could be very interesting,
> > particularly for SKOS concepts that are close to the electronic
> > dictionary space, eg. lexical databases such as Wordnet (although
> > SWBPD WG isn't using SKOS for Wordnet currently). Where a concept is
> > lexicalised, we could point to sound clips, or Speech Synth markup
> > (eg. see http://www.w3.org/TR/2004/REC-speech-synthesis-20040907/)
> > ...could have interesting application to accessibility, voice/mobile
> > and perhaps language learning apps...
>
> I like the idea of 'audio labels' ... Can anyone describe a relatively concrete use case?
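One concrete shape an 'audio label' could take, as a sketch in Turtle
(SKOS defines no audio-label property today, so ex:audioLabel and
ex:ssmlLabel below are made up for illustration):

```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/ns#> .

<http://example.org/concepts/jaguar>
    a skos:Concept ;
    skos:prefLabel "jaguar"@en ;
    # hypothetical property linking the concept to a spoken rendering
    ex:audioLabel <http://example.org/audio/jaguar-en.ogg> ;
    # hypothetical pointer to Speech Synthesis markup for the label
    ex:ssmlLabel  <http://example.org/ssml/jaguar-en.xml> .
```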
Autogenerating voice-browser menus?
http://www.w3.org/Voice/
http://www.w3.org/Voice/#intro
http://www.w3.org/Voice/Guide/
[[
VoiceXML isn't HTML. HTML was designed for visual Web pages and lacks the
control over the user-application interaction that is needed for a
speech-based interface. With speech you can only hear one thing at a
time (kind of like looking at a newspaper with a times 10 magnifying
glass). VoiceXML has been carefully designed to give authors full
control over the spoken dialog between the user and the application. The
application and user take it in turns to speak: the application prompts
the user, and the user in turn responds.
VoiceXML documents describe:
* spoken prompts (synthetic speech)
* output of audio files and streams
* recognition of spoken words and phrases
* recognition of touch tone (DTMF) key presses
* recording of spoken input
* control of dialog flow
* telephony control (call transfer and hangup)
]]
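For instance, a minimal VoiceXML menu along those lines might look like
this (the target URLs are placeholders, not real documents):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.0">
  <menu>
    <!-- synthetic-speech prompt listing the available choices -->
    <prompt>Say one of: news, weather.</prompt>
    <choice next="http://example.org/news.vxml">news</choice>
    <choice next="http://example.org/weather.vxml">weather</choice>
  </menu>
</vxml>
```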
Annotation of SKOS concept descriptions with voice data (speech markup,
or audio files, ...) could allow content tagged with those concepts to
be made navigable through VoiceXML-based interactions. Example: a
collection of blog feeds, where the RSS was augmented with skos:subject
tagging, and the different blogs drew on the same (or mapped) concept
schemes.
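As a rough sketch of that autogeneration idea: given concept labels
pulled from SKOS-tagged feeds, a VoiceXML menu could be built per label.
The concept data below is hand-written (a real tool would extract
skos:subject tags with an RDF parser), and the URIs are placeholders:

```python
# Sketch: build a VoiceXML menu offering one choice per SKOS concept
# label. Concept data is hypothetical; real code would harvest it from
# RSS feeds augmented with skos:subject tagging.
from xml.etree import ElementTree as ET

# Hypothetical mapping: concept URI -> (prefLabel, feed URL)
concepts = {
    "http://example.org/concepts/semweb":
        ("Semantic Web", "http://example.org/feeds/semweb.rss"),
    "http://example.org/concepts/voice":
        ("Voice Browsers", "http://example.org/feeds/voice.rss"),
}

def skos_menu(concepts):
    """Return a minimal VoiceXML document with one <choice> per label."""
    vxml = ET.Element("vxml", version="2.0")
    menu = ET.SubElement(vxml, "menu")
    prompt = ET.SubElement(menu, "prompt")
    prompt.text = "Say a topic: " + ", ".join(
        label for label, _ in concepts.values())
    for label, feed in concepts.values():
        choice = ET.SubElement(menu, "choice", next=feed)
        choice.text = label
    return ET.tostring(vxml, encoding="unicode")

print(skos_menu(concepts))
```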
[I'm working on some tools to enable blogs to pick up their SKOS
categories from their neighbours (eg. when I go to add a category to
my blog, it reminds me what categories my friends and colleagues are
using, and allows links to be expressed, sub-trees to be imported).]
So, why would one want to navigate blogs by having the computer read
out labels for their categories (and, eg., also navigate by voice)?
- maybe you're driving your car, and in a traffic jam
- RSI or other accessibility reasons for not using mouse/keyboard
- you're walking around wearing some fancy bluetooth headset,
looking all Flash Gordon modern, and want to read what people are
writing about you...
- maybe you're navigating some content collection via your TV, with
menus, and prefer audio to reading of (even large) fonts
on the TV screen.
- maybe you're navigating a content collection in audio labels made
available in your native spoken language, even if the content is
in a language you're less proficient in.
- maybe you can't read the textual labels (in the language they're
available in; or in any language).
- maybe you're navigating a collection of Creative Commons-licensed
Ogg/MP3 'talking book' files on your iPod-like-thing, and someone has
written a study guide that lets you jump around the texts based on
SKOS-indexed themes that have been collaboratively indexed against
the collection. Ok handwaving a bit here, but I think that could be
interesting...
Whether the final end document is read in a classic Web browser,
or also via text-to-speech (Max was looking at this...) is a separable
choice I think. Being able to navigate around the content database
using audio labels doesn't require you to digest the content in audio
form too.
cheers,
Dan
ps. is anyone on this list set up to run student projects? maybe one in
this area could be interesting...?
Received on Tuesday, 17 May 2005 22:32:25 UTC