
[whatwg] Speech input element

From: Kazuyuki Ashimura <ashimura@w3.org>
Date: Tue, 18 May 2010 20:13:06 +0900
Message-ID: <4BF27642.4080602@w3.org>

Hi Bjorn and James,

Just FYI, W3C is organizing a workshop on Conversational Applications.
The main goal of the workshop is to collect use cases and requirements
for new models of human language to support mobile conversational
systems.  The workshop will be held on June 18-19 in Somerset, NJ, US.

The detailed call for participation is available at:
  http://www.w3.org/2010/02/convapps/cfp.html

I think there may be some discussion during the workshop about a
possible multimodal e-learning system as a use case.  Is either of you
by chance interested in the workshop?

Regards,

Kazuyuki


Bjorn Bringert wrote:
> On Mon, May 17, 2010 at 10:55 PM, James Salsman <jsalsman at gmail.com> wrote:
>> On Mon, May 17, 2010 at 8:55 AM, Bjorn Bringert <bringert at google.com> wrote:
>>>> - What exactly are grammars builtin:dictation and builtin:search?
>>> They are intended to be implementation-dependent large language
>>> models, for dictation (e.g. e-mail writing) and search queries
>>> respectively. I've tried to clarify them a bit in the spec now. There
>>> should perhaps be more of these (e.g. builtin:address), maybe with
>>> some being optional and mapping to builtin:dictation if not available.
>> Bjorn, are you interested in including speech recognition support for
>> pronunciation assessment such as is done by http://englishcentral.com/,
>> http://www.scilearn.com/products/reading-assistant/,
>> http://www.eyespeakenglish.com/, http://wizworldonline.com/, and
>> http://www.8dworld.com/en/home.html ?
>>
>> Those would require different sorts of language models and grammars
>> such as those described in
>> http://www.springerlink.com/content/l0385t6v425j65h7/
>>
>> Please let me know your thoughts.
> 
> I don't have SpringerLink access, so I couldn't read that article. As
> far as I could tell from the abstract, they use phoneme-level speech
> recognition and then calculate the edit distance to the "correct"
> phoneme sequences. Do you have a concrete proposal for how this could
> be supported? Would support for PLS
> (http://www.w3.org/TR/pronunciation-lexicon/) links in SRGS be enough
> (the SRGS spec already includes that)?
> 
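[For readers without access to the article: the phoneme edit-distance
approach Bjorn describes above can be sketched roughly as below. This is
a minimal illustration of Levenshtein distance over phoneme sequences,
not the method from the paper; the example phoneme strings are made up.]

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# "tomato": reference /t ah m ey t ow/ vs. a learner's /t ow m ah t ow/
ref = ["t", "ah", "m", "ey", "t", "ow"]
hyp = ["t", "ow", "m", "ah", "t", "ow"]
print(edit_distance(ref, hyp))  # 2 (two substituted vowels)
```

A pronunciation score could then be derived by normalizing this
distance by the length of the reference sequence.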

-- 
Kazuyuki Ashimura / W3C Multimodal & Voice Activity Lead
mailto: ashimura at w3.org
voice: +81.466.49.1170 / fax: +81.466.49.1171
Received on Tuesday, 18 May 2010 04:13:06 UTC