
Re: Question on Stochastic Language Models (N-Gram) Specification WD 3 January 2001

From: Michael K. Brown <mkb@avaya.com>
Date: Wed, 14 Feb 2001 15:52:18 +0000
Message-ID: <3A8AA9B2.76F6978@avaya.com>
To: Paul van Mulbregt <paulvm@ne.mediaone.net>
CC: www-voice@w3.org
Paul van Mulbregt wrote:
> 
> After reading this specification (http://www.w3.org/TR/ngram-spec) I was
> left somewhat confused about its purpose.  Could one of the authors perhaps
> explain exactly what problem this spec is trying to solve?  How is it
> envisioned that data in this format would be generated and then used?
> 
> Regards,
> -Paul
> 
> -------------------------------------------------------------
> Paul van Mulbregt,  paulvm@ne.mediaone.net


N-grams are used in large and open vocabulary applications where natural
language is desired.  Such applications allow the user to say
practically anything they want and expect the system to interpret at
least a significant part of the utterance.  A finite-state or even
context-free grammar has the advantage of lower entropy, meaning higher
speed and accuracy, but has the disadvantage of being highly fragile in
the sense that it's easy to talk outside the language model.  For
example, I built a speech-controlled robotic dialog system called SAM in
the mid-1980s that, in addition to the robot's environmental sensors,
used finite-state ASR and an OPS-5-based dialog/task planner for the
user interface.  Disregarding cycles, the grammar accepted about 6x10^20
command sentences, which seems large, but in practice it was not hard to
step outside the accepted language (even while staying within the
vocabulary).  The perplexity was only about 3.5 overall, so speed and
accuracy were quite high.
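
To make the trade-off concrete, here is a minimal sketch in Python, not
taken from SAM or from the n-gram spec.  The toy corpus, the add-one
smoothing, and all function names are assumptions chosen for
illustration.  It contrasts a stand-in "finite-state" acceptor, which
rejects anything outside its listed sentences, with a bigram model that
still assigns a probability to unseen word sequences.  It also uses the
standard relation perplexity = 2^entropy, under which a perplexity of 3.5
corresponds to roughly 1.8 bits per word.

```python
import math
from collections import Counter

# Hypothetical toy training corpus; not data from the SAM system.
CORPUS = [
    "move the red block",
    "move the green block",
    "pick up the red block",
    "put down the green block",
]


def train_bigram(sentences):
    """Count contexts and bigrams, using <s> and </s> as boundary markers."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    return contexts, bigrams, vocab


def bigram_prob(prev, cur, contexts, bigrams, vocab_size):
    """Add-one (Laplace) smoothed estimate of P(cur | prev)."""
    return (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)


def perplexity(sentence, contexts, bigrams, vocab):
    """Per-word perplexity of one sentence under the smoothed bigram model."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    log2_sum = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        log2_sum += math.log2(
            bigram_prob(prev, cur, contexts, bigrams, len(vocab))
        )
    return 2 ** (-log2_sum / (len(tokens) - 1))


def finite_state_accepts(sentence, sentences):
    """Stand-in for a rigid grammar: accepts only the listed sentences."""
    return sentence in set(sentences)


def entropy_bits(perplexity_value):
    """Entropy in bits per word for a given perplexity (H = log2 PP)."""
    return math.log2(perplexity_value)


if __name__ == "__main__":
    contexts, bigrams, vocab = train_bigram(CORPUS)
    for s in ["move the red block", "pick up the green block",
              "block the red move"]:
        print(s,
              "| grammar accepts:", finite_state_accepts(s, CORPUS),
              "| bigram perplexity:",
              round(perplexity(s, contexts, bigrams, vocab), 2))
    print("entropy at perplexity 3.5:", round(entropy_bits(3.5), 3), "bits")
```

"pick up the green block" stays within the vocabulary but is not one of
the listed sentences, so the rigid acceptor rejects it while the bigram
model still scores it.  A real finite-state grammar would of course be a
network rather than a sentence list; the list is only the simplest way
to show the fragility.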

N-gram systems are near commercial deployment from a number of providers
including AT&T, Avaya, Lucent, and (I think) Philips; I am certain of the
first two.  For example, AT&T has been in trial with a customer service
system that asks "how may I help you?" allowing a caller to say
virtually anything.  The system recognizes key phrase components and
directs the call to the right service.  Avaya/Lucent have been trialing
a banking application with similar characteristics.
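
As a rough illustration of routing on key phrases, the sketch below
scans a recognized utterance for phrases and picks a destination.  It is
only an assumption about the general idea: the phrase lists, destination
names, and scoring rule are hypothetical and do not describe how the
AT&T or Avaya/Lucent systems actually work.

```python
# Hypothetical phrase lists and destinations, for illustration only.
ROUTES = {
    "billing": ("bill", "charge", "payment"),
    "repair": ("not working", "broken", "no dial tone"),
}


def route_call(utterance, routes=ROUTES, default="operator"):
    """Pick the destination whose key phrases occur most often in the text.

    Falls back to `default` when no key phrase is found at all.
    """
    text = utterance.lower()
    scores = {
        destination: sum(phrase in text for phrase in phrases)
        for destination, phrases in routes.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


if __name__ == "__main__":
    print(route_call("um I have a question about a charge on my bill"))
    print(route_call("my phone is broken"))
    print(route_call("hello there"))
```

The point is that only part of the utterance needs to be interpreted:
the filler words around the key phrases are ignored, which is what lets
the caller phrase the request freely.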

This technology has been around for many years now and the companies I
mentioned have their own tools for creating n-gram models.  You can find
free tools at ftp://ftp.cs.cmu.edu/project/fgdata/CMU_SLM/.

	Mike
-- 
		Michael K. Brown
		Avaya Labs, Rm. 2D-534, (908) 582-5044
		600 Mountain Ave., Murray Hill, NJ 07974
		mkb@avaya.com
Received on Wednesday, 14 February 2001 10:53:38 GMT
