Speech Rec Grammar Draft

I have some comments and questions regarding the W3C Working Draft for Speech Recognition Grammar.

I find most of the spec reasonably straightforward.  However, one area which is not clear to me is how you extract meaning from an utterance.

In section 2.6, the spec defines the "tag" attribute.  While this is a step in the right direction, I think it falls short.  The example provided in 2.6 shows tag attributes which define name-value pairs.  In the example, one pattern is mapped to tag="action=open" and another is mapped to tag="action=shut".  If I can borrow the term "slot" which Nuance uses, then you could say that "action" is a slot and "open" and "shut" are values that fill the slot named "action".  To make the pattern more reusable, it might be useful to define another attribute which defines the slot name.  The new attribute could be called "slot" or "slotname".  Slot names would be defined in some ancestor element one or more levels up from the elements which define tag attributes.

So the example in 2.6:

<one-of>
    <item tag="action=open;"> open </item>
    <item tag="action=shut;"> close </item>
</one-of>

could instead be written as:

<one-of slot="action">
    <item tag="open"> open </item>
    <item tag="shut"> close </item>
</one-of>
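
To make the intended semantics concrete, here is a rough sketch in Python of how a processor might resolve the proposed attribute.  It is only an illustration under my own assumptions: the "slot" attribute is the hypothetical one suggested above (it is not in the draft), the function name collect_slot_values is made up, and the rule that the nearest enclosing "slot" wins is just one possible choice.

```python
import xml.etree.ElementTree as ET

# The proposed form of the 2.6 example. The "slot" attribute is hypothetical.
GRAMMAR = """
<one-of slot="action">
    <item tag="open"> open </item>
    <item tag="shut"> close </item>
</one-of>
"""

def collect_slot_values(element, inherited_slot=None):
    """Return (phrase, slot, value) triples for every element with a tag.

    Assumption: an element inherits the slot name from its nearest
    ancestor that carries a "slot" attribute.
    """
    # A slot declared on this element overrides the inherited one.
    slot = element.get("slot", inherited_slot)
    results = []
    tag = element.get("tag")
    if tag is not None and slot is not None:
        phrase = (element.text or "").strip()
        results.append((phrase, slot, tag))
    # Pass the current slot name down to the children.
    for child in element:
        results.extend(collect_slot_values(child, slot))
    return results

root = ET.fromstring(GRAMMAR.strip())
pairs = collect_slot_values(root)
for phrase, slot, value in pairs:
    print(f"{phrase!r} -> {slot}={value}")
```

With this reading, the spoken word "close" still yields action=shut, exactly as in the original 2.6 example, but the slot name is written only once.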

Received on Tuesday, 31 July 2001 15:48:45 UTC