- From: Guillaume Berche <guillaume.berche@eloquant.com>
- Date: Thu, 1 Apr 2004 09:32:57 +0200
- To: <www-voice@w3.org>
Hello, I would like some precisions on the SRGS Proposed Recommendation from 18 December 2003 concerning the behavior of a conformant processor with respect to grammar ambiguity. I could not find an answer in the mailing archive nor in the IR test suite. Please point me to the appropriate section if I missed it, or otherwise please register a change request on this point to clarify this aspect of the specification. Thanks for your help and best regards, Guillaume. ---------------------------------------------------------- Guillaume Berche Eloquant, ZA Le malvaisin 38240 Le Versoud, France guillaume.berche@eloquant.com ---------------------------------------------------------- I understand the SRGS specification is not clear about how ambiguities in points of disjunction are handled. While section "H.1 Terminology and Notation" does include an "Ambiguity" paragraph with the content reproduced below, it does not explicitely require that a grammar processor to select **at least one** of the alternatives in case of ambiguity. "An ambiguity occurs when for a specified sequence of input tokens matched to a specified rule of a grammar there is more than one distinct logical parse structure that can be produced. An ambiguity can occur at points of disjunction (choice) in a grammar. Disjunction exists with the use of alternatives and repeats. A grammar processor may preserve any number of ambiguous logical parse structures to create a set of alternative logical parse structures for the input. It is legal for a grammar processor to maintain all possible logical parse structures or to dispose of all but one of the alternatives. There is no specified behavior for selection of ambiguities amongst possibilities by a grammar processors. As a result grammars that contain ambiguity do not guarantee portability of performance. Developers and grammar tools should be ambiguity-aware." In the following example with an ambiguous grammar, only output 1 would lead to a match, the second incomplete output leading to a rejection (no match). Is a conformant grammar processor required to return output1 or would output2 be also a conformant behavior, given that section "5.4 Conforming XML Form Grammar Processors" states that "There is, however, no conformance requirement with respect to performance characteristics of the XML Form Grammar Processor. For instance, no statement is required regarding the accuracy, speed or other characteristics of a speech recognizer or DTMF detector." Expansion: t0 (t1 {tag1}) <0-2> t1 t2 Input: t0,t1,t1,t2 Output 1: ["t0",{!{tag1}!},"t1","t1","t2" ] Output 2: [REJECT after evaluating: "t0",{!{tag1}!},"t1",{!{tag1}!},"t1"] If output1 is required to be returned by all conformant grammar processors, then this may prevent important optimizations in the grammar processor implementations, without real benefits for grammar authors since in presence of other ambiguous grammars the output is not predicatable. Example above is quite simple, much more complex cases may occur which requires the grammar processor to preserve a very large amount of logical parse output and may severely impact performance. Lots of grammar languages with real-life applications provide some indicators to the grammar processor as to limit the amount of considered branch and avoid combinatorial explosion (see references below), or disallow ambiguity altogether. Some provide lookahead hints, other define local optimization strategy such as greedy behavior at disjonction points. I believe a reasonneable approach to solve this problem could be to modify section "H.1 Terminology and Notation" paragraph "Ambiguity" to precise that it is legal for a grammar processor to be "greedy" in matching each expansion (i.e. select the alternative which matches most available tokens). This behavior usually leads to expected global output sequence but may sometimes reject valid input as illustrated in the example above. References: JavaCC Lookahead https://javacc.dev.java.net/doc/lookahead.html yacc default lookahead http://www.cs.princeton.edu/~appel/modern/ml/ml-yacc/manual.html#section5 Perl5 regular expressions and greedy quantifiers http://iis1.cps.unizar.es/Oreilly/perl/prog/ch02_04.htm#PERL2-CH-2-SECT-4.1. 2 ANTLR lookahead http://www.doc.ic.ac.uk/lab/secondyear/Antlr/metalang.html#_bb21 XML languages and ambiguity http://www.oasis-open.org/cover/topics.html#ambiguity
Received on Thursday, 1 April 2004 02:33:02 UTC