Personal comments on Speech Recognition Grammar Spec last call from Martin Duerst on 2001-10-09 (www-voice@w3.org from October to December 2001)

From: Martin Duerst <duerst@w3.org>
Date: Tue, 09 Oct 2001 17:38:58 +0900
To: www-voice@w3.org
Message-Id: <4.2.0.58.J.20011009120542.03ab5b60@localhost>
Dear Voice Browser WG,

Below please find my personal comments on the last call
of your Speech Recognition Grammar WD of August 20.

Please note that I am not subscribed to www-voice@w3.org,
so please make sure you include my address in responses.


1) 1.1: The term DTMF appears many times before it is introduced.
    Please give its expansion/explanation the first time it appears.

2) Conversion between formats: In case two formats are kept,
    having conversion tools in both directions fully implemented
    and tested should be a CR exit requirement.

3) '1.4 Semantic Interpretation': The term 'semantics' is heavily
    used (and misused) in various contexts. The use here bears
    a serious risk of confusion with the Semantic Web. It is
    difficult to understand what exactly this 'semantic interpretation'
    is supposed to be, but at the moment, it looks mostly like
    events fired upon detection of some input, with associated
    scripts. In that case, it would be much better to use the
    events/script terminology and maybe also the syntax.

4) 1.4 (and other places): "... the WG plans to require":
    Having the current specification (which is in last call)
    require the use of another spec that is as yet undefined or
    still in the works is really strange. Also, it does not seem
    adequate here; the specifications are deliberately written
    independently so that other 'semantic interpretation'
    mechanisms can be used.
    It is better to ensure interoperability on a higher level,
    e.g. by defining a profile or by just making all these
    things W3C Recs.

5) 2.1 Token: The section title seems inadequate. The section
    should either speak only about single tokens, or the title
    should be 'Token List' or something similar.

6) 2.1: Defining empty tokens or tokens containing only space
    as illegal seems completely unnecessary and only complicates
    the spec. These cases should be defined as equivalent to
    special='NULL'. Allowing ( ) or () further complicates
    the issue unnecessarily.

7) Please use XLink syntax (xlink:href) instead of the 'uri'
    attribute.

8) 2.2.2: application/grammar+xml: The term 'grammar' seems much
    too general here. Also, the spec says that this media type has
    been requested, but I'm following the relevant IETF list and
    do not remember such a request. Can you please provide a pointer
    to the archive?

9) Many of the <pre> examples get cut when printed. This depends
    very much on the device, but I would suggest an upper limit
    of about 60 characters per line.

9) 2.2.4 Special rules: You should consider reserving some
    rule names for future extension (e.g. all-uppercase names).

10) 2.3: the term 'legal rule expansion' is used here but not yet
     defined.

11) 2.3: The matching is described multiple times, in very
     different places and in different words, e.g. 'expansions /must/
     be spoken...', 'must be recognized in temporal sequence', ...

12) 2.4: 'a set of alternatives must contain one or more alternatives':
     Why not 'zero or more alternatives'? Zero would be equivalent to
     special='VOID'. On the other hand, at the end of 2.4, the text
     says that an empty <one-of> is allowed.

13) 2.5, special case of <0> or <0-0>: Change the explanation to say
     that this is the same as special='NULL'.

14) There seems to be no need for both {!{  }!} and backslash escaping.
     Please use only one mechanism, preferably backslash escaping.

15) 3., second paragraph: 'the rulename resolution specification'
     Is this a separate spec, or part of this spec?

16) 3.2 Scoping: The scoping rules are explained by reference to Java.
     However, while there are very good reasons for data hiding and
     therefore to have a default of 'private' in programming languages
     such as Java, I can see absolutely no need for data hiding in
     the case of speech grammar rules. There should be a better
     explanation for why the default is 'private', or the default
     should be changed to 'public' to make it easier to reuse rules.
     Also, the rule that a private root can not be referenced by
     name seems unmotivated.

17) The choice of 'ABNF' as a magic number seems to be much too
     general (see above for application/grammar+xml). Similar
     considerations apply to the chosen public identifier (and the
     namespace), as well as the use of the term 'XML Grammar'
     in the place of 'XML Speech Recognition Grammar'.

18) 4.1.4: I think it would be a good idea to change
     <grammar root="rulename" ...> to <grammar root="#rulename" ...>

19) 4.1.5: Please use a URI as the identifier for 'semantic' tags.

20) 2.2.2 and 4.2: It is a bad idea to have the media type specified
     with the reference override the media type determined from the
     actual referenced resource.

21) RDF rather than the HTML-like ad-hoc <meta> should be used for
     metadata.
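
To illustrate the equivalences suggested in 6), 12) and 13), here is
a minimal Python sketch. It is only a toy model and not taken from the
draft: an expansion is represented by the finite set of token
sequences it accepts, and the names NULL, VOID, token, sequence,
one_of and repeat are hypothetical helpers chosen for this example.
Under these assumptions, an empty sequence and a <0-0> repeat both
come out as NULL, and an empty set of alternatives comes out as VOID,
without any special-case rules.

```python
# Toy model (an assumption for illustration, not the draft's
# definition): an expansion is the set of token sequences,
# written as tuples, that it accepts.

NULL = {()}    # accepts only the empty input (special='NULL')
VOID = set()   # accepts nothing at all (special='VOID')


def token(word):
    """A single token accepts exactly one one-word sequence."""
    return {(word,)}


def sequence(*parts):
    """Concatenation: one accepted sequence from each part."""
    result = {()}
    for part in parts:
        result = {a + b for a in result for b in part}
    return result


def one_of(*alternatives):
    """Alternatives: anything accepted by any alternative."""
    result = set()
    for alt in alternatives:
        result |= alt
    return result


def repeat(part, low, high):
    """Between low and high repetitions of part, inclusive."""
    result = set()
    for n in range(low, high + 1):
        result |= sequence(*([part] * n))
    return result


# Point 6: an empty token list behaves like NULL.
assert sequence() == NULL
# Point 12: zero alternatives behave like VOID.
assert one_of() == VOID
# Point 13: <0> or <0-0> behaves like NULL.
assert repeat(token("yes"), 0, 0) == NULL
```

In this model NULL is the identity for sequences and VOID is the
identity for alternatives, which is why defining the empty cases this
way needs no extra wording.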


Regards,     Martin.

#-#-#  Martin J. Du"rst, I18N Activity Lead, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org/People/D%C3%BCrst
Received on Tuesday, 9 October 2001 04:39:08 UTC