Re: type resolution and control in subgrammars [was: RE: personal...]

** Summary:
* Comments on the current draft language: abstain.
* Comments on the process being followed: concur.

** Note
I would like to flag a potential angle touched on here.

There has been opinion in some quarters at some times that a type indication in HTTP headers should be the last word, unequivocally, as to the type to be used in processing information transferred to a client by HTTP.  This would mean that type information encoded within an HTTP body or derived by trial and success processing of the data received should never be used when a type is indicated by a Content-Type: header in the HTTP message.
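
To make that position concrete, here is a minimal sketch in Python of the "header is the last word" policy.  The function name, its parameters, and the media type strings in the comments are illustrative assumptions of mine; none of them come from HTTP or from the grammar spec.

```python
from typing import Optional


def resolve_media_type(header_type: Optional[str],
                       declared_in_body: Optional[str] = None,
                       sniffed: Optional[str] = None) -> Optional[str]:
    """Pick the type to process with under a header-is-final policy.

    header_type      -- raw Content-Type header value, or None if absent
    declared_in_body -- type information encoded within the body (hypothetical)
    sniffed          -- type derived by trial processing of the data (hypothetical)
    """
    if header_type:
        # The header wins outright.  Drop parameters such as
        # "; charset=utf-8" and normalize case before returning.
        return header_type.split(";", 1)[0].strip().lower()
    # Only when no header type is present may other evidence be used.
    return declared_in_body or sniffed
```

Under this policy a grammar served as text/plain is processed as text/plain even if the bytes would parse cleanly as a grammar; in-band or sniffed information comes into play only when the header is absent.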

It appears to me that a finding in this area may well come out of the TAG and that, if the Voice WG is not alert, it could come out in a form that does not best serve the perceived or actual interests of the Voice WG, given the history of this issue in this group.

Sorry this message is so bureaucratic.  But as Philipp and the group consider the further track of this issue in the W3C, I wanted to throw these observations into the hopper.

The thrust of my personal comments on this issue would be that we need a clear, consistent, and workable resolution across a broad front not limited to speech recognition grammars.  So the process path that the Speech WG is pursuing is the best that I could recommend, and is better than taking a product answer from me or from the WAI per se.  That's a long-winded "decline to comment" on the precise form of the current draft language.

Al

At 05:56 PM 2002-05-09, Andrew Hunt wrote:
>Martin, Al,
>
>It has taken a month to get back to making an update to the media type
>language based on your most recent comments.  I'm sorry for the delay.  
>Below in this message is a revision to media type language that I hope 
>addresses your concerns.  The relevant paragraphs are marked with **.  
>I have also included a pointwise summary wrt your previous emails.
>
>I sincerely hope that the spec change included below addresses the
>points that you have raised and that it is satisfactory for progress
>to recommendation track.
>
>Again, thank you for the input on the speech recognition grammar spec
>and your responsiveness to the discussion.
>
>--Andrew Hunt
>  Co-editor SRGS
>  SpeechWorks
>
>
>--RESPONSE TO PREVIOUS REMARKS--
>
>2002/03/21: http://lists.w3.org/Archives/Public/www-voice/2002JanMar/0126.html
>> At the minimum, I would like to see an explanation of the problems
>> with using the type attribute on the link (a very clear example would
>> be that somebody has some linked speech grammar files in ABNF, and
>> converts them to the XML notation. All the type attributes in the
>> links to that grammar fragment will have to be updated, which may
>> be a real pain), and a commitment of the WG to (some of) the
>> proposals for getting things improved on the server side.
>
>Use of type is suggested only as a last resort and the format
>conversion issue is covered.  Both are in the third paragraph
>marked with **.
>
>We have raised this issue with Philipp Hoschka to seek guidance on
>next steps within the W3C.  At this point I cannot make commitments
>on behalf of the working group beyond escalating our experience on
>this topic.
>
>2002/03/20: http://lists.w3.org/Archives/Public/www-voice/2002JanMar/0119.html
>>- Making sure that the MIME type is actually registered.
>>   It's easier to convince a webmaster to add something to a config
>>   file if an author can point to a full registration, rather than
>>   just some 'customarily used' mime type.
>
>Media type requests have been published by IETF and are
>referenced from the specification:
>  http://www.ietf.org/internet-drafts/draft-porter-srgs-media-reg-00.txt
>  http://www.ietf.org/internet-drafts/draft-porter-srgsxml-media-reg-00.txt
>The applications have not been accepted yet.
>
>2002/03/20: http://lists.w3.org/Archives/Public/www-voice/2002JanMar/0119.html
>>- Saying clearly in the spec that Web server configurations should
>>   be updated to send the right mime type.
>
>Spec now states that type should be used only as a last resort when the 
>web server cannot be configured to return the correct media type.
>
>>- Using contacts of the WG (and the W3C overall if necessary) to
>>   make sure the configurations in the newest releases of the servers
>>   are up to date.
>
>The WG will be in a position to act on this point when the media
>types have been granted.
>
>>- Providing help on a public web page on how to set up various servers
>>   (see e.g. <http://www.w3.org/International/O-HTTP-charset.html> for
>>    an example; I'm glad to accept suggestions for improvement).
>
>The WG was reluctant to get into this arena.
>
>>- Ideally, combining the latent frustration about this issue in
>>   various WGs and Activities to some coordinated effort.
>
>We have raised this issue to Philipp Hoschka to seek guidance
>on next steps within the W3C.
>
>
>---DRAFT REVISED TEXT---
>
>2.2.2 External Reference by URI 
>
>References to rules defined in other grammars are legal under the conditions 
>defined in Section 3. The external reference must identify the external grammar 
>by URI and may identify a specific rule within that grammar. If the fragment 
>identifier that would indicate a rulename is omitted, then the reference targets 
>the root rule of the external grammar. 
>
>A URI reference is illegal if the referring document and referenced document 
>have different modes. For instance, it is illegal to reference a "dtmf" grammar 
>from a "voice" grammar. (See Section 4.6 for additional detail on modes.) 
>
>** A URI reference may be accompanied by a media type that indicates the content type 
>of the resource identified by the URI. When specified, the type value takes 
>precedence over other possible sources of the media type (for instance, the 
>"Content-type" field in an HTTP exchange, or the file extension). If present, 
>the media type is binding and cannot be ignored when parsing the referenced URI. 
>
>** When the content represented by a URI is available in many data formats, a 
>grammar processor may use the type to influence which of the multiple formats 
>is used. For instance, on a server implementing HTTP content negotiation, the 
>processor may use the type to order the preferences in the negotiation. 
>
>** Informative: use of the type attribute should be considered a last resort. For 
>instance, the type may be appropriate when a grammar is fetched via HTTP but 
>(1) a web server cannot be configured to indicate the correct media type, and 
>(2) the grammar processor is unable to automatically detect the media type. In 
>the event that a grammar is transformed to another form (e.g. ABNF Form to XML 
>Form) then any type attribute on a reference to that grammar also must be modified. 
>
>
>> -----Original Message-----
>> From: Martin Duerst [mailto:duerst@w3.org]
>> Sent: Monday, April 01, 2002 10:21 PM
>> To: Al Gilman; andrew.hunt@speechworks.com; www-voice@w3.org
>> Subject: RE: Personal comments on Speech Recognition Grammar Spec last
>> call
>> 
>> 
>> Hello Al,
>> 
>> After taking quite some time to think about it again,
>> I think I can very much agree with your position.
>> 
>> I very much hope that the Speech Recognition Grammar Spec
>> can be changed in that way, and would otherwise not be
>> satisfied with the resolution.
>> 
>> It took me quite a while to think about potential security
>> issues (there have been several recent security problems
>> where (mime) typing was involved). At the moment, my guess
>> is that there is not too much of a problem. The first point
>> would be that speech grammars are not security-relevant
>> (maybe a grammar with some recursive rules could create
>> an infinite loop and therefore a denial of service attack
>> on a bad implementation). But maybe I'm wrong here and
>> some components of the grammar could lead to execution
>> of some code. The second point is of course that a speech
>> grammar will be used when a speech grammar is referenced.
>> There is no general link functionality such as: When you
>> get here, display the referenced document.
>> So I think it should be okay.
>> 
>> Regards,    Martin.
>> 
>> At 16:16 02/03/21 -0500, Al Gilman wrote:
>> >Sorry to be a space cadet, but I want to reverse what I said a bit.
>> >
>> >Rather than say "the type indicated in the reference rules, when present" 
>> >better to say "In the case that the actual resource recovered bears an 
>> >indication of a type not suitable for processing, the type indicated in 
>> >the reference may be used to attempt a recovery from this error."
>> >
>> >[more below]
>> >
>> >At 08:47 PM 2002-03-20 , Martin Duerst wrote:
>> > >Hello Andrew,
>> > >
>> > >I'm sorry to bother you again. Based on Al's comments, I had a look
>> > >at your mail again, and found some very basic unclarity.
>> > >
>> > >At 14:03 02/03/18 -0500, Andrew Hunt wrote:
>> > >>Martin,
>> > >>
>> > >>As of the last email there were three outstanding issues.  Here is
>> > >>a summary of status/disposition.
>> > >
>> > >>20) It is a bad idea to have the media type specified
>> > >>     with the reference overwrite the media type determined from the
>> > >>     actual referenced resource
>> > >>
>> > >>After your most recent email the working group revisited the issue.
>> > >>We do plan any change to the specification.
>> > >
>> > >Do you plan some change, or do you not plan any change?
>> > >
>> > >At the minimum, I would like to see an explanation of the problems
>> > >with using the type attribute on the link (a very clear example would
>> > >be that somebody has some linked speech grammar files in ABNF, and
>> > >converts them to the XML notation. All the type attributes in the
>> > >links to that grammar fragment will have to be updated, which may
>> > >be a real pain), and a commitment of the WG to (some of) the
>> > >proposals for getting things improved on the server side.
>> > >
>> > >Regards,   Martin.
>> > >
>> > >
>> > >>Here is our analysis of
>> > >>the issue and the reason for remaining with the status quo.
>> > >>
>> > >>Your point that neither a resource specified by a URI, nor the bytes
>> > >>returned when resolving the URI, has a unique type, is well-taken.  This
>> > >>means for example that a server is perfectly justified to return an HTTP
>> > >>header indicating a type of, say, text/plain, when the bytes are in fact
>> > >>also a valid W3C grammar.  In such a case it's perfectly reasonable for
>> > >>the consumer of the resource to interpret those bytes (in a kind of
>> > >>casting operation) as some other type than the type indicated by the
>> > >>server, when the consumer of the resource has some "out-of-band" source
>> > >>of knowledge about the resource.  The "type" attribute provides a way
>> > >>for an application developer to provide this "out-of-band" knowledge to
>> > >>the consumer (voice browser in this case).
>> > >>
>> > >>This is useful in the common case where the VoiceXML application
>> > >>developer is also in control of the bytes that the URI resolves to, but
>> > >>may not be in control of the type information returned by the server.
>> > >>This is especially important for new types, where experience shows web
>> > >>servers are frequently not configured to return the most useful or most
>> > >>specific type information for resources that conform to the new type.
>> > >>
>> >
>> >AG::
>> >
>> >In the case of a conflict between the type that the application expects 
>> >from reading the referring grammar and the type that the transport asserts 
>> >for the entity transported, the processor does not have to go with just 
>> >one or the other.  Either one could be wrong.
>> >
>> >If one type reflects a potential success and the other type reflects a 
>> >sure failure, then the processor could take the optimistic route, 
>> >interpret the recovered resource representation in accordance with that 
>> >type and if the recognition process (parse, etc.) succeeds, go with it.
>> >
>> >
>> >For SRGS we have the added complexity that applicable grammars come in two 
>> >equivalent forms which are expected to have different MIME types.  Making 
>> >the type indication in the reference override the actual type of the 
>> >grammar sent will force errors when the type indication in the reference 
>> >is ABNF and the data returned are in XML, for example.  Should this be an 
>> >error?
>> >
>> >Would it be possible for the processor to make the determination as to 
>> >whether XML or ABNF grammars are acceptable at this point and not the 
>> >referring grammar document?
>> >
>> >Al
>> >
>> > >>There is also precedent in recent W3C recommendations for such a "type"
>> > >>attribute:  from the SMIL 2.0 Recommendation, Chapter 7
>> > >>(http://www.w3.org/TR/smil20/extended-media-object.html):
>> > >>
>> > >>The
>> > >><http://www.w3.org/TR/smil20/extended-media-object.html#adef-media-type>
>> > >>type attribute value takes precedence over other possible sources of the
>> > >>media type (for instance, the "Content-type" field in an HTTP exchange,
>> > >>or the file extension).
>> > >>
>> > >>
>> > >>Please let us know if you have further comments on these issues.
>> > >>
>> > >>Regards,
>> > >>   Andrew Hunt
>> > >>   Co-editor SRGS
>> > >>   SpeechWorks International
>> > >
>> 

Received on Friday, 10 May 2002 19:20:06 UTC