- From: Olli Pettay <Olli.Pettay@helsinki.fi>
- Date: Thu, 30 Jun 2011 19:05:50 +0300
- To: "public-xg-htmlspeech@w3.org" <public-xg-htmlspeech@w3.org>
- CC: Michael Bodell <mbodell@microsoft.com>, "Raj (Openstream) (raj@openstream.com)" <raj@openstream.com>, "Deborah Dahl (dahl@conversational-technologies.com)" <dahl@conversational-technologies.com>, "Dan Burnett (dburnett@voxeo.com)" <dburnett@voxeo.com>, "Bjorn Bringert (bringert@google.com)" <bringert@google.com>, "Charles Hemphill <charles@everspeech.com> (charles@everspeech.com)" <charles@everspeech.com>
On 06/30/2011 01:40 PM, Olli Pettay wrote:
> On 06/30/2011 02:26 AM, Michael Bodell wrote:
>> We will be going over the various API proposals on the HTML Speech call
>> tomorrow. Kudos to Dan D for getting his submission in on time. Everyone
>> else who had things due earlier today (the cc list of this mail)
>> should be sending them in ASAP.
>>
>> The plan for the call is to start with Raj’s work on the design
>> decisions and requirements that are relevant for the API. After that we
>> can move to the other proposals including the one from Dan and the one
>> I’m submitting below. Raj’s is the most important to get done with
>> first, which is why we are starting with that.
>>
>> Here is my proposal for the HTML bindings. For these we might extend the
>> content attributes and interface with the other proposals (for
>> specifying grammars, speech servers, events, etc.). In any case these
>> element/interfaces could be created in JS or be present in the HTML
>> document, or both.
>
>
> I don't see anything in the proposal about permissions.
> In which case is the recognizer activated?
>
> At least at the moment I'd still prefer a mechanism similar to what is
> proposed for microphone capture.
> In our case it could be
> Speech.getRequest(successCallback, errorCallback);
>
> That could be extended to support the "for" attribute so that a speech
> request could be associated with some part of the UI.
> Speech.getRequestFor(element, successCallback, errorCallback)
>
> Then the parameter for successCallback would be the speechrequest object
> which can be activated.

And once the capturing APIs are stable, we could hook them up to the
recognizer using Streams. Then there would be no need, I think, for
getRequest(); there could just be
new SpeechInputRequest(stream, ...<other parameters>)

(And sorry, I'm late with my item 8)

-Olli

>
>
> -Olli
>
>>
>> ******
>>
>> The reco element
>>
>> Categories
>>
>> Flow content.
>>
>> Phrasing content.
>>
>> Interactive content.
>>
>> Form-associated element.
>>
>> Contexts in which this element can be used:
>>
>> Where phrasing content is expected.
>>
>> Content model:
>>
>> Phrasing content, but with no descendant recoable elements unless it is
>> the element's reco control, and no descendant reco elements.
>>
>> Content attributes:
>>
>> Global attributes
>>
>> form
>>
>> for
>>
>> DOM interface:
>>
>> [NamedConstructor=Reco(),
>>  NamedConstructor=Reco(in DOMString for)]
>> interface HTMLRecoElement : HTMLElement {
>>   readonly attribute HTMLFormElement? form;
>>   attribute DOMString htmlFor;
>>   readonly attribute HTMLElement? control;
>> };
>>
>> The reco element represents a speech input in a user interface. The
>> speech input can be associated with a specific form control, known as
>> the reco element's reco control, either using the for attribute, or by
>> putting the form control inside the reco element itself.
>>
>> Except where otherwise specified by the following rules, a reco element
>> has no reco control.
>>
>> The for attribute may be specified to indicate a form control with which
>> a speech input is to be associated. If the attribute is specified, the
>> attribute's value must be the ID of a recoable element in the same
>> Document as the reco element. If the attribute is specified and there is
>> an element in the Document whose ID is equal to the value of the for
>> attribute, and the first such element is a recoable element, then that
>> element is the reco element's reco control.
>>
>> If the for attribute is not specified, but the reco element has a
>> recoable element descendant, then the first such descendant in tree
>> order is the reco element's reco control.
>>
>> The reco element's exact default presentation and behavior, in
>> particular what its activation behavior might be and what implicit
>> grammars might be defined, if anything, should match the platform's reco
>> behavior.
>> The activation behavior of a reco element for events targeted
>> at interactive content descendants of a reco element, and any
>> descendants of those interactive content descendants, must be to do
>> nothing. When a reco element with a reco control is activated and gets a
>> reco result, the default action of the recognition event should be to
>> set the value of the reco control to the top n-best interpretation of
>> the recognition (in the case of single recognition) or an appended
>> latest top n-best interpretation (in the case of dictation mode with
>> multiple inputs).
>>
>> reco . control: Returns the form control that is associated with this
>> element.
>>
>> The form attribute is used to explicitly associate the reco element with
>> its form owner.
>>
>> The htmlFor IDL attribute must reflect the for content attribute.
>>
>> The control IDL attribute must return the reco element's reco control,
>> if any, or null if there isn't one.
>>
>> control . recos: Returns a NodeList of all the reco elements that the
>> form control is associated with.
>>
>> Recoable elements have a NodeList object associated with them that
>> represents the list of reco elements, in tree order, whose reco control
>> is the element in question. The recos IDL attribute of recoable elements,
>> on getting, must return that NodeList object.
>>
>> The form IDL attribute is part of the element's forms API.
>>
>> Two constructors are provided for creating HTMLRecoElement objects (in
>> addition to the factory methods from DOM Core such as createElement()):
>> Reco() and Reco(for). When invoked as constructors, these must return a
>> new HTMLRecoElement object (a new reco element). If the for argument is
>> present, the object created must have its for content attribute set to
>> the provided value. The element's document must be the active document
>> of the browsing context of the Window object on which the interface
>> object of the invoked constructor is found.
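A minimal sketch of the reco control resolution rules quoted above, written in TypeScript over a plain object tree rather than a real DOM. The SketchNode type, the findRecoControl name, and the RECOABLE_TAGS list are all assumptions for illustration: the proposal does not say which elements are recoable, and a user agent would apply these rules to actual elements.

```typescript
// Hypothetical stand-in for a DOM element; not part of the proposal.
interface SketchNode {
  tag: string;
  id?: string;
  forAttr?: string; // value of the "for" content attribute, if specified
  children: SketchNode[];
}

// Assumption: the proposal does not enumerate recoable elements.
const RECOABLE_TAGS: string[] = ["input", "textarea", "select"];

function isRecoable(node: SketchNode): boolean {
  return RECOABLE_TAGS.indexOf(node.tag) !== -1;
}

// Pre-order depth-first traversal, which yields nodes in tree order.
function treeOrder(root: SketchNode, out: SketchNode[] = []): SketchNode[] {
  out.push(root);
  for (const child of root.children) {
    treeOrder(child, out);
  }
  return out;
}

function findRecoControl(doc: SketchNode, reco: SketchNode): SketchNode | null {
  if (reco.forAttr !== undefined) {
    // "for" is specified: only the first element with that ID counts,
    // and it is the reco control only if it is recoable.
    for (const node of treeOrder(doc)) {
      if (node.id === reco.forAttr) {
        return isRecoable(node) ? node : null;
      }
    }
    return null;
  }
  // "for" is not specified: first recoable descendant in tree order.
  for (const child of reco.children) {
    for (const node of treeOrder(child)) {
      if (isRecoable(node)) {
        return node;
      }
    }
  }
  // Otherwise the reco element has no reco control.
  return null;
}
```

Note that when the for attribute is specified but points at a non-recoable element, the sketch returns null even if the reco element has a recoable descendant, since the descendant rule applies only when for is not specified.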
>>
>> *********
>>
>> I’m not sure if there’s any need for a TTS element, or if that can stay
>> just JS only. If we need a TTS element it might be something like the
>> following (again, we might expand the content attributes for the other
>> aspects that the group is working on like eventhandlers, remote
>> services, etc.):
>>
>> *********
>>
>> The tts element
>>
>> Categories
>>
>> Flow content.
>>
>> Phrasing content.
>>
>> Embedded content.
>>
>> If the element has a controls attribute: Interactive content.
>>
>> Contexts in which this element can be used:
>>
>> Where embedded content is expected.
>>
>> Content model:
>>
>> If the element has a src attribute: zero or more track elements, then
>> transparent, but with no media element descendants.
>>
>> If the element does not have a src attribute: one or more source
>> elements, then zero or more track elements, then transparent, but with
>> no media element descendants.
>>
>> Content attributes:
>>
>> Global attributes
>>
>> src
>>
>> crossorigin
>>
>> preload
>>
>> autoplay
>>
>> mediagroup
>>
>> loop
>>
>> muted
>>
>> controls
>>
>> DOM interface:
>>
>> [NamedConstructor=TTS(),
>>  NamedConstructor=TTS(in DOMString src)]
>> interface HTMLTTSElement : HTMLMediaElement {};
>>
>> A TTS element represents a synthesized audio stream.
>>
>> Content may be provided inside the TTS element. User agents should not
>> show this content to the user; it is intended for older Web browsers
>> which do not support TTS.
>>
>> In particular, this content is not intended to address accessibility
>> concerns. To make TTS content accessible to those with physical or
>> cognitive disabilities, authors are expected to provide alternative
>> media streams and/or to embed accessibility aids (such as
>> transcriptions) into their media streams.
>>
>> The TTS element is a media element whose media data is ostensibly
>> synthesized audio data.
>>
>> The src, preload, autoplay, mediagroup, loop, muted, and controls
>> attributes are the attributes common to all media elements.
>>
>> When a TTS element is potentially playing, it must have its TTS data
>> played synchronized with the current playback position, at the element's
>> effective media volume.
>>
>> When a TTS element is not potentially playing, TTS must not play for the
>> element.
>>
>> tts = new TTS( [ url ] )
>>
>> Returns a new TTS element, with the src attribute set to the value
>> passed in the argument, if applicable.
>>
>> Two constructors are provided for creating HTMLTTSElement objects (in
>> addition to the factory methods from DOM Core such as createElement()):
>> TTS() and TTS(src). When invoked as constructors, these must return a
>> new HTMLTTSElement object (a new tts element). The element must have its
>> preload attribute set to the literal value "auto". If the src argument
>> is present, the object created must have its src content attribute set
>> to the provided value, and the user agent must invoke the object's
>> resource selection algorithm before returning. The element's document
>> must be the active document of the browsing context of the Window object
>> on which the interface object of the invoked constructor is found.
>>
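For reference, a rough TypeScript sketch of the callback shape suggested earlier in this message, Speech.getRequest(successCallback, errorCallback), where the request object is only handed out once permission has been granted. Everything here is hypothetical: SpeechSketch, SpeechRequestSketch, the askUser hook standing in for a user agent's permission prompt, and the "permission denied" string are illustration only, and the call is synchronous purely to keep the example short.

```typescript
// Hypothetical request object; the message only says it "can be activated".
interface SpeechRequestSketch {
  active: boolean;
  activate(): void;
}

type SuccessCallback = (request: SpeechRequestSketch) => void;
type ErrorCallback = (reason: string) => void;

class SpeechSketch {
  // askUser is an assumed stand-in for whatever permission UI a browser shows.
  constructor(private askUser: () => boolean) {}

  getRequest(onSuccess: SuccessCallback, onError: ErrorCallback): void {
    if (!this.askUser()) {
      // No request object is created, so nothing can be activated.
      onError("permission denied");
      return;
    }
    const request: SpeechRequestSketch = {
      active: false,
      activate: () => {
        request.active = true;
      },
    };
    onSuccess(request);
  }
}
```

The point of this shape is that page script never holds an activatable recognizer unless the permission step has already succeeded; a variant taking an element first would correspond to the getRequestFor idea.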
Received on Thursday, 30 June 2011 16:06:54 UTC