- From: Silvia Pfeiffer <silviapfeiffer1@gmail.com>
- Date: Sun, 4 Dec 2011 11:20:24 +1100
- To: public-xg-htmlspeech@w3.org
As explained in an earlier email, here is the list of minor nitpicks that we have for the TTS part of the specification. These are so small that you may consider including them even now, before finalizing the report.

Best Regards,
Silvia.

Minor nitpicks:

1. "A TTS element represents a synthesized audio stream. A TTS element is a media element whose media data is ostensibly synthesized audio data."
(A,S): Why use the word "ostensibly" here when there is no other alternative? Just remove that word.

2. "Content may be provided inside the TTS element. User agents should not show [(A,S) add: …or speak…] this content to the user; it is intended for older Web browsers which do not support TTS."

3. "In particular, this content is not intended to address accessibility concerns. To make TTS content accessible to those with physical or cognitive disabilities, authors are expected to provide alternative media streams and/or to embed accessibility aids (such as transcriptions) into their media streams."
(S): What is a transcription of an SSML document? Why would content that is already provided as text need special accessibility support? (Cut-and-paste error from the HTML spec?) Just remove this paragraph.

4. "Implementations should support at least UTF-8 encoded text/plain and application/ssml+xml (both SSML 1.0 and 1.1 should be supported)."
(A): Earlier the document says SSML 1.1 is mandatory (while here it is "should"): see section 6, requirement 13. Earlier it also doesn't mention SSML 1.0.

5. Speech Enabled Email Client Example:

    tts = new TTS();
    function onMicClicked() {
      // stop TTS if there is any...
      if (tts.paused == false && tts.ended == false) {
        tts.controller.pause();
        [(S): The controller is used only with mediagroups - I don't think you want that in here. Just use tts.pause();]
      }
      sr.start();
    }

    function readout(message) {
      tts.src = "data:,message from " + message.sendername + ". Message content: " + message.body;
      [(S): There is no MIME type on this data URI. Should be data:text/plain,….]
      tts.play();
    }

6. Speech XG Translating Example:

    function readResult(event) {
      document.getElementById("p").value = 100;
      var tts = new TTS();
      tts.serviceURI = "ws://example.org/speak";
      tts.lang = document.getElementById("outlang").value;
      tts.src = "data:text/plain;" + translate(event.result.item(0).interpretation, document.getElementById("inlang").value, tts.lang);
    }

(A): Instead of tts.src, why not just set

    tts.text = translate(event.result.item(0).interpretation, document.getElementById("inlang").value, tts.lang);

?

7. "7.2.4.1 Generic Headers …synthesisers will accept "application/ssml+xml"…"
(S): Earlier it says that synthesisers will also accept text/plain.

8. "Audio-Codec = "Audio-Codec:" mime-media-type ; See [RFC3555]"
(S): The protocol references RTP MIME types for the audio codec. Since the data is, however, not transferred via RTP, it makes a lot more sense to reference normal file MIME types and not streaming (un-encapsulated) data types.

9. "A.2.2.3 Web Authoring Convenience Synthesis Requirements"

    FPR13. It should be easy to assign recognition results to a single input field. This is a part of the expansion of requirement 3.
    FPR14. It should not be required to fill an input field every time there is a recognition result. This is a part of the expansion of requirement 3.
    FPR15. It should be possible to use recognition results to multiple input fields.

(S): How are these "synthesis" requirements? Why not move them to the recognition section?

10. (A): Use case links in A.3 Initial Requirements don't work.
(A): A few of the requirements had no use cases.
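For reference, the email-client example from point 5 with both suggested fixes applied might look like the sketch below: tts.pause() replaces tts.controller.pause(), and the data URI gets an explicit text/plain MIME type. This is only an illustration, not spec text; TTS and sr are the draft's proposed objects, and messageToDataURI is a hypothetical helper (here taking tts and sr as parameters so the fragment is self-contained).

```javascript
// Hypothetical helper: build a data URI with an explicit MIME type,
// as suggested in point 5, escaping the message text for URI use.
function messageToDataURI(message) {
  return "data:text/plain," +
    encodeURIComponent("message from " + message.sendername +
                       ". Message content: " + message.body);
}

function onMicClicked(tts, sr) {
  // Stop TTS if it is currently speaking.
  if (!tts.paused && !tts.ended) {
    tts.pause();  // no mediagroup involved, so no controller needed
  }
  sr.start();
}

function readout(tts, message) {
  tts.src = messageToDataURI(message);
  tts.play();
}
```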
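Likewise, the translating example from point 6 rewritten to use tts.text, as suggested, could look like the following sketch. The DOM lookups are passed in as parameters and the progress-bar update is omitted so the fragment is self-contained; TTS, event.result, and translate() come from the draft's example, and the parameter shape here is an assumption for illustration only.

```javascript
// Sketch of readResult using tts.text instead of hand-building a data URI,
// which avoids both the missing-MIME-type and escaping pitfalls.
function readResult(event, tts, translate, inlangEl, outlangEl) {
  tts.serviceURI = "ws://example.org/speak";
  tts.lang = outlangEl.value;
  // Assign the synthesizer input directly, per the (A) suggestion.
  tts.text = translate(event.result.item(0).interpretation,
                       inlangEl.value,
                       tts.lang);
}
```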
Received on Sunday, 4 December 2011 00:21:12 UTC