RE: SSML, further comments

Dear Dave,

Thank you again for your careful review of the SSML specification in 2001.
Again, for completeness, we have prepared responses to your requests from that
time.

If you believe we have not adequately addressed your issues with our
responses, please let us know as soon as possible.  If we do not hear
from you within 14 days, we will take this as tacit acceptance. 

Again, thank you for your input.

-- Dan Burnett
Synthesis Team Leader, VBWG

[VBWG responses are embedded, preceded by '>>>']

-----Original Message-----
From: www-voice-request@w3.org [mailto:www-voice-request@w3.org] On
Behalf Of DPawson@rnib.org.uk
Sent: Monday, January 22, 2001 5:06 AM
To: www-voice@w3.org
Subject: SSML, further comments


After another read of the spec. Some more comments.

 1.2, list item 4, para 3.
"TTS systems are expert at performing text-to-phoneme conversions
so most words of most documents can be handled automatically".
 Rather too sweeping for my liking. Certainly not the case for
the systems I've seen :-)

>>> Proposed disposition:  (none yet)
>>> 
>>> Thank you for your comment. Do you have a specific suggestion
>>> for how to change this sentence?


2.4 Sub attribute.

A nice feature for a user would be to permit these to be collated
externally, and passed in as a sort of configuration file.

It would save typing for regularly repeated occurrences.
<sub>
<el>W3C</el>
<use>World Wide Web Consortium</use>
</sub>

or something similar? 

>>> Proposed disposition:  Rejected
>>> 
>>> This request is similar to some earlier work by the Voice Browser
>>> Working Group on a standardized lexicon format (containing
>>> pronunciations for tokens and phrases). Your request is one that
>>> might best be considered for that effort if and when it re-activates.
>>> We encourage you to resubmit this request to the Working Group at
>>> that time.
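
Until such a lexicon format exists, the external collation requested
above can be approximated with a preprocessing step. The following is
a minimal sketch, not part of the specification: the SUBSTITUTIONS
table and the apply_substitutions function are hypothetical names, and
the output assumes the <sub alias="..."> form of the element as
defined in SSML 1.0.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical external table; in practice it could be loaded from a
# shared configuration file so that it is typed only once.
SUBSTITUTIONS = {"W3C": "World Wide Web Consortium"}


def apply_substitutions(text, table):
    """Wrap each whole-word occurrence of a table key in <sub alias="...">."""
    if not table:
        return escape(text)
    # Try longer keys first so overlapping entries resolve predictably.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    out = []
    pos = 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches so the result stays valid XML.
        out.append(escape(text[pos:match.start()]))
        token = match.group(1)
        out.append("<sub alias=%s>%s</sub>" % (quoteattr(table[token]), escape(token)))
        pos = match.end()
    out.append(escape(text[pos:]))
    return "".join(out)


if __name__ == "__main__":
    print(apply_substitutions("The W3C publishes SSML.", SUBSTITUTIONS))
```

The table lives outside the document, so regularly repeated
expansions are maintained in one place and applied before the text
reaches the synthesis processor.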


2.8 break element.
  A refinement on this would be the ability to explicitly state
the required duration for various punctuation elements and other
break types (paragraph, sentence).

Again suggest this be externally configurable, for re-use optimisation.

>>> Proposed disposition:  Rejected
>>> 
>>> This concept was considered for SSML 1.0 but rejected.
>>> Rather, we encourage the use of style sheets or transformations to
>>> enable this macro-like behavior. It is possible that future versions
>>> of SSML beyond version 1.0 could permit default value setting for items
>>> such as paragraph and sentence prosody, but this kind of manipulation
>>> today is discouraged by most commercial synthesis engine developers
>>> on anything other than the occasional basis enabled by the <break> element.
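
As one illustration of the transformation approach mentioned above,
a preprocessing step could insert explicit <break> elements after
punctuation using externally configured durations. This is a rough
sketch under stated assumptions: the BREAK_MS table, its values, and
the insert_breaks function are hypothetical, and the output assumes
the time attribute of <break> as defined in SSML 1.0. It treats every
listed character as a pause point, so a period inside a number such
as "3.5" would also receive a break.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical per-punctuation pause durations in milliseconds; these
# values are illustrative only and could be read from a shared file.
BREAK_MS = {",": 200, ";": 300, ".": 500}


def insert_breaks(text, durations):
    """Append <break time="Nms"/> after each configured punctuation mark."""
    if not durations:
        return escape(text)
    # Build a character class from the configured punctuation marks.
    pattern = re.compile("([" + re.escape("".join(durations)) + "])")
    out = []
    # With a capturing group, split() keeps the punctuation as list items.
    for part in pattern.split(text):
        out.append(escape(part))
        if part in durations:
            out.append('<break time="%dms"/>' % durations[part])
    return "".join(out)


if __name__ == "__main__":
    print(insert_breaks("Hello, world.", BREAK_MS))
```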


2.10 Usage note 1. Could be confusion between this and 3.2.
If the default is to pause conversion till the audio is complete,
then it should be explicitly stated here. I support that requirement btw.

>>> Proposed disposition:  Accepted
>>> 
>>> We have removed the Future Study text from the document. Playback
>>> of recorded audio occurs in sequence with preceding and following
>>> synthesis, matching what you prefer. To obtain background playing,
>>> mixing, etc. we would recommend using SMIL.


2.12 usage note. Why hasn't a namespace been explicitly called up?
  This would then nullify the requirement stated in 5.1 (I can't see
any need for that requirement. Is it justifiable?)

>>> Proposed disposition:  Accepted
>>> 
>>> The most recent draft of the specification contains a namespace
>>> definition and more careful conformance language with respect to
>>> non-standard extensions.


3.6 Value.

I suspect that the overall impact of this may be achieved by a simple
XSLT transform anyway, which may make this redundant?

>>> Proposed disposition:  Accepted
>>> 
>>> We have removed this text from the specification. Such functionality
>>> is expected to be achieved through the use of style sheets
>>> (ACSS/XSLT), as you suggest.


My only final comment is where can I find an implementation :-)
Good spec.

Regards DaveP
(AC, RNIB)

Received on Friday, 8 August 2003 20:11:49 UTC