Re: Notes on the say-as note

On Wed, 01 Jun 2005 10:54:23 +0200, Pawson, David  
<David.Pawson@rnib.org.uk> wrote:

>     -----Original Message-----
>     From: Eira Monstad
>    The intention of the working group who wrote the say-as note,
>    according to the paragraph from the note I quoted in my message.
>    > I always thought ISO was international?
>    It is international in its way, but its intended use is different.
>    ISO 8601 is meant to be used for reliable data interchange, not to
>    handle common use cases in texts intended for humans. The working
>    group has apparently acknowledged this problem, since they have
>    stated clearly in the say-as note that the time format is *not*
>    intended to be ISO 8601 compliant.
> And hence different from all other XML based date times?

Yes. As you can see in the note, this difference is clearly stated and  
explained.

> All I ask for is that the hour 24 is allowed, so that more
>    Norwegian/Danish texts are covered, just like they allow
>    non-ISO 8601 am/pm time formats to cover more English texts.
>
> And when another nation asks for their favourite exception?

Well, then one would have to consider the cost versus the benefit.

Remember that TTS systems always have language-specific rules for  
interpreting and speaking the text. An English-speaking TTS system might  
not make sense of 24 as a time, but a Norwegian-speaking one should  
absolutely be expected to. Then it would help if the defined say-as  
formats did not make the relevant markup invalid. IMHO, the current
solution to this and to a few other problems noted in my original message
is too English-centric.
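
To make the point concrete, here is a minimal sketch in Python of what
language-specific acceptance of hour 24 could look like. Everything in it is
hypothetical: the function name, the language codes and the separator rules
are my own assumptions for illustration, not taken from the say-as note or
from any TTS product. It also assumes hour 24 may be followed by any minute
value, which a real definition would have to decide on.

```python
import re

# Assumption for illustration: languages whose texts commonly use hour 24.
LANGS_ALLOWING_HOUR_24 = {"no", "da"}

# One or two hour digits, a ":" or "." separator, two minute digits.
_TIME_RE = re.compile(r"^(\d{1,2})[:.](\d{2})$")


def parse_clock_time(text, lang):
    """Return (hour, minute) for a 24-hour clock string, or None if invalid."""
    match = _TIME_RE.match(text.strip())
    if match is None:
        return None
    hour, minute = int(match.group(1)), int(match.group(2))
    if minute > 59:
        return None
    if hour <= 23:
        return (hour, minute)
    # Hour 24 is accepted only where the language rules expect it.
    if hour == 24 and lang in LANGS_ALLOWING_HOUR_24:
        return (hour, minute)
    return None
```

With rules like these, "24.00" is a valid time for a Norwegian voice and is
rejected for an English one, while "23:59" is accepted by both.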

Vendors can of course create new formats in their own namespaces, but I'm  
no big fan of vendor-specific "standards" (an oxymoron if I ever saw one).

>    If the say-as time format were to be ISO 8601 compliant, it would
>    be unsuitable for a very large number of human-readable texts,
> But could be reliably converted to whatever format, by a machine.

What help would that be if the text the user wants to hear can't be  
interpreted because there's no markup to cover it? This is the exact  
problem say-as is meant to solve.

>    thereby defeating the purpose of the say-as element. The idea is to
>    make the contained text easy to understand for a machine even though
>    it was written for a human.
>
> With the Norwegian use case, or some other use case?

All use cases.

>    I agree that following ISO 8601 is a very good idea if you are in
>    control of the time string, but this is about recognizing time
>    strings that were never intended to be machine readable in the
>    first place.
> How can you say that, when referring to authored content?

Because the intention of say-as is to make content intended for humans  
easier to interpret for a machine in order to speak it correctly. It's not  
meant to be a format for data interchange.

>    Well, the whole idea is that by marking up the text, you won't have
>    to guess what it means...
>
>
> Only if the markup is reliable. What format would you expect an
> author in Taiwan to use? The Norwegian form? An East coast states
> format etc etc. Reliable markup uses standards, not edge cases.

But the time string is part of the content, not part of the markup. Humans  
shouldn't adapt to the machines; it should be the other way around. Now,
machines aren't perfect, and sometimes we have to adapt to achieve what we  
want. But in this case, authors will write their content the way they want  
no matter what - the content authors will often not even have heard of  
SSML. So it's our choice - do we limit the allowed content and ignore the  
rest, giving the listening user a worse experience, or do we allow some  
edge cases where the cost is low and the benefit high - even if  
English-speaking users won't always notice the difference?

The markup doesn't have to be unreliable just because it doesn't follow  
ISO 8601 (which it already does not). The important part is that it is
well-defined, so that TTS systems know what to do when implementing  
support for it.


-- 
Eira Monstad
Core QA

Received on Wednesday, 1 June 2005 11:00:03 UTC