RE: Pronunciation Lexicon Markup Requirements from Alex.Monaghan@Aculab.com on 2001-04-12 (www-voice@w3.org from April to June 2001)

From: <Alex.Monaghan@Aculab.com>
Date: Thu, 12 Apr 2001 13:46:56 +0100
To: frank.scahill@bt.com, www-voice@w3.org
Message-ID: <0AEF0EB21F09D211AE4E0080C82733BF0211D275@mailhost.aculab.com>
dear frank,
i can see the rationale behind all your responses, except the question of
"vendor-specific alphabets" (6.7).
my point was that not ALL vendor alphabets can be supported: do you allow
each vendor to support optional additional alphabets (which does not require
a standard, so why include it in this one?), or do you insist that everyone
who wishes to conform to the standard must support the Microsoft, Apple, IBM
and SUN alphabets (thus effectively discriminating against everyone else)?
it's a difficult question, i think!
				alex.

> -----Original Message-----
> From:	frank.scahill@bt.com [SMTP:frank.scahill@bt.com]
> Sent:	12 April 2001 11:42
> To:	alex.monaghan@Aculab.com; www-voice@w3.org
> Subject:	RE: Pronunciation Lexicon Markup Requirements
> 
> Alex,
>    Thanks for your comments; here are some responses.
> 
> > 2.1 - some of the issues which this document treats as unresolved (e.g.
> > pronunciation alphabets and suprasegmental information) have already been
> > decided for the Speech Synthesis Mark-up spec: interoperability presumably
> > requires that these documents concur.
> 
> Yes, these documents will concur. Both SSML and GrammarML include some
> aspects of pronunciation specification that predate this work on a
> standard lexicon format. The development of the pronunciation lexicon
> markup will build on what has already been done but may ultimately result
> in changes to the SSML and GrammarML.
> 
> > 4.3 - syntactic category is only a "should have", but if this is absent
> > i don't see how the treatment of homographs (4.6, a "must have") can be
> > achieved.
> The reason syntactic category is only a "should have" is that it may prove
> difficult to standardise the set of allowable values for this category.
> The more loosely defined "information field" (R4.4) was made a "must have"
> to provide a way to distinguish homographs, though not necessarily via a
> standardised set of categories. This may lead to vendor-specific
> interpretations of what is in the information field. Clearly this is less
> than ideal, but the priorities were set according to what was thought
> achievable for a first draft.
> 
> > 5.10 - can we please distinguish between acronyms (those items, such as
> > DEC or NASA, which are made up of capital letters but are pronounced as a
> > word) and other sequences of capital letters such as all the examples
> > given in this draft
>  
> This requirement was intended to mean a shorthand pronunciation mechanism
> for acronyms that could be specified using other graphemes. Examples such
> as NASA would need to be represented using the full-blown pronunciation
> mechanism; however, examples such as BT could be represented as "b t" (or
> even "bee tea") and rely, as you say, on there being an existing
> pronunciation for each letter/word. But a mechanism is still required to
> indicate that "BT", "bt" and "b t" are equivalent; as you point out in
> your examples, attempting to rely on some standard rules for handling of
> case is inadequate.
> 
> >6.7 - WHY?! 
> The requirements ask for vendor-specific alphabets in addition to the
> standard pronunciation alphabets. The issue of whether vendors must
> support the standard pronunciation alphabets will be addressed by the
> compliance statement in the specification itself, which has yet to be
> defined. The intention is that the requirements cover the case where
> application developers wish to use a standard alphabet for content
> portability or where application developers wish to use a vendor-specific
> alphabet for legacy or other reasons.
> 
> 
> Regards
> Frank
Received on Thursday, 12 April 2001 08:47:00 UTC