Comment on Pronunciation Lexicon Markup Requirements

From: Richard Sproat <rws@research.att.com>
Date: Wed, 21 Mar 2001 15:42:37 -0500
Message-Id: <200103212042.PAA07971@tabasco.research.att.com>
To: frank.scahill@bt.com
Cc: www-voice@w3.org

I have one comment on the following point:

       8.2 Prefix/Suffix morphological rules

       In some situations the explicit specification of all the
       morphological variants of a word can lead to extremely large
       lexicons. A standard scheme for providing prefix and suffix
       morphological rules would enable more compact lexicons. However
       it is felt that the most common use of the pronunciation
       lexicon markup will be for proper nouns where morphological
       variance is less of an issue, and that standardisation of
       morphological rules will be too difficult to achieve in a first
       draft. Off-line tools may provide mechanisms for generating
       morphological variants.

It is likely that proper names would form a sizeable portion of the
words that users might like to mark up, but it is not true in general
that proper names do not undergo morphological variation.

For example, in Russian, many personal names undergo the same
inflectional processes as nouns, so that to correctly enter the
pronunciation of a Russian surname, for instance, one might end up
having to enter a whole bunch of morphological variants too.
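
To make this concrete, here is a minimal sketch in Python of how a single
suffix table could stand in for the explicit variants. The table name, the
function name, and the transliterated endings are illustrative assumptions
and do not come from the requirements draft; the endings cover only the
masculine singular forms of surnames in -ov/-ev/-in.

```python
# Hypothetical suffix table (an assumption for illustration): masculine
# singular case endings for Russian surnames in -ov/-ev/-in, transliterated.
MASC_SURNAME_SUFFIXES = {
    "nominative": "",
    "genitive": "a",
    "dative": "u",
    "accusative": "a",
    "instrumental": "ym",
    "prepositional": "e",
}


def surname_variants(stem, suffixes=MASC_SURNAME_SUFFIXES):
    """Return a mapping from case name to inflected form for one surname."""
    return {case: stem + suffix for case, suffix in suffixes.items()}


variants = surname_variants("Ivanov")
# Distinct written forms that would otherwise each need a lexicon entry.
distinct_forms = sorted(set(variants.values()))
```

Without such a rule, each distinct form in `distinct_forms` would need its
own lexicon entry, and the feminine and plural forms would add more.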

I agree that coming up with a scheme for specifying morphological
variants for arbitrary languages is going to be difficult, but
unfortunately the fact that the markup might mostly be used for names
doesn't really save us.

-- 
Richard Sproat               Human/Computer Interaction Research
rws@research.att.com         AT&T Labs -- Research, Shannon Laboratory
Tel: +1-973-360-8490         180 Park Avenue, Room B207, P.O.Box 971
Fax: +1-973-360-8809         Florham Park, NJ 07932-0000
----------------http://www.research.att.com/~rws/-----------------------
Received on Wednesday, 21 March 2001 15:43:26 GMT