
idiolectal pronunciation

From: Nick Levinson <nick_levinson@yahoo.com>
Date: Sun, 20 Sep 2009 15:20:07 -0700 (PDT)
Message-ID: <719281.30666.qm@web33507.mail.mud.yahoo.com>
To: www-voice@w3.org

We need a way to classify a group of words that is too numerous to itemize, and to assign a set of pronunciation rules to that group.

Idiolectal pronunciations that have no grammatical or contextual cues other than who says them require selectors. (An idiolect is the language of one person; a dialect is what is common to the idiolects of two or more people who are of the same language community. Because English is spoken by a very large language community, its dialects combine into several standards, e.g., three standards in the U.S. and others in some other nations.)

This is for PLS 1.0, <http://www.w3.org/TR/pronunciation-lexicon/>.

I knew someone who had her own pronunciation of one word that often cropped up in her speech because of both her work and her nonwork life. Other than that word, while I'd recognize her voice, I don't recall a distinguishing accent. Were anyone to write a dialogue, we'd need a way to signify that the particular spelling represents a particular pronunciation only sometimes, a way that cannot be determined by syntax, sense, or part of speech.

A more common case in U.S. English is _nuclear_. Enough people repeatedly mispronounce it, including a former U.S. President, to constitute a use case. One big-city mayor and large-business owner consistently says /anyways/ in public where the orthography probably would be "anyway", and so do other people.

Aliases can be used, but we need a way to signify when to invoke them.

If role and xml:id don't serve that purpose, I propose the class and id attributes as used in (X)HTML. A class or id attribute would appear in the original Web page. The .pls document would specify one pronunciation for a given class or id and a default pronunciation when no class or id applies.
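
The lookup rule proposed above (one pronunciation per class or id, and a default when neither applies) could be sketched roughly as follows. This is only an illustration: PLS 1.0 defines no class or id selectors, so the `LEXICON` structure, the `resolve` function, the choice to let id take precedence over class, and the transcriptions are all assumptions made for the example.

```python
# Hypothetical in-memory stand-in for a .pls document extended with
# class/id selectors. Nothing here is defined by PLS 1.0.
LEXICON = {
    "nuclear": {
        "default": "ˈnuːkliər",            # illustrative default transcription
        "class": {"chris": "ˈnuːkjələr"},  # illustrative idiolectal variant
        "id": {},                          # no per-id overrides in this sketch
    },
}


def resolve(word, cls=None, elem_id=None):
    """Return a pronunciation for word: id match, then class match, then default."""
    entry = LEXICON.get(word.lower())
    if entry is None:
        return None  # word is not in the lexicon at all
    if elem_id is not None and elem_id in entry["id"]:
        return entry["id"][elem_id]
    if cls is not None and cls in entry["class"]:
        return entry["class"][cls]
    # No selector applies, so fall back to the default pronunciation.
    return entry["default"]
```

Calling `resolve("nuclear", cls="chris")` would pick the variant, while `resolve("nuclear")` or an unknown class would fall back to the default.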

With the adoption of the id attribute, I propose also the adoption of the name attribute, to be reserved and denied any meaning. In (X)HTML, name is being phased out but is still widely deployed in older pages and in pages written to be compatible with older browsers, where typically both id and name are assigned the same values. Therefore, PLS cannot allow a different meaning for name than for id, unless it ignores name's value altogether.

The class would be handy for a dialogue, as in a movie script. In a dialogue, each person has their own paragraphs, so each (X)HTML paragraph could be assigned a class, e.g. <p class="chris">. Selected words within the paragraphs could be given matching classes, e.g., <phoneme class="chris">.
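
As a rough sketch of the dialogue case, the following reads a small marked-up fragment and picks a pronunciation for each marked word by its class. The `PRONUNCIATIONS` table, the sample `DIALOGUE`, the function name, and the transcriptions are hypothetical; in the proposal the per-class entries would come from the .pls document rather than from a Python dictionary.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-class pronunciations; the None key holds the default.
PRONUNCIATIONS = {
    "nuclear": {None: "ˈnuːkliər", "chris": "ˈnuːkjələr"},
}

# Sample dialogue: each speaker's paragraph and marked words share a class.
DIALOGUE = """
<div>
  <p class="chris">We toured the <phoneme class="chris">nuclear</phoneme> plant.</p>
  <p class="pat">The <phoneme class="pat">nuclear</phoneme> plant?</p>
</div>
"""


def marked_pronunciations(markup):
    """Return a list of (class, word, pronunciation) for each marked word."""
    root = ET.fromstring(markup)
    results = []
    for para in root.iter("p"):
        for mark in para.iter("phoneme"):
            word = (mark.text or "").strip().lower()
            by_class = PRONUNCIATIONS.get(word, {})
            cls = mark.get("class")
            # Use the class-specific entry if present, else the default.
            results.append((cls, word, by_class.get(cls, by_class.get(None))))
    return results
```

Here the word marked with class "chris" gets the variant, and the word marked with class "pat", which has no entry of its own, gets the default.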

Thank you.

-- 
Nick


      
Received on Sunday, 20 September 2009 23:56:25 GMT
