Re: "phonemes" property in the CSS3 Speech module

I believe (custom) language codes are the best solution for pronunciation hints. Hyphenation poses a similar problem and could be handled the same way.

  <html lang="en">
  <body lang="en-EU">
    <dialog lang="en-GB">
    <dt>Gavin</dt><dd lang="en-AU">My doctor said to eat 
      a <span lang="en-US">tomato</span> every day.</dd>
    <dt>Colm</dt><dd lang="en-IE"><em>My</em> doctor said to take 
      <em>my <span lang="en-GB">vitamin</span></em> pills every day.</dd>
    </dialog>
  </body>
  </html>

  @pronunciation "en" {src: url("/dict/en.prc");}
  @pronunciation "en-US", "en" {word-forms:
    tomato = "təˈmeɪtoʊ" /
    vitamin = "ˈvaɪtamɪn";
  }
  @pronunciation "en-GB", "en-AU", "en-EU" {word-forms:
    tomato = ipa(təˈmɑːtəʊ), sampa(t@"mA:t@U);
  }
  @pronunciation "en-GB", "en-EU" {word-forms:
    vitamin = "ˈvɪt.ə.mɪn";
  }
  @pronunciation "en-AU" {word-forms:
    vitamin = "ˈvaɪ.tə.mən";
  }
  @pronunciation "en-CA" {word-forms:
    tomato = "təˈmɛɪtoː", "təˈmætoː";
    /* two values are legal, but the second one is unused */
  }
  @pronunciation "en-X-nuspik" {word-forms:
    tomato = "ˈtɔməto" /
    vitamin = "ˈvɪˈtaːmɪn";
  }

  @hyphenation "de" {src: url("/dict/de.spl");
    word-forms: "Pä-per" /* not in dictionary, but common to all variants */;
  }
  @hyphenation "de-1901", "de-Latf" {word-forms:
    "Decke" = "Dek-ke" / /* mandatory ‘ck’ ligature in Fraktur split */
    "Liste" = "Li-ste" / /* mandatory ‘st’ ligature in Fraktur kept, 
    automatically includes long-s ‘ſ’ variant "Li-ſte" */
    /* using short form from here on */
    "Ma-gnet" / /* etymologic = morphologic, not phonemic */
    "Ma-gne-te";
  }
  @hyphenation "de-1996", "de-Latn" {word-forms:
    "De-cke" / "Lis-te" /
    "Mag-net", "Ma-gnet" / "Mag-ne-te", "Ma-gne-te"
  /*^ either option is valid, but only one of them should be used */;
  }
  @hyphenation "de-X-Paeper" {word-forms:
    "Dek-ke" / "Lis-te" / "Mag-net" / "Mag-ne-te";
  }
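
The rules above do not say how a processor would pick a rule for a given element language. One plausible reading is a fallback lookup that truncates the tag from the right (in the spirit of RFC 4647 lookup), so that "en-CA" falls back to "en" for words it does not list. The following Python sketch assumes that reading; the names RULES, fallback_chain and pronounce are hypothetical, and the table copies only part of the pronunciation rules from the example.

```python
# Hypothetical in-memory form of some of the @pronunciation rules above:
# lower-cased language tag -> {word: IPA string}.
RULES = {
    "en":    {"tomato": "təˈmeɪtoʊ", "vitamin": "ˈvaɪtamɪn"},
    "en-us": {"tomato": "təˈmeɪtoʊ", "vitamin": "ˈvaɪtamɪn"},
    "en-gb": {"tomato": "təˈmɑːtəʊ", "vitamin": "ˈvɪt.ə.mɪn"},
    "en-au": {"tomato": "təˈmɑːtəʊ", "vitamin": "ˈvaɪ.tə.mən"},
    "en-ca": {"tomato": "təˈmɛɪtoː"},
}


def fallback_chain(tag):
    """Yield the tag itself, then progressively shorter prefixes of it."""
    subtags = tag.lower().split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        # Assumption: a dangling single-letter subtag such as "x" is
        # dropped too, so "en-x-nuspik" falls back directly to "en".
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()


def pronounce(word, tag, rules=RULES):
    """Return the first pronunciation found along the fallback chain, or None."""
    for candidate in fallback_chain(tag):
        forms = rules.get(candidate, {})
        if word in forms:
            return forms[word]
    return None
```

With this table, pronounce("vitamin", "en-CA") finds nothing under "en-ca" and returns the "en" entry, while an unknown language such as "de" yields None.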

Due to time constraints, I probably didn’t follow BCP 47 and the current CSS drafts as closely as I should have. The IPA and SAMPA pronunciations were taken from Wiktionary.

U+2027 “Hyphenation Point” ‘‧’ or U+007C “Vertical Line” ‘|’ could be used instead of U+002D “Hyphen-Minus” ‘-’ to mark hyphenation opportunities, since a hyphen may occur naturally inside a word. ‘word-forms’ should be a last-resort property; ideally, other properties would be able to define regular hyphenation patterns.
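
To illustrate why a dedicated marker helps, here is a small Python sketch that turns a marked-up word form into the plain word plus its break offsets. The function name break_positions is hypothetical, and the sketch only covers forms whose spelling is unchanged at the break (so not the "Decke" = "Dek-ke" case).

```python
HYPHENATION_POINT = "\u2027"  # U+2027 HYPHENATION POINT


def break_positions(form, marker=HYPHENATION_POINT):
    """Split a marked-up form into (plain word, list of break offsets).

    Each offset is an index into the plain word at which a line break
    (with an inserted hyphen) would be allowed.
    """
    parts = form.split(marker)
    word = "".join(parts)
    offsets = []
    pos = 0
    for part in parts[:-1]:  # no break opportunity after the last part
        pos += len(part)
        offsets.append(pos)
    return word, offsets
```

With U+2027 as the marker, "well-be‧ing" keeps its literal hyphen and yields one break opportunity. With "-" as the marker, the same word written "well-be-ing" would lose the hyphen and come out as "wellbeing".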

Received on Friday, 4 February 2011 10:13:28 UTC