- From: Masayasu Ishikawa <mimasa@w3.org>
- Date: Mon, 11 Sep 2000 12:47:45 +0900
- To: www-voice@w3.org
Hello,

I'm writing on behalf of the W3C Internationalization Working Group (I18N WG). The I18N WG recently held a face-to-face meeting and reviewed the "Speech Synthesis Markup Language Specification for the Speech Interface Framework", published on 08 August 2000.

    http://www.w3.org/TR/2000/WD-speech-synthesis-20000808

The following is a list of comments related to i18n. Other, non-i18n comments will be sent separately.

==========

Problem with the document itself

The specification is written in XHTML 1.0 Transitional, served as "text/html; charset=iso-8859-1", and includes the following line:

    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

but it fails to add

    <?xml version="1.0" encoding="ISO-8859-1"?>

at the beginning of the document.

2.2 "xml:lang" Attribute: Language

The "xml:lang" attribute is not defined at all in any of the elements in the DTD! This is a very serious problem, and must be fixed.

The spec says:

    Following the XML convention, languages are indicated by an "xml:lang" attribute on the enclosing element with the value following RFC 1766 to define language codes.

The spec should not just mention RFC 1766; rather, it should state this the way the XML spec does. Note that the XML 1.0 spec has been modified in this respect. Please refer to E73 of the XML 1.0 Specification Errata [1], and also to "2.12 Language Identification" [2] of the XML 1.0 Second Edition.

[1] http://www.w3.org/XML/xml-19980210-errata#E73
[2] http://www.w3.org/TR/2000/WD-xml-2e-20000814#sec-lang-tag

The spec also says:

    Language information is inherited down the document hierarchy, i.e. it has to be given only once if the whole document is in one language, and language information nests, i.e. inner attributes overwrite outer attributes.

But "it has to be given only once" is a bit too strict a restriction. According to this definition, the following example would be invalid:

    <speak xml:lang="en-US">
      ... English words ...
      <sayas xml:lang="en-US" sub="World Wide Web Consortium">W3C</sayas>
      ... English words ...
    </speak>

But we don't think this is harmful. In fact, the spec says in "Usage note 3" that:

    Where the "xml:lang" value is the same as the inherited value there is no need for any changes in the voice or prosody.

This is true, so we don't think it's necessary to prohibit more than one occurrence of the same value, even if the whole document is in one language. Of course, in general there's no need to duplicate the same value. But suppose, for example, that someone has an XSLT stylesheet that transforms every occurrence of "W3C" to "<sayas xml:lang="en-US" sub="World Wide Web Consortium">W3C</sayas>" regardless of the primary language of the document. It would be much easier to just retain the "xml:lang" attribute on that element than to check whether the whole document is in "en-US" and, if so, remove the "xml:lang" attribute from that element. What's the rationale for this restriction?

2.4 "sayas" Element

The spec defines the pronunciation type "sub" as:

    * sub: contained text is substituted for pronunciation with the specified text. This allows a document to contain both a spoken and written form.

This is quite similar to the purpose of ruby, so it might be interesting to study the interoperability with the Ruby Annotation spec [3].

[3] http://www.w3.org/TR/ruby

Also, the spec currently defines "sub" as an attribute, but sometimes it might be desirable to add markup to the substituted text. For example, someone might want to mark up

    <sayas sub="UCS Transformation Format">UTF</sayas>

and still want to specify that "UCS" is an acronym, but the current syntax doesn't allow this kind of markup. Likewise, if the substituted text is multilingual, you can't specify a change of language within an attribute value. Note that, because of this kind of consideration, ruby markup uses an element for the ruby annotation, though an earlier proposal used an attribute.
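The limitation of the attribute form in 2.4 can be checked with any XML parser. The following is a minimal Python sketch, assuming the standard xml.etree.ElementTree module; the inner <acronym> markup is hypothetical and is used only for illustration, it is not syntax taken from the draft.

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if the string parses as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Attribute form from the draft: well-formed, but "UCS" is plain characters.
plain = '<sayas sub="UCS Transformation Format">UTF</sayas>'

# Hypothetical attempt to mark up "UCS" inside the attribute value.
# A literal "<" is not allowed in an attribute value, so this is rejected.
nested = '<sayas sub="<acronym>UCS</acronym> Transformation Format">UTF</sayas>'

# Escaping the markup makes the document well-formed again, but the parser
# then sees ordinary text in the attribute, not child elements.
escaped = ('<sayas sub="&lt;acronym&gt;UCS&lt;/acronym&gt; '
           'Transformation Format">UTF</sayas>')

print(is_well_formed(plain))
print(is_well_formed(nested))
elem = ET.fromstring(escaped)
print(repr(elem.get("sub")), "child elements:", len(elem))
```

In other words, an attribute value can only ever carry a flat string, so neither an acronym hint nor an "xml:lang" change can be attached to part of the substituted text.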

2.5 "phoneme" Element

In all examples, the "x" is missing in hexadecimal numeric character references. For example, LATIN SMALL LETTER TURNED ALPHA (U+0252) must be referenced as "&#x252;", not "&#252;". "&#252;" is LATIN SMALL LETTER U WITH DIAERESIS (U+00FC), which is definitely a different character.

An example notes that

    <!-- This example uses the Unicode IPA characters. -->
    <!-- Note: this will not display correctly on most browsers -->

but actually such a wrong example will not be displayed correctly on ALL browsers.

2.6 "voice" Element

In the examples at "Usage note 4", the spec uses unregistered language codes like "en-cockney" and "en-brooklyn". It would be better to use a registered one (e.g. "en-scouse") in examples. The IANA Registry of Language Tags can be found at:

    http://www.isi.edu/in-notes/iana/assignments/languages/

2.9 "prosody" Element

The rate attribute specifies the speaking rate in "words per minute", but the notion of "word" may differ across languages. Relative values like "fast", "medium", "slow" and "default" would be OK, but other relative values like "+10" and "-5.5" might need careful consideration.

5. DTD for the Speech Synthesis Markup Language

Other examples use an XML declaration like:

    <?xml version="1.0"?>

but the DTD uses the following XML declaration:

    <?xml version="1.0" encoding="ISO-8859-1"?>

However, this DTD only uses Basic Latin characters, and we don't see why it has to be encoded in ISO-8859-1, or why it has to differ from UTF-8 or UTF-16.

Also, it would be better to make this DTD available in machine-readable form, rather than just including it in the middle of the spec.

==========

Regards,
--
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium
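
As a footnote to the 2.5 comment on character references, the difference between the hexadecimal and the decimal form can be checked mechanically. This is a minimal Python sketch, assuming only the standard library; the helper name decode_reference is hypothetical, and the <phoneme> wrapper is used here only as a container element for the parser.

```python
import unicodedata
import xml.etree.ElementTree as ET


def decode_reference(ref: str) -> str:
    """Let an XML parser expand a single numeric character reference."""
    return ET.fromstring(f"<phoneme>{ref}</phoneme>").text


# "&#x252;" is hexadecimal 252 (U+0252); "&#252;" is decimal 252 (U+00FC).
for ref in ("&#x252;", "&#252;"):
    ch = decode_reference(ref)
    print(ref, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Dropping the "x" therefore does not produce an error; it silently selects a different, valid character, which is why the mistake is easy to miss.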
Received on Sunday, 10 September 2000 23:47:49 UTC