- From: Masayasu Ishikawa <mimasa@w3.org>
- Date: Mon, 11 Sep 2000 12:47:45 +0900
- To: www-voice@w3.org
Hello,
I'm writing on behalf of the W3C Internationalization Working Group
(I18N WG).
The I18N WG recently held a face-to-face meeting, and reviewed
the "Speech Synthesis Markup Language Specification for the Speech
Interface Framework", published on 08 August 2000.
http://www.w3.org/TR/2000/WD-speech-synthesis-20000808
The following is a list of comments related to i18n. Other non-i18n
related comments will be sent separately.
==========
Problem with the document itself
The specification is written in XHTML 1.0 Transitional, served as
"text/html; charset=iso-8859-1", and includes the following lines:
<meta http-equiv="Content-Type"
content="text/html; charset=iso-8859-1" />
but it lacks the XML declaration
<?xml version="1.0" encoding="ISO-8859-1"?>
at the beginning of the document.
2.2 "xml:lang" Attribute: Language
The "xml:lang" attribute is not defined at all in any of the elements
in the DTD! This is a very serious problem, and must be fixed.
The spec says:
Following the XML convention, languages are indicated by an
"xml:lang"attribute on the enclosing element with the value
following RFC 1766 to define language codes.
The spec should not just mention RFC 1766; rather, it should state the
requirement the way the XML spec does. Note that the XML 1.0 spec has
been modified in this respect; please refer to E73 of the XML 1.0
Specification Errata [1], and also "2.12 Language Identification" [2]
of the XML 1.0 Second Edition.
[1] http://www.w3.org/XML/xml-19980210-errata#E73
[2] http://www.w3.org/TR/2000/WD-xml-2e-20000814#sec-lang-tag
The spec also says:
Language information is inherited down the document hierarchy, i.e.
it has to be given only once if the whole document is in one
language, and language information nests, i.e. inner attributes
overwrite outer attributes.
But "it has to be given only once" is a bit too strict a restriction.
According to this definition, the following example would be invalid:
<speak xml:lang="en-US">
... English words ...
<sayas xml:lang="en-US" sub="World Wide Web Consortium">W3C</sayas>
... English words ...
</speak>
But we don't think this is harmful. Actually the spec says in "Usage
note 3", that:
Where the "xml:lang" value is the same as the inherited value there
is no need for any changes in the voice or prosody.
This is true, so we don't think it's necessary to prohibit more than
one occurrence of the same value, even if the whole document is in
one language.
Of course, in general there's no need to duplicate the same value. But
for example, if someone has an XSLT stylesheet to transform every
occurrence of "W3C" to "<sayas xml:lang="en-US" sub="World Wide Web
Consortium">W3C</sayas>" regardless of the primary language of the
document, it would be much easier to just retain the "xml:lang"
attribute on that element rather than checking whether the whole
document is in "en-US" and if so having to remove the "xml:lang"
attribute on that element.
What's the rationale of this restriction?
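To illustrate why the duplicate is harmless, here is a minimal Python
sketch of the inheritance rule quoted above. It uses the standard-library
ElementTree parser, not any SSML processor, and the helper name
effective_langs is hypothetical. The effective language of each element
comes out the same whether or not the inner "xml:lang" is present.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def effective_langs(elem, inherited=None):
    """Return [(tag, effective language)] for elem and its descendants."""
    # An inner attribute overrides the inherited value; otherwise inherit.
    lang = elem.get(XML_LANG, inherited)
    pairs = [(elem.tag, lang)]
    for child in elem:
        pairs.extend(effective_langs(child, lang))
    return pairs

# Same value repeated on the inner element (the example above).
redundant = ET.fromstring(
    '<speak xml:lang="en-US">English words '
    '<sayas xml:lang="en-US" sub="World Wide Web Consortium">W3C</sayas>'
    ' English words</speak>'
)
# Language given only once, on the root.
minimal = ET.fromstring(
    '<speak xml:lang="en-US">English words '
    '<sayas sub="World Wide Web Consortium">W3C</sayas>'
    ' English words</speak>'
)

print(effective_langs(redundant))
print(effective_langs(redundant) == effective_langs(minimal))
```

Both documents resolve to "en-US" on every element, so a processor
following "Usage note 3" would treat them identically.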
2.4 "sayas" Element
The spec defines the pronunciation type "sub" as:
* sub: contained text is substituted for pronunciation with the
specified text. This allows a document to contain both a spoken
and written form.
This is quite similar to the purpose of ruby, so it might be
interesting to study the interoperability with the Ruby Annotation
spec [3].
[3] http://www.w3.org/TR/ruby
Also, currently the spec defines "sub" as an attribute, but sometimes
it might be desirable to add markup to the substituted text. For
example, someone might want to mark-up
<sayas sub="UCS Transformation Format">UTF</sayas>
and still want to specify that "UCS" is an acronym, but the current
syntax doesn't allow this kind of markup. Or, if the substituted text is
multilingual, you can't specify the change of language within an
attribute value. Note that because of such considerations, ruby markup
uses an element for the ruby annotation, though an earlier proposal
used an attribute.
2.5 "phoneme" Element
In all examples, "x" is missing in hexadecimal numeric character
references. For example, LATIN SMALL LETTER TURNED ALPHA (U+0252) must
be referenced as "&#x252;", not "&#252;". "&#252;" is LATIN SMALL
LETTER U WITH DIAERESIS (U+00FC), which is definitely a different
character. An example notes that
<!-- This example uses the Unicode IPA characters. -->
<!-- Note: this will not display correctly on most browsers -->
but in fact such an incorrect example will not display correctly on
ANY browser.
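The difference between the hexadecimal and decimal forms can be checked
with a short Python sketch. It assumes the standard-library
html.unescape function, which follows HTML rules rather than XML rules
but resolves these two references the same way; the helper name
describe is hypothetical.

```python
import html
import unicodedata

def describe(ref):
    """Resolve a numeric character reference to (code point, Unicode name)."""
    ch = html.unescape(ref)
    return "U+%04X" % ord(ch), unicodedata.name(ch)

# With the "x" the digits are read as hexadecimal, without it as decimal.
print(describe("&#x252;"))  # ('U+0252', 'LATIN SMALL LETTER TURNED ALPHA')
print(describe("&#252;"))   # ('U+00FC', 'LATIN SMALL LETTER U WITH DIAERESIS')
```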
2.6 "voice" Element
In the examples in "Usage note 4", the spec uses unregistered language
codes like "en-cockney" and "en-brooklyn". It would be better to use
registered ones (e.g. "en-scouse") in examples. The IANA Registry of
Language Tags can be found at:
http://www.isi.edu/in-notes/iana/assignments/languages/
2.9 "prosody" Element
The rate attribute specifies the speaking rate in "words per minute",
but the notion of "word" may differ across languages. Relative values
like "fast", "medium", "slow", "default" would be OK, but other
relative values like "+10" and "-5.5" might need careful
consideration.
5. DTD for the Speech Synthesis Markup Language
Other examples use an XML declaration like:
<?xml version="1.0"?>
but the DTD uses the following XML declaration:
<?xml version="1.0" encoding="ISO-8859-1"?>
However, this DTD only uses Basic Latin characters, and we don't see
why it has to be encoded in ISO-8859-1 rather than in UTF-8 or UTF-16,
which need no encoding declaration.
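As a rough check of this point, the following Python sketch shows that
text limited to Basic Latin characters produces exactly the same bytes
in UTF-8 and ISO-8859-1, so declaring ISO-8859-1 gains nothing for such
a file. The DTD fragment and the helper name encodes_identically are
hypothetical and only for illustration.

```python
def encodes_identically(text, encodings=("utf-8", "iso-8859-1")):
    """True if text yields the same byte sequence in every listed encoding."""
    return len({text.encode(enc) for enc in encodings}) == 1

# Hypothetical DTD fragment containing only Basic Latin characters.
dtd_fragment = "<!ELEMENT sayas (#PCDATA)>"

print(encodes_identically(dtd_fragment))  # True
# A character outside Basic Latin (U+00FC) is where the encodings diverge.
print(encodes_identically("\u00fc"))      # False
```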
Also, it would be better to make this DTD available in
machine-readable form, rather than just including it in the middle of
the spec.
==========
Regards,
--
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium
Received on Sunday, 10 September 2000 23:47:49 UTC