
Your comments on Character Model Fundamentals [LC043, LC044, LC046, LC047]

From: Richard Ishida <ishida@w3.org>
Date: Wed, 6 Oct 2004 11:28:46 +0100
To: <connolly@w3.org>
Cc: <www-i18n-comments@w3.org>
Message-Id: <20041006102846.3BEDC4F1B6@homer.w3.org>

Dear Dan,

Many thanks for your comments on the 3rd Last Call version of the Character Model for the World Wide Web 1.0: Fundamentals [1].  We appreciate the interest you have taken in this specification.

You can see the comments you submitted, grouped together, at http://www.w3.org/International/Group/2004/charmod1-lc/SortByOriginator.html#LC043
(You can jump to a specific comment in the table by adding its ID to the end of the URI.)

The following comments were accepted and edits were made along the lines you suggested. If you wish to say that you are satisfied or raise an issue, please reply to us within the next two weeks at mailto:www-i18n-comments@w3.org and copy w3c-i18n-ig@w3.org.
        LC046, LC047

PLEASE REVIEW the decisions for the following additional comments and reply to us within the next two weeks at mailto:www-i18n-comments@w3.org (copying w3c-i18n-ig@w3.org) to say whether you are satisfied with the decision taken. 
        LC043, LC044

Information relating to these comments is included below.

These comments relate to the editor's version at http://www.w3.org/International/Group/charmod-edit/charmod1.html

Best regards,
Richard Ishida, for the I18N WG


Decision: Partially accepted

We have changed the wording from:

C001 [S][I][C] Specifications, software and content MUST NOT >assume that there is< a one-to-one correspondence between characters and the sounds of a language.

to:

C001 [S][I][C] Specifications, software and content MUST NOT >require or depend on< a one-to-one correspondence between characters and the sounds of a language.

and have made the same change for C002 and C003. This avoids the issue that specifications, implementations, and content don't really make 'assumptions'.

As for conformance, we would first like to point out that all the conformance criteria in the Character Model are predicated on whether a given criterion actually applies to a given technology. So technology that does not deal with the auditory representation of language (i.e. most W3C specifications) is not affected by this criterion. Technology that is affected (e.g. VoiceXML and in particular SSML) can be checked.

If SSML, for example, tried to do text-to-speech conversion by defining a format for a table that could only associate single phonemes with single characters, it would very clearly not conform to the Character Model. But as you can check at http://www.w3.org/TR/2004/REC-speech-synthesis-20040907/#S3.1.9, SSML's definition of written-to-spoken correspondence using the <phoneme> element allows definitions on whole words or larger pieces of text, so it is conformant. With this example, I hope that we have shown that conformance of specifications can indeed be checked.

To be even more concrete, one could easily collect a series of examples (starting with those mentioned in the spec, such as "thing") where there is not a one-to-one correspondence between characters and phonemes, and check whether specs, implementations, etc. that deal with such correspondences can handle them.
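
As a rough illustration of such a check, here is a minimal Python sketch. It is not taken from the Character Model or from SSML. Only the word "thing" comes from the spec; the other words, the phoneme transcriptions, and the names EXAMPLES, CHAR_TABLE, per_character_convert, whole_word_convert and failing_examples are all assumptions made up for this example. It compares a converter that is limited to one phoneme per character with one that looks up whole words.

```python
# Hypothetical test data: (word, expected phoneme sequence).
# Only "thing" is mentioned in the spec; the rest is illustrative.
EXAMPLES = [
    ("thing", ["θ", "ɪ", "ŋ"]),       # 5 characters, 3 phonemes
    ("fish", ["f", "ɪ", "ʃ"]),        # 4 characters, 3 phonemes
    ("box", ["b", "ɒ", "k", "s"]),    # 3 characters, 4 phonemes
]

# A table that can only associate a single phoneme with a single character.
CHAR_TABLE = {
    "t": "t", "h": "h", "i": "ɪ", "n": "n", "g": "g",
    "f": "f", "s": "s", "b": "b", "o": "ɒ", "x": "k",
}


def per_character_convert(word):
    """Converter that depends on a one-to-one character/phoneme mapping."""
    return [CHAR_TABLE[ch] for ch in word]


def whole_word_convert(word):
    """Converter that maps whole words to phoneme sequences."""
    lexicon = dict(EXAMPLES)
    return lexicon[word]


def failing_examples(convert):
    """Return the words for which the converter gives the wrong phonemes."""
    return [word for word, expected in EXAMPLES if convert(word) != expected]


if __name__ == "__main__":
    print("per-character failures:", failing_examples(per_character_convert))
    print("whole-word failures:", failing_examples(whole_word_convert))
```

The per-character converter fails on every example because it always produces exactly as many phonemes as there are characters, while the whole-word lookup handles all of them. The whole-word lexicon here is trivially built from the test data itself, so the sketch only shows the shape of the check, not a realistic converter.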

Decision: Rejected

The definition for 'character' currently available in the document ("a character can be defined informally as a small logical unit of text") is too fuzzy to be directly useful in other specifications. Having a single, very precise definition of 'character' is not really feasible, because different kinds of specifications may need different definitions. Also, in C067, we advise using more specific terms where available. The wide range of ways to look at the phenomenon of a 'character', and to define the term 'character', should become obvious to the reader after reading Section 3 of the Character Model.

[1] The version of CharMod you commented on: 
[2] Latest editor's version (still being edited): 
[3] Last Call comments table, sorted by ID: 
Received on Wednesday, 6 October 2004 10:28:47 UTC
