
Your comments on Character Model Fundamentals [LC054, LC055, LC056, LC057, LC058]

From: Richard Ishida <ishida@w3.org>
Date: Wed, 6 Oct 2004 11:17:05 +0100
To: "'Karl Dubost'" <karl@w3.org>
Cc: <www-i18n-comments@w3.org>, <www-qa-wg@w3.org>
Message-Id: <20041006101705.42CF24EFB2@homer.w3.org>

Dear Karl,

Many thanks for your comments on the 3rd Last Call version of the Character Model for the World Wide Web 1.0: Fundamentals [1].  We appreciate the interest you have taken in this specification.

You can see the comments you submitted, grouped together, at http://www.w3.org/International/Group/2004/charmod1-lc/SortByOriginator.html#LC054
(You can jump to a specific comment in the table by adding its ID to the end of the URI.)


PLEASE REVIEW the decisions for the following comments and reply to us within the next two weeks at mailto:www-i18n-comments@w3.org (copying w3c-i18n-ig@w3.org) to say whether you are satisfied with the decision taken. 
        LC054, LC055, LC056, LC057, LC058

Information relating to these comments is included below. You will receive notification of decisions on remaining comments at a later date.

These comments relate to the editor's version at http://www.w3.org/International/Group/charmod-edit/charmod1.html

Best regards,
Richard Ishida, for the I18N WG




DECISIONS REQUIRING A RESPONSE
==============================

LC054
http://www.w3.org/International/Group/2004/charmod1-lc/SortByOriginator.html#LC054
Decision: Partially accepted. For 3.2, C001; 3.3, C002; 3.4, C005; 3.6, C009: replaced "MUST NOT assume" with "MUST NOT require or depend on".

We have changed the wording from: "C001 [S][I][C] Specifications, software and content MUST NOT >>assume that there is<< a one-to-one correspondence between characters and the sounds of a language." to "C001 [S][I][C] Specifications, software and content MUST NOT >>require or depend on<< a one-to-one correspondence between characters and the sounds of a language."

This avoids the issue that specifications, implementations, and content don't really make 'assumptions'.

As for conformance, we would first like to point out that all the conformance criteria in the Character Model are predicated on whether a given criterion actually applies to a given technology. So technology that does not deal with the auditory representation of language (i.e. most W3C specifications) is not affected by this criterion. Technology that is affected (e.g. VoiceXML and in particular SSML) can be checked.

If SSML, for example, tried to do text-to-speech conversion by defining a format for a table that would only associate single phonemes with single characters, it would very clearly not conform to the character model. But as you can check at http://www.w3.org/TR/2004/REC-speech-synthesis-20040907/#S3.1.9, SSML's definition of written-to-spoken correspondence using the <phoneme> element allows definitions on whole words or larger pieces of text, so it is conformant. With this example, I hope that we have shown that conformance of specifications can indeed be checked.

To be even more concrete, one could easily collect a series of examples (starting with those mentioned in the spec, such as "thing") where there is not a one-to-one correspondence between characters and phonemes, and check whether the specs, implementations and so on that deal with such correspondences can handle them.
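
As a rough illustration of such a check, here is a minimal Python sketch. It assumes a hypothetical lookup table whose entries may span several characters and several phonemes. The table, the function names and the IPA values are our own illustrative assumptions; none of them come from the Character Model or from SSML.

```python
# Hypothetical grapheme-to-phoneme table. Keys may span several characters and
# values may hold several phonemes; the IPA values are illustrative assumptions.
TABLE = {
    "th": ["θ"],
    "ng": ["ŋ"],
    "i": ["ɪ"],
    "b": ["b"],
    "o": ["ɒ"],
    "x": ["k", "s"],
}


def align(text, table=TABLE):
    """Split text into (characters, phonemes) pairs by greedy longest match."""
    max_len = max(len(key) for key in table)
    pairs = []
    pos = 0
    while pos < len(text):
        for size in range(min(max_len, len(text) - pos), 0, -1):
            chunk = text[pos:pos + size]
            if chunk in table:
                pairs.append((chunk, table[chunk]))
                pos += size
                break
        else:
            # No entry of any length matched at this position.
            raise ValueError(f"no table entry for {text[pos]!r} at offset {pos}")
    return pairs


def is_one_to_one(text, table=TABLE):
    """True only if every matched unit is one character and one phoneme."""
    return all(len(chars) == 1 and len(phones) == 1
               for chars, phones in align(text, table))


print(align("thing"))          # three units for five characters
print(is_one_to_one("thing"))  # False
print(is_one_to_one("box"))    # False: "x" maps to two phonemes
```

A table format restricted to single-character keys and single-phoneme values could not represent either word, which is the kind of limitation the criterion is meant to rule out.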

As for implementability, there are a lot of text-to-speech engines, and a lot of speech detection engines, that do not require or depend on a one-to-one correspondence, so it is very clear that this can be implemented.

As for your point of "If the software implements only this language because it's a specific use for only this language", yes, such software would not conform to the character model. From the viewpoint of the character model, this is on purpose; in the age of the World Wide Web, it is a bad idea to create software that can handle only one language, and it is a bad idea to create software that has language-related issues hard-coded when it can easily be made configurable.



LC055
http://www.w3.org/International/Group/2004/charmod1-lc/SortByOriginator.html#LC055
Decision: Partially accepted. Our reply is basically the same as that for LC054.

We replaced "MUST NOT assume" with "MUST NOT require or depend on". We note that this is testable with very simple examples, some of which can be found in the spec itself. Implementations dealing with only a single language may not conform to the character model, and that is by design; it's the goal of the character model to make sure that specs and software can deal with as many languages as possible.


LC056
http://www.w3.org/International/Group/2004/charmod1-lc/SortByOriginator.html#LC056
Decision: Partially accepted. Our reply is basically the same as that for LC054.

We replaced "MUST NOT assume" with "MUST NOT require or depend on". We note that this is testable with very simple examples, some of which can be found in the spec itself. Implementations dealing with only a single language may not conform to the character model, and that is by design; it's the goal of the character model to make sure that specs and software can deal with as many languages as possible.


LC057
http://www.w3.org/International/Group/2004/charmod1-lc/SortByOriginator.html#LC057
Decision: Rejected. You write: ===> What's happening if you implement all western languages but not asian because the context of applications do not make it necessary. Do I still have to implement everything? If not how can I be conformant?

As we have already explained in our responses to LC054-56, the goal of the character model is to cover as many languages, scripts and characters as possible. On the WWW, you never know what input you will get. If an implementation blows up just because it is unable to do anything with Asian characters, that would be very bad. Please note that we do not require any particular sort order for any character; simply sorting 'unknown' characters by code point would be okay.
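
To illustrate the code point fallback, here is a minimal Python sketch. It assumes an implementation that has a tailored order only for the basic Latin letters and sorts everything else by code point after them. The names KNOWN_ORDER, char_key and sort_key are hypothetical, and placing unknown characters after known ones is our own choice for the example, not something the Character Model prescribes.

```python
# Hypothetical tailored order: this implementation only "knows" a-z.
KNOWN_ORDER = {ch: rank for rank, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}


def char_key(ch):
    """Rank known characters first; fall back to the code point otherwise."""
    rank = KNOWN_ORDER.get(ch.lower())
    if rank is not None:
        return (0, rank, ord(ch))
    # Unknown character: no failure, just order it by its code point.
    return (1, ord(ch), 0)


def sort_key(text):
    """Build a per-character key so whole strings can be compared."""
    return [char_key(ch) for ch in text]


words = ["zebra", "日本", "apple", "한국", "Äpfel"]
print(sorted(words, key=sort_key))
# ['apple', 'zebra', 'Äpfel', '日本', '한국']
```

The result is not linguistically ideal for the characters the implementation does not know, but the input is accepted and ordered deterministically instead of causing a failure.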


LC058
http://www.w3.org/International/Group/2004/charmod1-lc/SortByOriginator.html#LC058
Decision: Partially accepted. Our reply is basically the same as that for LC054.

We replaced "MUST NOT assume" with "MUST NOT require or depend on". We note that this is testable with very simple examples, some of which can be found in the spec itself. Implementations dealing with only a single language may not conform to the character model, and that is by design; it's the goal of the character model to make sure that specs and software can deal with as many languages as possible.




USEFUL LINKS
============
[1] The version of CharMod you commented on: 
http://www.w3.org/TR/2004/WD-charmod-20040225/
[2] Latest editor's version (still being edited): 
http://www.w3.org/International/Group/charmod-edit/charmod1.html
[3] Last Call comments table, sorted by ID: 
http://www.w3.org/International/Group/2004/charmod1-lc/
Received on Wednesday, 6 October 2004 10:17:07 GMT
