Re: Your comments on Character Model Fundamentals [LC071, LC073, LC077, LC078]

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Mon, 01 Nov 2004 15:40:42 +0100
To: "Richard Ishida" <ishida@w3.org>
Cc: <www-i18n-comments@w3.org>, w3c-i18n-ig@w3.org
Message-ID: <41bc4a95.611161874@smtp.bjoern.hoehrmann.de>

* Richard Ishida wrote:
>The following comment was accepted and edits were made along
>the lines you suggested. If you wish to say that you are satisfied
>or raise an issue, please reply to us within the next two weeks
>at mailto:www-i18n-comments@w3.org and copy w3c-i18n-ig@w3.org.
>        LC073



>Going back to the text of C016: "When designing a new protocol,
>format or API, specifications SHOULD require a unique character
>encoding.", we would like to point out that it doesn't require
>APIs across languages to use the same encoding. We would also
>like to point out that e.g. the DOM1 spec is very careful to
>avoid using the word API for DOM itself (see e.g.
>http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/). In

(I would suggest looking at more recent documents such as DOM Level 3
Core, which refers to the "DOM API" quite often.)

>addition, we would like to point out that C017, "When basing a
>protocol, format, or API on a protocol, format, or API that
>already has rules for character encoding, specifications SHOULD
>use rather than change these rules." would provide strong
>justification for a spec like DOM that would want to leave the
>question of character encoding to each language binding.

This does not satisfy me. The requirement is for APIs, not for language
bindings, so your remarks (which concern language bindings) do not
apply. Nothing in charmod suggests an interpretation like the one
above; it is very clear that the text attempts to encourage the
requirement in the DOM specifications to which I object, because
implementation experience clearly demonstrates that the requirement is
flawed. In fact, you correctly point out that most major languages now
have their own model for dealing with Unicode, which makes it even
more inappropriate to impose such requirements at the API level. The
requirement should thus be changed to

  Specifications of APIs SHOULD leave the question of character
  encoding to each language binding rather than requiring a unique
  character encoding.
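
To illustrate why the choice of encoding is naturally a binding-level
matter, here is a minimal sketch in Python. It is not taken from charmod
or from the DOM specifications; the sample string and variable names are
assumptions made up for this example. It measures the same text in the
units that different language bindings would typically expose.

```python
# Hypothetical sample text: 'ö' is U+00F6, the last character is U+1D11E
# (MUSICAL SYMBOL G CLEF), which lies outside the Basic Multilingual Plane.
text = "G\u00F6del \U0001D11E"

# Length as a binding based on code points would report it.
code_points = len(text)

# Length as a binding based on UTF-16 code units would report it
# (two bytes per code unit, no byte order mark with "utf-16-le").
utf16_units = len(text.encode("utf-16-le")) // 2

# Length as a binding based on UTF-8 bytes would report it.
utf8_bytes = len(text.encode("utf-8"))

print(code_points, utf16_units, utf8_bytes)  # prints: 7 8 11
```

The three counts differ for the same text, so lengths and offsets
defined in terms of one encoding at the API level have to be converted
in every binding whose native string model uses another.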

>Decision: Rejected We have rejected this request because we feel that
>the uppercase string U+HHHH is inferior in appearance compared to the
>string U+hhhh and that the latter is more common when giving an example
>of Unicode Scalar Values.

I obviously disagree. U+hhhh looks very odd as a template for U+00F6,
and it encourages writing U+00f6, which I do not want. I am not too
concerned, though...
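
For what it is worth, the two notations can be compared with a short
Python sketch. The helper name `u_plus` is hypothetical and exists only
for this illustration.

```python
def u_plus(ch: str, upper: bool = True) -> str:
    """Format one character as U+ followed by at least four hex digits."""
    digits = format(ord(ch), "04X" if upper else "04x")
    return "U+" + digits

print(u_plus("\u00F6"))               # prints: U+00F6
print(u_plus("\u00F6", upper=False))  # prints: U+00f6
```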

>Decision: Rejected We think that it is a somewhat rare edge case that
>doesn't warrant additional complication in the conformance section. In
>the general case, every specification should try to conform to the
>character model, whether it also conforms to some other specification
>or not. In many cases, conformance to the character model also will
>come naturally for a derived spec.

It is increasingly common for specifications to be re-used by other
specifications or to allow for extensions, and it is important to
ensure that they operate on common ground in terms of their i18n
aspects. It is important, e.g., for extension designers that
specifications clearly require these extensions to conform to charmod,
as such designers are unlikely to be aware of charmod or to know all
the details they should know when designing the extension. It might be
true that within W3C it is ensured that specifications conform to the
document, but charmod explicitly sets out to be useful for non-W3C
specifications, for which this is much less likely. I thus object to
your response. I disagree that the additional "complication" is
unwarranted; I do not even think that it gets more complicated, but
rather much clearer.
Received on Monday, 1 November 2004 14:41:28 UTC