
Re: Your comments on Character Model Fundamentals [LC070, LC074, LC075, LC079, LC080, LC081, LC082, LC083, LC084, LC085, LC086, LC087, LC088, LC089]

From: Martin Duerst <duerst@w3.org>
Date: Thu, 28 Oct 2004 18:48:44 +0900
To: Bjoern Hoehrmann <derhoermi@gmx.net>, "Richard Ishida" <ishida@w3.org>
Cc: www-i18n-comments@w3.org, w3c-i18n-ig@w3.org

Hello Bjoern,

Here are just some personal comments and clarifications.

At 03:33 04/10/28, Bjoern Hoehrmann wrote:
 >* Richard Ishida wrote:
 >>The following comments were noted or accepted and edits were made
 >>along the lines you suggested. If you wish to say that you are
 >>satisfied or raise an issue, please reply to us within the next
 >>two weeks at mailto:www-i18n-comments@w3.org and copy w3c-i18n-ig@w3.org.
 >>        LC074, LC081, LC082, LC083, LC084, LC086, LC089

 >I am not satisfied with respect to LC083. You note:
 >  We have replaced 'mandate' with 'require' throughout the document.
 >  'Require' is well defined in RFC 2119. Please note that we are not
 >  using upper-case in this case, because we are using 'require'
 >  descriptively, rather than normatively, in our spec.
 >If you do not use the terms as defined in RFC 2119, the term is not
 >well-defined. The term is further used to spell out normative
 >requirements (e.g. in C016); lacking a definition, it is not clear what
 >the requirement actually means.

The phrase "descriptively rather than normatively" may not be very clear.
What we meant is that 'require' in these cases carries the same
RFC 2119 definition, but is not capitalized because it is not a
requirement on the direct target of the normative language (spec,
implementation, or content). Instead, it is either a requirement on
a secondary target that the first target is required (or recommended,
as the case may be) to impose, or a requirement on which the
requirement on the primary target depends.

It would look confusing, for example, if in C024, we wrote

"Content and software that label text data MUST use one of the
names REQUIRED by the appropriate specification..."
rather than
"Content and software that label text data MUST use one of the
names required by the appropriate specification".

 >LC084 is not marked as accepted but rather as "noted" without changes in
 ><http://www.w3.org/2004/02/charmod1-lastcall>, it is thus not clear to
 >me what your actual response to the issue is, I consider this comment
 >not addressed. I'll need to know your response to this issue in order to
 >consider whether your response to LC085 satisfies me.

The response is given in the table, and should also have been
sent to you:

We have decided to classify this comment as 'noted', which means that we 
think it raises a valid point, but does not merit changes to the current 
specification. With the example of XML, we have tried to make clear that 
rules that unambiguously lead to a determination of the character encoding 
to be used for decoding the document are not considered heuristics.

Whether it is a good idea to make the used character encoding depend on the 
way the document is loaded is a different issue, not addressed by C034, but 
such cases already exist (e.g. loading a document from a file system vs. 
serving it over the Web including meta information in HTTP headers). The 
case you mention, loading from a link in an existing document vs. 
independent loading, is just an extension of the above case.
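
To illustrate the distinction between unambiguous rules and heuristics, here is a minimal Python sketch. It is not taken from CharMod or from the XML specification: the function name determine_encoding, its parameters, and the exact precedence order are assumptions made for illustration, and the detection is much simpler than the procedure in XML 1.0 Appendix F (for example, it ignores UTF-32 and non-ASCII-compatible declarations). The point is only that each step is a fixed rule with a single outcome, and that the same bytes can legitimately decode differently depending on whether external metadata (such as an HTTP charset parameter) accompanies them.

```python
import codecs
import re
from typing import Optional

# Simplified pattern: an ASCII-compatible XML declaration at the very start.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def determine_encoding(data: bytes, http_charset: Optional[str] = None) -> str:
    """Pick an encoding by fixed precedence rules; no guessing from content."""
    # Rule 1 (assumed precedence): external metadata, e.g. an HTTP header.
    if http_charset:
        return http_charset.lower()
    # Rule 2: a byte order mark at the start of the data.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Rule 3: the encoding named in the XML declaration.
    match = _XML_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # Rule 4: the default when nothing else applies.
    return "utf-8"


doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'
print(determine_encoding(doc))                        # loaded from a file system
print(determine_encoding(doc, http_charset="UTF-8"))  # served with HTTP metadata
```

The two calls at the end model the file-system and HTTP cases mentioned above: the bytes are identical, but the rule that applies first differs.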

I hope this is explicit enough. If not, please tell us what you
don't understand.

 >LC086, I can't tell whether "We reworked many abbreviations" is
 >satisfactory, I would need to look at the text, but it should be
 >okay to assume it is.
 >LC089, okay.
 >>Decision: Partially accepted. We removed 'legacy encoding' as a formally
 >>defined term from CharMod Fundamentals. We will revisit this for CharMod
 >That is the opposite of what I've asked for! You would also need to
 >remove all instances of the term (and similar terms if any) in the
 >document for that to make sense, if you do that I could live with
 >the resolution.

I was under the impression that we had removed all uses, but
that doesn't seem to be the case. We'll have to check this.

 >>However, we disagree with your counterexample. The fact that an 'uppercase'
 >>function can take a single character, even an sz, as an argument in some
 >>cases doesn't prove that there are no cases where it will become
 >>necessary to hand over more than one character at a time to a function for
 >>proper uppercasing. Therefore, in general, both the arguments and the return
 >>type should be strings.
 >If there are cases where this would not work,

In internationalization, it is always dangerous to assume that
things work just because you don't know any exceptions.
Especially in cases like the one at hand, it's much easier
to just use a string argument than to try to check whether
there is really no exception, and be wrong in case one
overlooked something.

 >please change the example
 >in the specification to actually make sense.

It does make sense: it shows a specific example where the return type
needs to be a string.
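
As a minimal sketch of that point in Python (the wrapper name uppercase is hypothetical and only mirrors the 'uppercase' function discussed here; str.upper() is the standard library method doing the work): a single sz goes in, and two characters come out, so a single-character return type could not hold the result.

```python
def uppercase(text: str) -> str:
    """String in, string out: the result may be longer than the input."""
    return text.upper()


sharp_s = "\u00df"  # LATIN SMALL LETTER SHARP S, the "sz" mentioned above
result = uppercase(sharp_s)

# One character in, two characters out.
print(len(sharp_s), len(result), result)  # prints: 1 2 SS

# The same holds inside a longer string, so the length is not preserved.
print(uppercase("Stra\u00dfe"))  # prints: STRASSE
```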

 >You have failed to cite
 >reasons for keeping the SHOULD NOT rather than replacing it with e.g.
 >RECOMMENDED, so this does not satisfy me.

Do you mean changing "Specifications of APIs SHOULD NOT specify single
characters or single units of encoding as argument or return types."
to something like "We RECOMMEND that specifications of APIs specify
strings (rather than single characters or single units of encoding)
as argument or return types."?

By the RFC 2119 definitions of SHOULD NOT and RECOMMENDED, that would
amount to the same thing: follow the advice unless you have really good
reasons not to.

I personally think that things like upper-case functions that do
not change string length should be a thing of the past. The Web
is flexible; it doesn't and shouldn't depend on fixed string lengths.

 >>Decision: Rejected. We have decided to reject this comment because there may
 >>for example on occasion be historic reasons to mention these terms. Also,
 >>we would like to point out that this is just an issue of wording, not of
 >>interoperability, so there is no reason to be absolutely strict.
 >With "good" wording it is easy to understand, with "poor" wording it
 >is easy to misunderstand. Lack of interoperability is often caused by
 >misunderstandings, I thus do not buy it at all that this is not an
 >interoperability issue. If this is not an interoperability issue and
 >not meant to limit behavior which has potential for causing harm, the
 >document would not conform to RFC 2119 regardless of whether you use
 >SHOULD NOT or MUST NOT. Further, you can easily say that spec must not
 >use them in normative parts and should not use them in informative
 >parts, so this does not satisfy me.
 >I'll need to look at this more closely.

Please do so very soon!

Regards,    Martin. 
Received on Thursday, 28 October 2004 14:03:50 GMT
