Re: Your comments on Character Model Fundamentals [LC070, LC074, LC075, LC079, LC080, LC081, LC082, LC083, LC084, LC085, LC086, LC087, LC088, LC089] from Martin Duerst on 2004-10-28 (www-i18n-comments@w3.org from October 2004)

From: Martin Duerst <duerst@w3.org>
Date: Fri, 29 Oct 2004 01:02:23 +0900
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: www-i18n-comments@w3.org, w3c-i18n-ig@w3.org
Message-Id: <6.0.0.20.2.20041029005121.1030b2c8@localhost>
At 00:13 04/10/29, Bjoern Hoehrmann wrote:
 >* Martin Duerst wrote:
 >> >[LC083]
 >
 >>The "descriptively rather than normatively" may not be very clear.
 >>[...]
 >
 >I consider both "SHOULD" and "MUST" "requirements" with the difference
 >that these denote different requirement levels, it is thus not clear
 >whether e.g. for "Specifications MUST require foo" "bar SHOULD foo" the
 >requirement has been met.

My personal understanding is that 'require', 'requires', and 'required'
would all express the same as MUST/REQUIRED. However, you are right
that 'requirements' is used for both 'SHOULD' and 'MUST'.

 >I want language in the document that clearly
 >allows or prohibits that. I am less concerned about how that is achieved.
 >
 >> >[LC084]
 >>
 >>The response is given in the table, and should also have been
 >>sent to you:
 >
 >My problem is that the mail claimed that "edits were made along the
 >lines you suggested" even though the table states that no edits have
 >been made.

This was an error, the 'noted' comment(s) were wrongly lumped in
with the 'accepted' comments.

 >So, are you saying I should assume that the text in the
 >table is more accurate?

Yes, definitely.

 >>In internationalization, it is always dangerous to assume that
 >>things work just because you don't know any exceptions.
 >
 >In computing, it is always dangerous to assume that things don't work
 >even though you do not know any exceptions, it might significantly
 >complicate the interface and implementation. For the uppercase example
 >taking a string would require to resolve whether
 >
 >  uppercase("o\x{308}")
 >
 >triggers an error condition, or would return "Ö", "O\x{308}", "O", etc.

[sorry for the garbage resulting from my Japanese email client]

Well, if you want to upper-case a string, you will have to make
that decision anyway. And then there might easily be cases where
actually there is a precombined version for lc, but not for uc,
or the other way round, so you have to deal with these anyway.
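
A minimal Python sketch of the two situations above, assuming Python 3's
built-in str.upper() (which applies the Unicode full case mappings) and
the standard unicodedata module. The variable names are illustrative only.

```python
import unicodedata

# 'o' followed by U+0308 COMBINING DIAERESIS, as in uppercase("o\x{308}")
decomposed = "o\u0308"
upper = decomposed.upper()

# Whether the result stays decomposed or is composed to U+00D6 is a
# separate normalization decision the caller still has to make.
composed = unicodedata.normalize("NFC", upper)

# U+01F0 LATIN SMALL LETTER J WITH CARON has a precombined lower-case
# form, but its upper-case form is 'J' plus U+030C COMBINING CARON.
j_caron_upper = "\u01f0".upper()

print(len(decomposed), len(upper), len(composed))
print(len("\u01f0"), len(j_caron_upper))
```

In the second case the length changes from one code point to two, so a
character-in, character-out signature cannot express the result.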

 >and users of the routine would need to be aware that certain operations
 >would no longer be possible (like modifying the string without modifying
 >its length);

Well, yes, but that would already be the case if only the return
type is a string, for which we clearly have an example.

 >you also no longer have compile time type checking to
 >prevent certain kinds of errors, need more testing, longer documentation
 >and so on. So there is certainly a cost here.
 >
 >>It makes sense, it shows a specific example where the return type
 >>needs to be a string.
 >
 >But not the argument and whether there is actually such a need is
 >debatable, I know of quite a number of APIs where 'Ä' == uc 'ä' but
 >'ß' == uc 'ß', so there are quite a number of people who disagree
 >that this is an absolute requirement

This is a question of how to define the actual action of the
function, not a question of what should be used as argument and
return types.

 >(it's also useful because
 >lc('ß') == lc(uc('ß')) holds true). Charmod does not contain a
 >requirement that specifications must define such transliteration
 >operations so that 'SS' == uc 'ß' either.

No, it doesn't. First, this isn't charmod's business. To some extent,
it's Unicode's business. To some extent, it's application specific.

 >>Do you mean changing "Specifications of APIs SHOULD NOT specify single
 >>characters or single units of encoding as argument or return types."
 >>to something like "We RECOMMEND that specifications of APIs do not
 >>specify strings (rather than single characters or single units of encoding)
 >>as argument or return types."?
 >
 >I think text such as
 >
 >  Specifications of APIs SHOULD prefer strings over characters as
 >  arguments and return types.
 >
 >would satisfy me.

What about

   Specifications of APIs SHOULD use strings rather than characters as
   arguments and return types.

I'm afraid that otherwise, we get into discussions of what exactly
'prefer' means :-).
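
To make the contrast concrete, here is a rough Python sketch. Both
function names are hypothetical and come neither from Charmod nor from
any library; upper_char models a length-preserving character API of the
kind mentioned above, and upper_string models the string-based form.

```python
def upper_char(ch: str) -> str:
    """Hypothetical character API: one code point in, one code point out."""
    if len(ch) != 1:
        raise ValueError("expected exactly one code point")
    mapped = ch.upper()
    # If the full mapping needs more than one code point, this API can
    # only give up and return its input unchanged.
    return mapped if len(mapped) == 1 else ch


def upper_string(text: str) -> str:
    """Hypothetical string API: the result may be longer than the input."""
    return text.upper()


print(upper_char("\u00e4"))    # small a with diaeresis
print(upper_char("\u00df"))    # sharp s, returned unchanged
print(upper_string("\u00df"))  # sharp s, mapped to two letters
```

With the character API, lower-casing the result of upper_char for sharp s
gives back the original letter; with the string API it gives "ss", which
is the trade-off discussed above.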

 >>I personally think that things like upper-case functions that do
 >>not change string length should be a way of the past. The Web
 >>is flexible and doesn't and shouldn't depend on fixed string lengths.
 >
 >Are there any relevant CSS implementations that render
 >
 >  <p style="text-transform: uppercase">ß</p>
 >  <p style="text-transform:      none">SS</p>
 >
 >the same? Are they required to in CSS 2.1/3.0?

I don't know. It would definitely look better than keeping the
sz in its lower case form, which would look weird. I know that
CSS mentions cases such as the Dutch IJ for its :first-letter
pseudo-element.


Regards,    Martin.
Received on Thursday, 28 October 2004 16:04:03 UTC