Re: Your comments on Character Model Fundamentals [LC070, LC074, LC075, LC079, LC080, LC081, LC082, LC083, LC084, LC085, LC086, LC087, LC088, LC089] from Martin Duerst on 2004-10-29 (www-i18n-comments@w3.org from October 2004)

From: Martin Duerst <duerst@w3.org>
Date: Fri, 29 Oct 2004 13:55:03 +0900
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: www-i18n-comments@w3.org, w3c-i18n-ig@w3.org
Message-Id: <6.0.0.20.2.20041029133737.05571e50@localhost>

At 01:54 04/10/29, Bjoern Hoehrmann wrote:
 >
 >* Martin Duerst wrote:
 >>Well, if you want to upper-case a string, you will have to make
 >>that decision anyway. And then there might easily be cases where
 >>actually there is a precombined version for lc, but not for uc,
 >>or the other way round, so you have to deal with these anyway.
 >
 >The difference is however that in one case API designers have to deal
 >with it while in the other case API users have to deal with it.

So you want the user to deal with it? I'm really not sure
why we are discussing this at all. For upper-casing, the
right thing to do is to have a string->string function,
because that's the general case that you need anyway.
So the API designers/implementers have to deal with this
anyway.
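
To illustrate the string->string point, here is a minimal sketch in Python,
whose built-in str.upper() applies the full Unicode case mappings. The
wrapper name upper_string is hypothetical and used only for illustration.
It covers both the one-to-many case (sharp s) and a case where a
precomposed lowercase letter (U+01F0, j with caron) has no precomposed
uppercase counterpart:

```python
def upper_string(text: str) -> str:
    """String -> string upper-casing; the result may be longer than the input."""
    # str.upper() uses the full Unicode case mappings, so a single
    # character can map to several characters.
    return text.upper()


# U+00DF LATIN SMALL LETTER SHARP S upper-cases to two letters.
sharp_s = upper_string("\u00df")
print(sharp_s, len(sharp_s))

# U+01F0 LATIN SMALL LETTER J WITH CARON has no precomposed uppercase form;
# the result is "J" followed by U+030C COMBINING CARON.
j_caron = upper_string("\u01f0")
print([hex(ord(ch)) for ch in j_caron])
```

In both cases the result no longer fits in a single character, which is
why the return type has to be a string.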


 >Yet charmod states that "SS" is the "proper" result for uppercase('ß').

As you should have no trouble understanding, this is just an example,
in no way normative. Otherwise, we would have to discuss what
exactly 'proper' means, and so on.

 >>What about
 >>
 >>   Specifications of APIs SHOULD use strings rather than characters as
 >>   arguments and return types.
 >>
 >>I'm afraid that otherwise, we get into discussions of what exactly
 >>'prefer' means :-).
 >
 >Well, use it unless you know you don't "need" to... I guess I could live
 >with the text but I think requiring a preference rather than usage is
 >better. I don't think people would argue about using "prefer" here as
 >the text cannot be much clearer without getting more specific; all you
 >want is that API designers do not use chars by accident or because of
 >false assumptions...

'prefer' has clearly been shot down by Misha, who knows more about
English than the two of us together :-). Given that, and the fact
that the remaining new proposal:
     Specifications of APIs SHOULD use strings rather than characters
     as arguments and return types.
(which by the way would have to read
     Specifications of APIs SHOULD use strings rather than single
     characters or single 'units of encoding' as arguments and return types.
to capture the full extent of the current text) is essentially the
same as the current text:
     Specifications of APIs SHOULD NOT specify single characters or
     single 'units of encoding' as arguments or return types.
I don't see any reason for a change.

 >>I don't know. It would definitely look better than keeping the
 >>sz in its lower case form, which would look weird. I know that
 >>CSS mentions cases such as the Dutch IJ for its :first-letter
 >>pseudo-class.
 >
 >Well, currently it is not realistic to expect implementations to do
 >that... I would agree in principle but I clearly prefer specifications
 >that get fully implemented.

Well, assuming that operating systems and libraries provide
string->string functions for this job, browsers and such
would just call the right function and be done with it.
Of course, any wrong perception that calling a character->character
function does the job will delay good implementations. In that
sense, the character model does the right thing, or doesn't it?
The main goal is that implementations provide the right functionality.
Specifications are just a means to doing that.
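
As a rough sketch of why a character->character function falls short,
the following Python code simulates such an API and compares it with the
string->string one. The names upper_char and upper_per_char are
hypothetical, and the assumption that a one-in, one-out function leaves a
character unchanged when it has no single-character mapping is mine:

```python
def upper_char(ch: str) -> str:
    """Hypothetical character -> character API: one code point in, one out."""
    mapped = ch.upper()
    # Assumption: with no single-character uppercase form available,
    # the input character is returned unchanged.
    return mapped if len(mapped) == 1 else ch


def upper_per_char(text: str) -> str:
    """Upper-case a string by calling the per-character function on each character."""
    return "".join(upper_char(ch) for ch in text)


word = "stra\u00dfe"           # "strasse" written with sharp s
print(upper_per_char(word))    # the sharp s stays in lower case
print(word.upper())            # string -> string mapping yields "SS"
```

The per-character version leaves the sharp s in its lower case form,
which is the weird-looking result mentioned above.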

Regards,    Martin.

Received on Friday, 29 October 2004 05:50:48 UTC