Re: I18N-ISSUE-246: Clarify character encoding behavior when calculating storage size [ITS-20] from Norbert Lindenberg on 2013-02-28 (www-international@w3.org from January to March 2013)

From: Norbert Lindenberg <w3@norbertlindenberg.com>
Date: Wed, 27 Feb 2013 22:17:26 -0800
To: John Cowan <cowan@mercury.ccil.org>
Cc: Norbert Lindenberg <w3@norbertlindenberg.com>, Stephan Walter <stephan.walter@cocomore.com>, "www-international@w3.org" <www-international@w3.org>
Message-Id: <EA6833C6-5C01-496C-AA6D-718D9B9CC629@norbertlindenberg.com>

On Feb 27, 2013, at 0:02, John Cowan wrote:

> Norbert Lindenberg scripsit:
> 
>> It could be as simple as "If an ITS processor doesn't support the
>> specified character encoding, it must report this as an error and
>> terminate processing. If the selected nodes contain characters that
>> the specified character encoding cannot represent, the processor must
>> report this as an error and terminate processing." Or you could try
>> and be nice in the second case and specify a fallback strategy, e.g.,
>> by saying that the first replacement character among U+FFFD, U+003F,
>> U+FF1F that can be represented in the specified character encoding
>> must be used instead of any character that can't. 
> 
> The second strategy seems clearly superior.  But what encoding can
> handle the fullwidth question mark but not the ASCII (halfwidth) one?
> Unless there is one, we will never reach U+FF1F.

I don't know of such an encoding, but then I don't know all encodings. You may be right that U+FF1F will never be used.
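
For illustration only, here is a rough Python sketch of the fallback strategy quoted above, assuming the storage size is counted in bytes of the encoded text. The names FALLBACK_CANDIDATES, can_encode, pick_replacement and storage_size are made up for this example and do not come from the ITS draft; raising ValueError when no candidate fits is also an assumption, since the proposed wording does not cover that case.

```python
import codecs

# Replacement candidates in the order proposed in the thread:
# U+FFFD REPLACEMENT CHARACTER, U+003F QUESTION MARK,
# U+FF1F FULLWIDTH QUESTION MARK.
FALLBACK_CANDIDATES = ("\uFFFD", "\u003F", "\uFF1F")


def can_encode(text, encoding):
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def pick_replacement(encoding):
    """Return the first candidate the encoding can represent, or None."""
    for candidate in FALLBACK_CANDIDATES:
        if can_encode(candidate, encoding):
            return candidate
    return None


def storage_size(text, encoding):
    """Size in bytes of `text` in `encoding`, after applying the fallback.

    Raises LookupError if the encoding is not supported at all (the
    first error case in the proposal). Raises ValueError if a character
    cannot be encoded and no candidate is available (an assumption of
    this sketch, not part of the proposed wording).
    """
    codecs.lookup(encoding)  # LookupError for an unsupported encoding
    replacement = pick_replacement(encoding)
    pieces = []
    for ch in text:
        if can_encode(ch, encoding):
            pieces.append(ch)
        elif replacement is not None:
            pieces.append(replacement)
        else:
            raise ValueError(
                "no replacement character is representable in " + encoding
            )
    return len("".join(pieces).encode(encoding))
```

With Python's built-in codecs, pick_replacement("utf-8") yields U+FFFD, while pick_replacement("ascii") and pick_replacement("latin-1") both stop at U+003F, so U+FF1F is not reached for any of these.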

Norbert

Received on Thursday, 28 February 2013 06:17:55 UTC