
Re: I18N-ISSUE-246: Clarify character encoding behavior when calculating storage size [ITS-20]

From: Norbert Lindenberg <w3@norbertlindenberg.com>
Date: Wed, 27 Feb 2013 22:17:26 -0800
Cc: Norbert Lindenberg <w3@norbertlindenberg.com>, Stephan Walter <stephan.walter@cocomore.com>, "www-international@w3.org" <www-international@w3.org>
Message-Id: <EA6833C6-5C01-496C-AA6D-718D9B9CC629@norbertlindenberg.com>
To: John Cowan <cowan@mercury.ccil.org>

On Feb 27, 2013, at 0:02 , John Cowan wrote:

> Norbert Lindenberg scripsit:
> 
>> It could be as simple as "If an ITS processor doesn't support the
>> specified character encoding, it must report this as an error and
>> terminate processing. If the selected nodes contain characters that
>> the specified character encoding cannot represent, the processor must
>> report this as an error and terminate processing." Or you could try
>> and be nice in the second case and specify a fallback strategy, e.g.,
>> by saying that the first replacement character among U+FFFD, U+003F,
>> U+FF1F that can be represented in the specified character encoding
>> must be used instead of any character that can't. 
> 
> The second strategy seems clearly superior.  But what encoding can
> handle the fullwidth question mark but not the ASCII (halfwidth) one?
> Unless there is one, we will never reach U+FF1F.

I don't know of such an encoding, but then I don't know all encodings. You may be right that U+FF1F will never be used.
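
For what it's worth, the fallback strategy quoted above could be sketched roughly as follows in Python. This is only an illustration, not proposed spec text: the function names (can_encode, pick_replacement, encode_with_fallback) are made up for this sketch, and it assumes that Python's codec registry stands in for "the encodings an ITS processor supports".

```python
import codecs

# Candidate replacement characters, in the order given in the quoted proposal.
FALLBACKS = ("\uFFFD", "\u003F", "\uFF1F")


def can_encode(ch, encoding):
    """Return True if the encoding can represent the single character ch."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def pick_replacement(encoding):
    """Return the first fallback character the encoding can represent."""
    # codecs.lookup raises LookupError for an unsupported encoding, which
    # corresponds to the "report an error and terminate" case.
    codecs.lookup(encoding)
    for candidate in FALLBACKS:
        if can_encode(candidate, encoding):
            return candidate
    # Not covered by the quoted proposal; treated as an error in this sketch.
    raise ValueError("no fallback character is representable in " + encoding)


def encode_with_fallback(text, encoding):
    """Encode text, substituting the chosen fallback for unrepresentable characters."""
    replacement = pick_replacement(encoding)
    safe = "".join(ch if can_encode(ch, encoding) else replacement for ch in text)
    return safe.encode(encoding)


# The storage size is then the length of the encoded byte sequence.
size = len(encode_with_fallback("caf\u00e9 \u65e5\u672c", "ascii"))
```

In this sketch U+FF1F is only reached for an encoding that can represent neither U+FFFD nor U+003F, which is John's point.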

Norbert
Received on Thursday, 28 February 2013 06:17:55 GMT
