Re: I18N-ISSUE-246: Clarify character encoding behavior when calculating storage size [ITS-20] from John Cowan on 2013-02-27 (www-international@w3.org from January to March 2013)

From: John Cowan <cowan@mercury.ccil.org>
Date: Wed, 27 Feb 2013 03:02:54 -0500
To: Norbert Lindenberg <w3@norbertlindenberg.com>
Cc: Stephan Walter <stephan.walter@cocomore.com>, "www-international@w3.org" <www-international@w3.org>
Message-ID: <20130227080253.GT23465@mercury.ccil.org>

Norbert Lindenberg scripsit:

> It could be as simple as "If an ITS processor doesn't support the
> specified character encoding, it must report this as an error and
> terminate processing. If the selected nodes contain characters that
> the specified character encoding cannot represent, the processor must
> report this as an error and terminate processing." Or you could try
> and be nice in the second case and specify a fallback strategy, e.g.,
> by saying that the first replacement character among U+FFFD, U+003F,
> U+FF1F that can be represented in the specified character encoding
> must be used instead of any character that can't. 

The second strategy seems clearly superior.  But what encoding can
handle the fullwidth question mark but not the ASCII (halfwidth) one?
Unless there is one, we will never reach U+FF1F.
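
To make the fallback strategy concrete, here is a minimal sketch in Python.
It is not taken from the ITS draft: the names FALLBACKS, can_encode,
pick_replacement and storage_size are hypothetical, and measuring storage
size in bytes of the encoded text is an assumption. Python's codec registry
stands in for "the encodings the processor supports", so an unknown encoding
name raises LookupError, which corresponds to the first rule in the quoted
text (report an error and stop).

```python
# Candidate replacement characters, in the order given in the quoted proposal:
# U+FFFD REPLACEMENT CHARACTER, U+003F QUESTION MARK, U+FF1F FULLWIDTH QUESTION MARK.
FALLBACKS = ("\uFFFD", "\u003F", "\uFF1F")


def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding.

    An unsupported encoding name raises LookupError, which is deliberately
    not caught here.
    """
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def pick_replacement(encoding):
    """Return the first fallback character the encoding can represent, or None."""
    for ch in FALLBACKS:
        if can_encode(ch, encoding):
            return ch
    return None


def storage_size(text, encoding):
    """Size in bytes of text in encoding, substituting unrepresentable characters."""
    replacement = pick_replacement(encoding)
    if replacement is None:
        # Assumption: with no usable fallback, treat it as an error.
        raise ValueError("no fallback character is representable in " + encoding)
    safe = "".join(ch if can_encode(ch, encoding) else replacement for ch in text)
    return len(safe.encode(encoding))
```

With this sketch, pick_replacement("utf-8") yields U+FFFD, while "ascii" and
"iso-8859-1" both stop at U+003F, so the third candidate is not consulted for
either of them.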

-- 
"Repeat this until 'update-mounts -v' shows no updates.         John Cowan
You may well have to log in to particular machines, hunt down   cowan@ccil.org
people who still have processes running, and kill them."

Received on Wednesday, 27 February 2013 08:03:18 UTC