Fwd: Re: [encoding] Last Call Comment: Arithmetic Right Shift

Anne pointed out that I had sent the mail below only to him, although I
had probably meant to send it to the mailing list. That was indeed my
intention, so I am forwarding it herewith.

Regards,   Martin.

-------- Original Message --------
Subject: Re: [encoding] Last Call Comment: Arithmetic Right Shift
Date: Sun, 06 Jul 2014 16:53:43 +0900
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
To: Anne van Kesteren <annevk@annevk.nl>



On 2014/07/02 21:05, Anne van Kesteren wrote:
> On Wed, Jul 2, 2014 at 1:15 PM, "Martin J. Dürst"
> <duerst@it.aoyama.ac.jp> wrote:
>> Looking at the explanation about Arithmetic Shifts at
>> https://en.wikipedia.org/wiki/Arithmetic_shift, the exact effect of
>> Arithmetic Right Shifts depends on the size of the data type.
>
> Do you have a more specific pointer?

Just a simple example, assuming a size of 8 bits.

00011111 >> 3   gives 00000011
11111111 >> 3   gives 11111111

For more details, maybe compare
https://en.wikipedia.org/wiki/File:Rotate_right_arithmetically.svg and
https://en.wikipedia.org/wiki/File:Rotate_right_logically.svg.
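
The 8-bit example above can be reproduced with a small sketch. Python
integers have no fixed width, so the width is simulated here with a
mask; the helper names arithmetic_shift_right and logical_shift_right
are hypothetical and chosen only for illustration, they do not come
from the spec.

```python
def arithmetic_shift_right(value, shift, width):
    """Shift a width-bit pattern right, replicating the top (sign) bit."""
    mask = (1 << width) - 1
    value &= mask
    # Reinterpret the bit pattern as a signed two's-complement number.
    if value & (1 << (width - 1)):
        value -= 1 << width
    # On negative ints, Python's >> rounds toward negative infinity,
    # which is exactly an arithmetic shift.
    return (value >> shift) & mask


def logical_shift_right(value, shift, width):
    """Shift a width-bit pattern right, always filling with zeros."""
    mask = (1 << width) - 1
    return (value & mask) >> shift


for bits in (0b00011111, 0b11111111):
    print(format(bits, "08b"), ">> 3",
          "arithmetic:", format(arithmetic_shift_right(bits, 3, 8), "08b"),
          "logical:", format(logical_shift_right(bits, 3, 8), "08b"))
```

The two kinds of shift agree as long as the top bit of the operand is
0, and differ as soon as it is 1.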


>> I have not found any indication in the spec about the (minimal) size of the
>> data type needed for the calculations to work correctly (sorry if I missed
>> it). This could be 16 bits, or 32 bits, or potentially even more (e.g. for
>> GB-18030).
>
> It is only used by utf-8 and utf-16 and only on code points, so 21 bits.

With 21 bits, the most significant bit can be 1 (for code points from
U+100000 upwards), and an arithmetic right shift will then produce more
1s, which I don't think we want. With 22 bits, we should be on the safe
side, because the most significant bit is then always 0. Alternatively,
we could use logical shifts
(https://en.wikipedia.org/wiki/Logical_shift); then a size of 21 bits
should do. But it's not something we need to optimize for.
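
To make the 21-bit versus 22-bit point concrete, here is a rough sketch
for U+10FFFF. It assumes a shift by 18, which is the shift that yields
the top bits for the lead byte of a four-byte UTF-8 sequence; the
helper functions are hypothetical and only simulate a fixed operand
width, they are not taken from the spec.

```python
def arithmetic_shift_right(value, shift, width):
    """Shift a width-bit pattern right, replicating the top (sign) bit."""
    mask = (1 << width) - 1
    value &= mask
    # Treat the pattern as signed if its top bit is set.
    if value & (1 << (width - 1)):
        value -= 1 << width
    # Python's >> on a negative int fills with 1s (arithmetic shift).
    return (value >> shift) & mask


def logical_shift_right(value, shift, width):
    """Shift a width-bit pattern right, always filling with zeros."""
    mask = (1 << width) - 1
    return (value & mask) >> shift


code_point = 0x10FFFF  # highest code point; bit 20 is set

# 21-bit operand: bit 20 is the top bit, so 1s are shifted in.
print(hex(arithmetic_shift_right(code_point, 18, 21)))
# 22-bit operand: the top bit (bit 21) is 0, so the result is as intended.
print(hex(arithmetic_shift_right(code_point, 18, 22)))
# Logical shift: 21 bits are already enough.
print(hex(logical_shift_right(code_point, 18, 21)))
```

Under these assumptions, only the 21-bit arithmetic shift gives a
result other than the intended value 4.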

>> It would be highly desirable to indicate the minimum necessary width of the
>> data type needed for correct operation.
>
> I would prefer to avoid talking about the exact representation of the operand.

I very much agree that we don't want to talk about the *exact*
representation of the operand. What I'm proposing is to talk about the
*minimum* size of the operands necessary for correct operation.

Regards,   Martin.

Received on Friday, 11 July 2014 11:03:15 UTC