
Re: More tests for dbl byte encodings in the Encoding spec

From: 신정식 <jshin1987+w3@gmail.com>
Date: Thu, 10 Dec 2015 12:58:21 -0800
Message-ID: <CAE1ONj95z45pbgj6FJAV523QnVFuzgUv9cx96R6kBN+o_dFbdw@mail.gmail.com>
To: r12a <ishida@w3.org>
Cc: www International <www-international@w3.org>, Anne van Kesteren <annevk@annevk.nl>, Jungshik Shin (신정식, 申政湜) <jungshik@google.com>
On Thu, Dec 10, 2015 at 11:49 AM, Jungshik SHIN (신정식) <jshin1987+w3@gmail.com> wrote:

> Thanks for the test suite.
>
> I'm updating Blink's converters now to the latest spec.
>
> While doing so, I found a bug in the spec. See
> https://github.com/whatwg/encoding/issues/21. Anne will take care of it
> quickly, I guess. After that, you have to run the test again for Japanese
> encodings :-)
>

In addition to the above issue with U+2022 (a spec bug), your Shift_JIS
decoding test has two more bugs. Firefox, Chrome and Opera all fail on the
following two code points when decoding:

 U+A5 ¥ assert_equals: expected "¥" but got "\\"
 U+203E ‾ assert_equals: expected "‾" but got "~"

The expected values above assume that \x5C and \x7E should be decoded to
U+00A5 and U+203E, but that's not the case. U+00A5 and U+203E are mapped to
\x5C and \x7E only when encoding to Shift_JIS. See the summary below.

* round-trip mapping

\x5C <=> U+005C
\x7E <=> U+007E

* encoding-only mapping (from Unicode)

\x5C <= U+00A5
\x7E <= U+203E
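
The summary above can be sketched in a few lines of Python. This is only an illustration of the asymmetry, not the spec's algorithm text or any browser's converter. The helper names sjis_decode_byte and sjis_encode_code_point are hypothetical, and only the single-byte range 0x00-0x7F is modeled (double-byte sequences are ignored).

```python
# Minimal sketch, assuming only the single-byte range 0x00-0x7F matters.
# Helper names are hypothetical; this is not a full Shift_JIS converter.

# One-way mappings used by the encoder only (Unicode -> byte).
ENCODE_ONLY = {
    0x00A5: 0x5C,  # YEN SIGN -> \x5C
    0x203E: 0x7E,  # OVERLINE -> \x7E
}


def sjis_decode_byte(byte: int) -> int:
    """Decode one byte in 0x00-0x7F to a code point (identity mapping)."""
    if not 0x00 <= byte <= 0x7F:
        raise ValueError("byte is outside the range modeled in this sketch")
    return byte


def sjis_encode_code_point(code_point: int) -> int:
    """Encode one code point to a single byte, if this sketch covers it."""
    if code_point in ENCODE_ONLY:
        return ENCODE_ONLY[code_point]
    if 0x00 <= code_point <= 0x7F:
        return code_point
    raise ValueError("code point is outside the range modeled in this sketch")


if __name__ == "__main__":
    # Decoding \x5C and \x7E yields U+005C and U+007E, never U+00A5 or U+203E.
    for byte in (0x5C, 0x7E):
        print(f"\\x{byte:02X} => U+{sjis_decode_byte(byte):04X}")
    # Encoding maps both members of each pair to the same byte.
    for cp in (0x005C, 0x00A5, 0x007E, 0x203E):
        print(f"U+{cp:04X} => \\x{sjis_encode_code_point(cp):02X}")
```

A decoding test built on this model would expect "\" and "~" for \x5C and \x7E, which is what Firefox, Chrome and Opera return.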

Jungshik

>
> Jungshik
>
> On Wed, Dec 9, 2015 at 3:20 AM, <ishida@w3.org> wrote:
>
>> the page
>>
>> http://www.w3.org/International/tests/repo/results/encoding-dbl-byte
>>
>> now points to a lot more tests, with results for major browsers, for the
>> double-byte encodings in the Encoding spec.
>>
>> the tests are grouped as follows:
>>
>> 1 does the browser encode characters as expected per the Encoding spec
>> when sending data with a form?
>>
>> 2 are characters that cannot be encoded using the Encoding spec
>> algorithms handled as expected?
>>
>>
>> 3 same as 1 for values constructed for an href attribute
>>
>> 4 same as 2 for href values
>>
>>
>> 5 are the characters that can be encoded also decoded per the spec? (in the
>> case of euc-jp and big5, the characters tested include more than just those
>> that can be encoded)
>>
>> 6 are non-conformant byte sequences encountered during decoding dealt
>> with per the Encoding spec algorithms?
>>
>>
>> there is another page at
>>
>>
>> http://www.w3.org/International/tests/repo/results/encoding-dbl-byte-nightly
>>
>> which shows the results for the latest nightlies (although a small number
>> of tests in the other page haven't made it to this one yet)
>>
>>
>>
>> next step is to analyse why the failures are occurring.
>>
>> (The following app can be useful to diagnose the underlying issues
>> http://r12a.github.io/apps/encodings/ )
>>
>> ri
>>
>>
>
Received on Thursday, 10 December 2015 20:58:50 UTC
