
Re: Encoding Standard at F2F

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Mon, 05 Nov 2012 16:37:44 +0900
Message-ID: <50976CC8.3030506@it.aoyama.ac.jp>
To: "Jungshik SHIN (신정식)" <jshin1987@gmail.com>
CC: Norbert Lindenberg <w3@norbertlindenberg.com>, "Ishii, Koji a | Koji | EBJB" <koji.a.ishii@mail.rakuten.com>, Anne van Kesteren <annevk@annevk.nl>, "www-international@w3.org" <www-international@w3.org>
Hello Jungshik,

I think you should open one (or actually two) bugs against this spec.

On a more general note, the bugs are listed with Component: Encoding, 
Product: WHATWG. Shouldn't "Product" be something like I18N WG? (that 
would have to be added to the list of products first)

Regards,    Martin.

On 2012/11/05 7:21, Jungshik SHIN (신정식) wrote:
> On Sun, Nov 4, 2012 at 12:48 PM, Norbert Lindenberg<
> w3@norbertlindenberg.com>  wrote:
>
>> What about email archives on the web? I'd be surprised if there weren't
>> any that just take the bytes of subjects and bodies of email messages and
>> stuff them into HTML frames. Even Yahoo Mail and Hotmail did that until a
>> few years ago.
>>
>
> It's very unfortunate that two major web mail services were broken in
> such a horrible way until a few years ago. ;-)  It's good for everybody
> that they have since been fixed.
>
> Anyway, I don't think that the potential existence of antiquated/broken
> email (list) archiving programs is a good justification for keeping
> ISO-2022-KR (and GB-HZ).
>
> BTW, when ISO-2022-KR was around in the 1990s, the dominant MTA of the time
> (sendmail) was patched (or combined with MDAs like procmail) on most
> Korean Unix hosts to convert incoming emails in ISO-2022-KR to EUC-KR. So
> the number of emails kept in ISO-2022-KR in mail(ing list) archives is much
> smaller than the actual number of emails exchanged in ISO-2022-KR.
>
> Jungshik
>
>
>
>
>>
>> Norbert
>>
>>
>> On Nov 4, 2012, at 19:23 , Ishii, Koji a | Koji | EBJB wrote:
>>
>>> I think the spec should cover all relevant technologies around W3C, not
>> only web pages. I know little about how often ISO-2022-KR is used in
>> places other than the Web, but you should also pay attention to e-mail and
>> other carriers of W3C technologies.
>>>
>>> Microsoft once disabled automatic detection of ISO-2022-JP in MS10-090
>> over a security concern but turned it on again in MS11-003 due to its bad
>> impact. As you said and as Kuro confirmed, ISO-2022-JP is still an
>> important encoding for the W3C to support.
>>>
>>> Are you sure ISO-2022-KR and GB-HZ are not, considering all places W3C
>> technologies are used including e-mail, TV, etc.?
>>>
>>>
>>> Regards,
>>> Koji
>>>
>>> From: Jungshik SHIN (신정식) [mailto:jshin1987@gmail.com]
>>> Sent: Saturday, November 03, 2012 12:20 PM
>>> To: Anne van Kesteren
>>> Cc: www-international@w3.org
>>> Subject: Re: Encoding Standard at F2F
>>>
>>> Hi,
>>>
>>> Thank you for the note.
>>>
>>> I wonder what consideration has been given to the inclusion of
>> ISO-2022-KR and GB-HZ, two 7-bit encodings that are extremely rare on the
>> web (if used at all) and are 'security risks' (in a sense) like other 7-bit
>> encodings (e.g. UTF-7 that is not included).
>>>
>>> We cannot drop ISO-2022-JP lightly because it's still used somewhere
>> even though it's much less widely used than EUC-JP or Shift-JIS.
>>>
>>> OTOH, ISO-2022-KR was never meant for the web and it's safe to say
>> that virtually no web page uses it. It was designed for email (RFC 1557) in
>> the early 1990s and fell out of favor even for email in the late 1990s
>> because either EUC-KR (later UTF-8) with 8bit ESMTP or EUC-KR with
>> base64/qp worked just fine. For web pages, there has never been any reason
>> to use ISO-2022-KR, and it's not used.
>>>
>>> For the last 20 years, I've seen web pages (other than test pages) in
>> that encoding only once or twice. I'm a Korean speaker and I've visited
>> numerous web pages.
>>>
>>> To a slightly lesser extent, the same should hold for GB-HZ. It started
>> life for use on Usenet (and in email), but using it on the web does not
>> make much sense. I can't speak about GB-HZ as strongly as about ISO-2022-KR,
>> but my experience with Chrome development (below) is an indication that
>> it's virtually unused.
>>>
>>> Chrome didn't support either of them until about 2 years ago. They were
>> added mainly because of http://encoding.spec.whatwg.org/  IIRC.  While
>> neither was supported, I didn't get any complaints from Chrome users.
>>>
>>> Jungshik
>>>
>>>
>>>
>>> On 2012. 11. 3. at 7:31 AM, "Anne van Kesteren"<annevk@annevk.nl> wrote:
>>> I joined the I18N WG for an hour or so at their F2F in TPAC to discuss
>>> http://encoding.spec.whatwg.org/
>>>
>>> We basically went through the document for a high-level overview of
>>> what it attempts to do. We also concluded it is good enough to publish
>>> as a FPWD, provided someone in the I18N WG has the time to do the
>>> switch in style (from green to blue).
>>>
>>> Based on feedback from Richard Ishida and Kawabata Taichi during that
>>> meeting I filed these bugs:
>>>
>>> * https://www.w3.org/Bugs/Public/show_bug.cgi?id=19816
>>> * https://www.w3.org/Bugs/Public/show_bug.cgi?id=19817
>>>
>>> If there was any other feedback during that session that I failed to
>>> capture, I would appreciate it if you could help me out. Issues with the
>>> specification are best recorded in Bugzilla:
>>>
>> https://www.w3.org/Bugs/Public/enter_bug.cgi?product=WHATWG&component=Encoding
>>>
>>>
>>> --
>>> http://annevankesteren.nl/
>>>
>>
>>
>
Received on Monday, 5 November 2012 07:38:15 GMT
