RE: UTF-7 and java

From: Barry Caplan <bcaplan@i18n.com>
Date: Tue, 28 Aug 2001 09:34:31 -0700
Message-Id: <5.0.2.1.2.20010828091716.0bcbdc30@shell11.ba.best.com>
To: <www-international@w3.org>
At 08:45 AM 8/28/2001 -0700, Carl W. Brown wrote:

>This is not true.  For example look at iso-2022.  This code page exists
>because it is 7 bit clean.

Well, originally anyway. I think it was of the same vintage as SMTP or even 
earlier - in the days when only 7 bits really mattered.


>It is much more efficient than base64 encoding.

In what sense? Bandwidth? It is easy to beat base64 on this account because 
if I remember correctly (and I might not and I just woke up) base64 is 
actually a *6 bit* encoding.
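
To put rough numbers on the bandwidth point, here is a minimal Python sketch. The sample text and the helper names (`is_7bit_clean`, `encoded_sizes`) are my own illustrative choices, not anything from the thread. It assumes the comparison Carl has in mind is ISO-2022-JP sent as-is versus an 8-bit encoding such as Shift-JIS wrapped in base64.

```python
import base64

def is_7bit_clean(data: bytes) -> bool:
    """True if every byte has its high bit clear, i.e. survives a 7-bit path."""
    return all(b < 0x80 for b in data)

def encoded_sizes(text: str) -> dict:
    """Byte counts for a few ways of getting Japanese text through 7-bit mail."""
    return {
        # ISO-2022-JP is already 7-bit, so it can be sent without extra wrapping.
        "iso-2022-jp": len(text.encode("iso2022_jp")),
        # Shift-JIS and UTF-8 use the high bit, so they need a 7-bit wrapper.
        "shift_jis+base64": len(base64.b64encode(text.encode("shift_jis"))),
        "utf-8+base64": len(base64.b64encode(text.encode("utf-8"))),
    }

# Hypothetical sample: a short phrase repeated to approximate a message body.
sample = "日本語のメール" * 10
sizes = encoded_sizes(sample)
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Base64 spends four output characters on every three input bytes because each character carries only 6 bits, while ISO-2022-JP pays only a few bytes of escape sequences around each run of double-byte text. For very short strings the escape overhead can cancel the advantage.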


>It is a mess to process and many of us would like to see it die a horrible
>death.

Count me in!

>   This is because of RFC 821.

There is another RFC, whose number I don't recall, which specifies a variant 
of ISO-2022 for Japanese email. It is circa 1991 I think. Obviously obsolete. 
But last year, I had some rather interesting discussions with Japanese 
partners wrt how this is the de facto standard for Japanese email, even if 
everyone agrees it is not necessary anymore.

No Japanese company will market an email client that does not support it 
because apparently the typical Japanese customer is aware of this RFC via 
the meme pipe, despite never having used email until well after it was 
effectively obsolete. Japanese folks really believe that if the headers 
don't say ISO-2022-JP (where else do average users read the headers, btw?), 
the message is *WRONG* even though they are reading the content with 
their own eyes! This reflects badly on the program and the supplier of the 
program.

As I found out, it is not a technical issue at all, but a deep-seated 
cultural one that I couldn't even begin to guess how to change in the short 
term. I see this as a major impediment towards Unicode ever being adopted 
worldwide as a de facto encoding for email. In fact, it is preventing even 
Shift-JIS from being adopted in Japan, even though 99%+ of the clients and 
servers that process email in Japan can support it today.


>The SMTP transports do not deal with
>ISO-2022 but your application does.


Correct. I agree.


>It is mostly because they can not be sure that the last 7 bit modem is not
>still out there some where.  Quote from RFC 1341:
>
>"Several of the mechanisms described in this document may seem somewhat
>strange or even baroque at first reading. It is important to note that
>compatibility with existing standards AND robustness across existing
>practice were two of the highest priorities of the working group that
>developed this document. In particular, compatibility was always favored
>over elegance. "
>
>Meaning forget any changes that are not backwardly compatible (support 7
>bit)

It would be an interesting exercise to be able to query all the email 
servers of the world and find out if they support ESMTP or not. ESMTP 
servers will fall back to SMTP if necessary. An ongoing survey showing the 
number of SMTP-only servers starting with a small number and dwindling 
might encourage the remaining few to switch or suffer the occasional 
consequences.

If someone has the bandwidth and CPU power to offer (I don't right now) I 
will take a crack at a Perl script to search out mail servers from MX 
records and query them.
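
For anyone who wants a head start, the per-server check could look roughly like the Python sketch below (Python rather than Perl, purely for illustration). It uses the standard `smtplib` module; the function names `classify_ehlo` and `probe` and the three category labels are hypothetical. MX lookup is left out because the Python standard library has no MX resolver, so the caller is assumed to supply mail host names obtained some other way.

```python
import smtplib

def classify_ehlo(code: int, features: dict) -> str:
    """Classify a server from its EHLO reply code and advertised extensions."""
    if not 200 <= code <= 299:
        # EHLO was rejected, so a client would have to fall back to plain HELO.
        return "smtp-only"
    if "8bitmime" in features:
        # The server advertises the 8BITMIME extension.
        return "esmtp-8bit"
    return "esmtp-7bit"

def probe(host: str, timeout: float = 10.0) -> str:
    """Connect to host on port 25 and classify it. Requires network access."""
    with smtplib.SMTP(host, 25, timeout=timeout) as smtp:
        code, _ = smtp.ehlo()
        # smtplib stores advertised extensions with lowercased names as keys.
        return classify_ehlo(code, smtp.esmtp_features)
```

Something like `probe("mail.example.org")` would then return one of the three labels for a single host. A real survey would also need rate limiting and error handling for hosts that refuse connections, which this sketch does not attempt.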

Barry Caplan
www.i18n.com
Received on Tuesday, 28 August 2001 12:37:57 GMT