Re: non-ascii user name & password from Chris Newman on 1998-09-24 (ietf-http-wg@w3.org from October to December 1998)

From: Chris Newman <Chris.Newman@innosoft.com>
Date: Fri, 25 Sep 1998 00:57:57 +0100 (BST)
To: "Roy T. Fielding" <fielding@kiwi.ics.uci.edu>
Cc: http-wg@hplb.hpl.hp.com
Message-Id: <Pine.SOL.3.95.980924163105.24039V-100000@elwood.innosoft.com>

On Thu, 24 Sep 1998, Roy T. Fielding wrote:
> No, both would use octets for passwords and if either one does encoding
> translation then they may or may not interoperate, depending on how
> the user created the password in the first place.

The specific case is client A always encodes in 8859-1 and client B always
encodes in UTF-8.  A password with non-ASCII characters that works in A
won't work in B and vice versa.  So both A and B are compliant but they
don't interoperate.
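
To make the mismatch concrete, here is a minimal sketch in Python.  The
password string is a made-up example, and the two encode calls merely
stand in for whatever clients A and B do internally:

```python
# Hypothetical password containing one non-ASCII character (e-acute).
password = "caf\u00e9"

# Client A always maps typed characters to octets using ISO 8859-1.
octets_a = password.encode("iso-8859-1")

# Client B always maps typed characters to octets using UTF-8.
octets_b = password.encode("utf-8")

print(octets_a)              # b'caf\xe9'
print(octets_b)              # b'caf\xc3\xa9'
print(octets_a == octets_b)  # False
```

Same typed characters, different octets on the wire, so a password set
through one client cannot be verified through the other.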

> >Forbidding this situation is necessary to make sure all compliant clients
> >interoperate.
> 
> If that were true, they wouldn't interoperate now.  The fact is that
> everyone either uses ASCII passwords or continues to use the
> same charset for password entry that they used for password creation,
> which is not surprising.  None of the servers care about the encoding
> of the password characters.  None of the clients do encoding translation.
> That is why it works, even if it is sub-optimal.

Servers (excluding account management tools) shouldn't care about the
encoding of password characters.  That's why *OCTET is the right formal
syntax.  But the mapping from typed characters to *OCTET does matter.  If
it differs between any two clients (or a client and a password
administrative tool), there is an interoperability problem today.  The
very name "password" implies that it's usually textual (and thus made up
of characters) so a charset is needed.
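
As a rough illustration (not a description of any real server), the
sketch below models a server that compares password octets only, and a
client that builds Basic credentials with a configurable charset.  The
function names, the user name, and the password are all hypothetical:

```python
import base64
import hmac


def basic_credentials(user: str, password: str, charset: str) -> bytes:
    """Build the base64 credentials of a Basic Authorization header,
    mapping typed characters to octets with the given charset."""
    raw = user.encode(charset) + b":" + password.encode(charset)
    return base64.b64encode(raw)


def server_accepts(credentials: bytes, stored_password: bytes) -> bool:
    """Compare password octets only; no charset is ever interpreted."""
    raw = base64.b64decode(credentials)
    _user, _sep, password_octets = raw.partition(b":")
    return hmac.compare_digest(password_octets, stored_password)


# Hypothetical account created by an administrative tool using ISO 8859-1.
stored = "caf\u00e9".encode("iso-8859-1")

same_charset = basic_credentials("chris", "caf\u00e9", "iso-8859-1")
other_charset = basic_credentials("chris", "caf\u00e9", "utf-8")

print(server_accepts(same_charset, stored))   # True
print(server_accepts(other_charset, stored))  # False
```

The server side is charset-blind in both calls; only the client-side
mapping from characters to octets decides whether authentication works.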

> >P.S. Will the average user realize he has to manually configure the
> >"private agreement password charset" in his browser before he can
> >authenticate if he uses non-ASCII characters?
> 
> It is, by its very nature, the default.

Perhaps it is, more often than not, since users tend to prefer clients
using localized character sets.  But it's far from guaranteed.

> How do you think the average
> user will feel about all of his current password-enabled services being
> broken just to support a potential mismatch between system charsets?

Programs that rely on private agreements to interoperate deserve to break
(and occasionally do break in practice).

I know that in email protocols the IETF has held a hard line and never
permitted unlabeled 8-bit text in a standard.  Is there something about
HTTP that justifies breaking this precedent?  What do other people think?
Is this a case where correct international interoperability has to be
sacrificed due to the localized private-agreement installed base?

I'm thinking of writing an RFC on i18n of usernames and passwords in
general and would appreciate debate of the issues.

		- Chris

Received on Thursday, 24 September 1998 16:58:22 UTC