RE: non-ascii user name & password from Paul Leach on 1998-09-21 (ietf-http-wg@w3.org from October to December 1998)

From: Paul Leach <paulle@microsoft.com>
Date: Mon, 21 Sep 1998 14:10:40 -0700
To: 'Larry Masinter' <masinter@parc.xerox.com>
Cc: http-wg@hplb.hpl.hp.com, Chris.Newman@innosoft.com
Message-Id: <CB6657D3A5E0D111A97700805FFE65875D76C5@RED-MSG-51>

A careful re-reading of the digest spec shows that user-name is spec'd as
quoted-string (not TEXT), and password is never interpreted by the message
parser, just used to calculate the response from the challenge.

So, we can use TEXT for the password; that leaves the question of encoding.

> -----Original Message-----
> From: Larry Masinter [mailto:masinter@parc.xerox.com]
> Sent: Monday, September 21, 1998 1:10 AM
> To: Paul Leach
> Cc: http-wg@hplb.hpl.hp.com; Chris.Newman@innosoft.com
> Subject: non-ascii user name & password
> 
> 
> TEXT is inappropriate for user name and password, since:
> 
> # The TEXT rule is only used for descriptive field contents and values
> # that are not intended to be interpreted by the message parser. 
> 
> Whether or not it's typed, it's still a string that has to be parsed
> and interpreted by the server.
> 
> The problem is that UTF-8 doesn't quite have a well-defined
> 'canonical' form yet, either, although one is being developed, the
> canonicalization algorithm won't be at "draft standard". So you might
> have two browsers that would enter the same user name with different
> UTF-8 encodings, too.
> 
> And we're not normally requiring clients to implement UTF-8
> transformations of user type-in at all so this will be a big problem.

I don't think we need to require UTF-8, just say that USASCII is required,
and UTF-8 is allowed. If you have a unicode password, then you need a UTF-8
capable browser; if not, you don't.

> 
> On the other hand, it seems inappropriate to restrict user *names* to
> US-ASCII. I wonder if we could change the BNF and description text
> from "user name" and "username" to "user id", even if we leave
> 
>     username         = "username" "=" user-id

How does this help?

Can we say that user-id in USASCII must be supported and UTF-8 may be
supported if the user name contains characters not in the USASCII set? The
UTF-8 encoding is well defined even if the canonical form isn't, isn't it?
Then we don't reference the canonical form spec, and when that comes out
everyone is cool.

Or, I could say:
------------------
	passwd   = *OCTET

It is the responsibility of the client implementation to make sure that the
user can generate any octet string when providing the password. A protocol
for setting and changing passwords (which is beyond the scope of this
document) must specify how what the user provides maps to the actual octet
string "on the wire".
-------------------
 
Paul

Received on Monday, 21 September 1998 14:13:24 UTC