- From: Paul Leach <paulle@microsoft.com>
- Date: Mon, 21 Sep 1998 14:10:40 -0700
- To: 'Larry Masinter' <masinter@parc.xerox.com>
- Cc: http-wg@hplb.hpl.hp.com, Chris.Newman@innosoft.com
A careful re-reading of the digest spec shows that user-name is spec'd
as quoted-string (not TEXT), and password is never interpreted by the
message parser, just used to calculate the response from the challenge.
So, we can use TEXT for the password; that leaves the question of
encoding.

> -----Original Message-----
> From: Larry Masinter [mailto:masinter@parc.xerox.com]
> Sent: Monday, September 21, 1998 1:10 AM
> To: Paul Leach
> Cc: http-wg@hplb.hpl.hp.com; Chris.Newman@innosoft.com
> Subject: non-ascii user name & password
>
>
> TEXT is inappropriate for user name and password, since:
>
> # The TEXT rule is only used for descriptive field contents and values
> # that are not intended to be interpreted by the message parser.
>
> Whether or not it's typed, it's still a string that has to be parsed
> and interpreted by the server.
>
> The problem is that UTF-8 doesn't quite have a well-defined
> 'canonical' form yet, either, although one is being developed, the
> canonicalization algorithm won't be at "draft standard". So you might
> have two browsers that would enter the same user name with different
> UTF-8 encodings, too.
>
> And we're not normally requiring clients to implement UTF-8
> transformations of user type-in at all so this will be a big problem.

I don't think we need to require UTF-8, just say that USASCII is
required, and UTF-8 is allowed. If you have a unicode password, then you
need a UTF-8 capable browser; if not, you don't.

>
> On the other hand, it seems inappropriate to restrict user *names* to
> US-ASCII. I wonder if we could change the BNF and description text
> from "user name" and "username" to "user id", even if we leave
>
> username = "username" "=" user-id

How does this help? Can we say that user-id in USASCII must be supported
and UTF-8 may be supported if the user name contains characters not in
the USASCII set? The UTF-8 encoding is well defined even if the
canonical form isn't, isn't it? Then we don't reference the canonical
form spec, and when that comes out everyone is cool.

Or, I could say:

------------------
passwd = *OCTET

It is the responsibility of the client implementation to make sure that
the user can generate any octet string when providing the password. A
protocol for setting and changing passwords (which is beyond the scope
of this document) must specify how what the user provides maps to the
actual octet string "on the wire".
-------------------

Paul
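To make the canonical-form concern in the message above concrete, here is a minimal Python sketch, not part of the original exchange. It assumes the digest scheme's A1 value is the octet string username ":" realm ":" passwd hashed with MD5, and it uses today's Unicode normalization forms NFC and NFD as stand-ins for "two browsers that would enter the same user name with different UTF-8 encodings". The user name, realm, and password are hypothetical values chosen for illustration.

```python
import hashlib
import unicodedata


def ha1(username: bytes, realm: bytes, password: bytes) -> str:
    """Hash of A1, assumed here to be MD5(username ":" realm ":" passwd)
    computed over raw octets, with no character-set interpretation."""
    return hashlib.md5(username + b":" + realm + b":" + password).hexdigest()


# Hypothetical password containing one non-ASCII character (e with acute).
typed = "caf\u00e9"

# Two valid UTF-8 encodings of what the user sees as the same text:
# precomposed (NFC) versus base letter plus combining accent (NFD).
composed = unicodedata.normalize("NFC", typed).encode("utf-8")
decomposed = unicodedata.normalize("NFD", typed).encode("utf-8")

print(composed)    # b'caf\xc3\xa9'
print(decomposed)  # b'cafe\xcc\x81'

# Hypothetical user name and realm; only the password octets differ.
h_composed = ha1(b"paul", b"example-realm", composed)
h_decomposed = ha1(b"paul", b"example-realm", decomposed)
print(h_composed == h_decomposed)  # False
```

Because the hash is computed over octets, the two encodings yield different values, so a server storing one form would reject a client sending the other. Defining passwd as *OCTET sidesteps this in the protocol by leaving the mapping from user input to octets to the client and to whatever protocol sets the password.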
Received on Monday, 21 September 1998 14:13:24 UTC