- From: Scott Lawrence <scott-http@skrb.org>
- Date: Tue, 25 Nov 2003 22:31:34 -0500
- To: "Yngve N. Pettersen (Developer Opera Software ASA)" <yngve@opera.com>
- Cc: ietf-http-wg@w3.org
"Yngve N. Pettersen (Developer Opera Software ASA)" <yngve@opera.com> writes: > Background: > > RFC 2617 does not mention which character set should be used when > generating the authentication credentials. This has led to a situation > where (at the very least) various clients have made different design > choices, which have resulted in some interoperability problems. > > When I implemented Unicode support for authentication in Opera I chose > UTF-8 as the encoding for the credential inputdata, based, among other > reasons, on the recommendations of BCP 18/RFC 2277. We have received a > couple of reports about problems caused by this. There's an important distinction here - the character set is not the same as the encoding. This mail is in the character set 'iso-8859-1'; it specifies a particular set of glyphs. The encoding is how you represent that character set with bits. > From Paul Leach (my summary+extra info) > -------- > Basic Authentication's username and password attributes are defined as > "*TEXT", Digest Authentication's username parameter is an qouted string > (essentially *TEXT) and passwd has no real definition, but probably *TEXT > or *OCTET. > > RFC 2616 does say that if a *TEXT word contains non-iso-8859-1 characters > they should be represented using the RFC 2047 rules (e.g =?charset?Q?text?= > ). So 2617 already has a specified (if clunky and unevenly supported) solution for Basic. I don't see sufficient reason to change it, since the worst thing about 2047 rules is that they are not human-friendly, but this is not for humans anyway. But 2617 is mute on the encoding of the inputs to the A1 hash values, and if the same characters are hashed using different encodings in the client and the server, then the hash (very very probably) will not match, so I believe a case can be made that Digest is underspecified. > What can be done? > > Shortrange: At the very least I think that the updated RFC 2617 draft > should address the issue. Agreed. 

> Personally, I would prefer a permanent solution (one character set, or a
> negotiation method), but given the current implementation of clients and
> servers I suspect we may have to make do with something that effectively
> says "The client and server must be configured to use the same character
> set. How the configured character set is agreed upon is not defined by this
> specification".

I think that the character set isn't important - only the encoding;
they both already (presumably) share the secret, they just have to
agree on a common representation of it.

> Problem: How to solve backwards compatibility?
>
> 2) Define new internationalized authentication methods, at least for
> digest.
>
> Personally, of course, I'd prefer that UTF-8 is endorsed as the character
> set.

Actually, I think this choice would be backward-compatible for those
cases that have been working, and incompatible only with those that
previously would have had interoperability problems anyway (passwords
that had values outside the ASCII range).

--
Scott Lawrence
http://skrb.org/scott/

Received on Tuesday, 25 November 2003 22:29:39 UTC