- From: Yngve N. Pettersen (Developer Opera Software ASA) <yngve@opera.com>
- Date: Wed, 26 Nov 2003 03:29:19 +0100
- To: ietf-http-wg@w3.org
Hi, Given Scott Lawrence's mention of upcoming drafts updating RFC 2616 and 2617 I thought I should raise this issue again for consideration. I previuously raised it in April 2003. Background: RFC 2617 does not mention which character set should be used when generating the authentication credentials. This has led to a situation where (at the very least) various clients have made different design choices, which have resulted in some interoperability problems. When I implemented Unicode support for authentication in Opera I chose UTF-8 as the encoding for the credential inputdata, based, among other reasons, on the recommendations of BCP 18/RFC 2277. We have received a couple of reports about problems caused by this. New information: Limited testing indicates that MSIE uses the character set selected in the View->Encoding menu when encoding Basic Authentication credentials. As the intention was only to get an idea about what kind of character sets where used, Mozilla/Netscape and other clients were not tested. If correct, this means that MSIE users can authenticate correctly as long as the server is using the same character set as they do, but as soon as the server is using a different character set they have to manually change the Encoding before being able to continue. And AFAIK, (disclaimer: I am not a character set expert) many of the Asian, Eastern European and Middle East nationalities have more than one character set to choose from, which means that even within a single nation you could run into problems. Other items: From Paul Leach (my summary+extra info) -------- Basic Authentication's username and password attributes are defined as "*TEXT", Digest Authentication's username parameter is an qouted string (essentially *TEXT) and passwd has no real definition, but probably *TEXT or *OCTET. RFC 2616 does say that if a *TEXT word contains non-iso-8859-1 characters they should be represented using the RFC 2047 rules (e.g =?charset?Q?text?= ). -------- I have, however, never seen the RFC 2047 QP syntax used by a HTTP client or server, and we could encounter problems regarding the Digest processing of passwords (Which charset is used? The same as in the username? Which encoding? Q or B? And what about precalculated A1 values?) From Alexey Melnikov (my summary) -------- RFC 2831 (Digest as SASL) introduced a charset attribute and complex rules to handle the situation. -------- What can be done? Shortrange: At the very least I think that the updated RFC 2617 draft should address the issue. Personally, I would prefer a permanent solution (one character set, or a negotiation method), but given the current implementation of clients and servers I suspect we may have to make do with something that effectively says "The client and server must be configured to use the same character set. How the configured character set is agreed upon is not defined by this specification". Longrange: A permanent solution (beside leaving the sitation as it is) can take several forms: 1) Updating the current methods by doing either of these: A) Define a standard character set to be used. B) Define a negotiation method, either client only, or client select from server's list. E.g. The client adds a charset attribute in the challenge response. Problem: How to solve backwards compatibility? 2) Define new internationalized authentication methods, at least for digest. All of these will require that both servers and clients are updated with new functionality, which will cause transition problems. Personally, of course, I'd prefer that UTF-8 is endorsed as the character set. -- Sincerely, Yngve N. Pettersen ******************************************************************** Senior Developer Email: yngve@opera.com Opera Software ASA http://www.opera.com/ ********************************************************************
Received on Tuesday, 25 November 2003 21:22:53 UTC