- From: Yngve Nysaeter Pettersen <yngve@opera.com>
- Date: Tue, 15 Apr 2003 23:26:45 +0200
- To: ietf-http-wg@w3.org
Hi,

My name is Yngve N. Pettersen. I am a developer at Opera Software ASA, the company producing the Opera browser. One of my areas of responsibility is our HTTP protocol support.

Some time ago, while implementing Opera's support for international character sets, I discovered that RFC 2617 does not specify the character set to be used when encoding the username and password arguments for Basic and Digest authentication. Given that BCP 18/RFC 2277 strongly encourages UTF-8 support in protocols, and that it may be impossible to determine the server's preferred character set, among other reasons, I decided to use UTF-8 as the character set when encoding the username and password before generating the authentication strings.

Recently we received a report concerning problems with this way of generating authentication strings. Apparently other clients do not convert national characters to UTF-8, at least not for Western European languages; I don't know how they treat Asian languages. While researching the current state of the protocol, I noticed that the current errata do not address this point.

I would therefore like to suggest that an item specifying which character set should be used when generating Basic and Digest authentication strings be added to the errata.

My suggestion is that UTF-8 be selected as the character set used to encode the username and password values when creating the "user-pass" string (sec. 2) and the "username-value" and "passwd" strings in sec. 3.2.2. It might also be an idea to specify the same for other text attributes. As mentioned above, BCP 18 indicates that UTF-8 is the preferred charset for protocols.
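To make the ambiguity concrete, here is a minimal sketch in Python. The credentials, the realm, and the helper names `basic_credentials` and `digest_ha1` are made up for illustration, and ISO-8859-1 stands in for "whatever a client uses when it does not convert to UTF-8", which is an assumption. It shows that the same username and password produce different Basic and Digest values depending on the character set chosen.

```python
import base64
import hashlib


def basic_credentials(username: str, password: str, charset: str) -> str:
    """Build the base64 'user-pass' value for Basic (RFC 2617, sec. 2)."""
    user_pass = f"{username}:{password}".encode(charset)
    return base64.b64encode(user_pass).decode("ascii")


def digest_ha1(username: str, realm: str, password: str, charset: str) -> str:
    """Compute H(A1) = MD5(username:realm:passwd) for Digest (sec. 3.2.2)."""
    a1 = f"{username}:{realm}:{password}".encode(charset)
    return hashlib.md5(a1).hexdigest()


# Hypothetical credentials containing Western European national characters.
user, password, realm = "bjørn", "blåbær", "example"

basic_utf8 = basic_credentials(user, password, "utf-8")
basic_latin1 = basic_credentials(user, password, "iso-8859-1")

ha1_utf8 = digest_ha1(user, realm, password, "utf-8")
ha1_latin1 = digest_ha1(user, realm, password, "iso-8859-1")

# Both pairs differ, so a server that assumes one charset will reject
# credentials generated by a client that assumed the other.
print(basic_utf8 != basic_latin1, ha1_utf8 != ha1_latin1)
```

Pure ASCII credentials encode identically in both character sets, which is presumably why the problem only surfaces for users with national characters in their username or password.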
Additionally, I believe it would be very difficult to create a foolproof guessing method that would decide the charset based on such things as the charset of the authentication challenge response body, the top-level domain of the server, the same from the referrer (if any), or the character set used on the client's computer (which may not match what is used on the server). As an example, the challenge may use a default message in English, while passwords and documents are encoded in a Japanese character set.

I think the best way of avoiding (any further) ambiguities is to specify a single character set that MUST be used, and UTF-8 is the character set recommended by BCP 18.

--
Sincerely,
Yngve N. Pettersen

********************************************************************
Senior Developer                     Email: yngve@opera.com
Opera Software ASA                   http://www.opera.com/
Phone:  +47 24 16 42 51              Fax:    +47 24 16 40 01
********************************************************************
Received on Tuesday, 15 April 2003 17:21:55 UTC