- From: Yngve N. Pettersen (Developer Opera Software ASA) <yngve@opera.com>
- Date: Tue, 26 Sep 2006 01:21:03 +0200
- To: "Bjoern Hoehrmann" <derhoermi@gmx.net>, "Julian Reschke" <julian.reschke@gmx.de>
- Cc: "HTTP authentication list" <ietf-http-auth@osafoundation.org>, "HTTP Working Group" <ietf-http-wg@w3.org>
On Tue, 26 Sep 2006 00:08:11 +0200, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:

> * Julian Reschke wrote:
>> Speaking of which, where does HTML come into play here? We're talking
>> about HTTP authentication à la RFC2617, not HTML forms based login.
>
> For HTML form submissions, unless the author indicated something else,
> web browsers tend to use the character encoding of the document that
> includes the form to encode the characters; they could apply the same
> logic to the submission of credentials when there is such a document
> (e.g., the user clicked a link to a HTTP Auth protected page, the page
> with the link could then be used to determine some encoding). Based on
> my limited testing, I found this to be not the case.

In the HTML case the information is directly available as part of the form's own environment. It is not available when the authentication credentials are being processed.

Which character set/encoding should the application choose when that information is not available? For example, the information would not be available if the user goes directly to the site by entering the URL. A similar situation could arise when the URL is loaded in a different tab than the originating document, and obtaining the information would not be entirely straightforward even if it is opened in the same tab. Or what about proxy authentication?

Also: what if the original page uses a different character set/encoding than the server requesting authentication? For example, what if a Russian-language page directs you to an authenticated Japanese site? And AFAIK several languages actually have multiple encodings.

Character set/encoding information may not be available in the HTTP header either.

It is not possible to define a heuristic that will fit all scenarios. The best approach is to define a common character set/encoding that will be used by all compliant servers.
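To illustrate why a guessed encoding is fragile, here is a minimal Python sketch of how the same username and password produce different Basic credentials depending on the character set the client picks. The helper names `basic_credentials` and `server_accepts` and the sample credentials are hypothetical, and the server-side comparison is a simplification chosen for illustration, not how any particular server works.

```python
import base64


def basic_credentials(user: str, password: str, charset: str) -> str:
    """Build the base64 token sent after 'Authorization: Basic '.

    RFC 2617 defines the token as base64(user ":" password) but does not
    say which character encoding turns the characters into octets.
    """
    raw = f"{user}:{password}".encode(charset)
    return base64.b64encode(raw).decode("ascii")


def server_accepts(token: str, user: str, password: str, server_charset: str) -> bool:
    """Hypothetical server check: compare against the token the server expects."""
    return token == basic_credentials(user, password, server_charset)


# Hypothetical credentials containing non-ASCII characters.
user, password = "yngve", "blåbær"

latin1_token = basic_credentials(user, password, "iso-8859-1")
utf8_token = basic_credentials(user, password, "utf-8")

# The two tokens differ, so a server expecting UTF-8 rejects the
# ISO-8859-1 client even though the user typed the correct password.
print(latin1_token == utf8_token)
print(server_accepts(utf8_token, user, password, "utf-8"))
print(server_accepts(latin1_token, user, password, "utf-8"))
```

With a single agreed encoding on both sides, the mismatch in the last call cannot occur; with a per-page heuristic, it depends on how the user happened to reach the site.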
As RFC 2617 was not able to assist, the only guidance I had when I chose the I18N policy for Opera's RFC 2617 support was RFC 2277/BCP 18, which I interpret to say that protocols should use UTF-8 unless they specify otherwise, either in the specification (i.e. the RFC) or in a specific field of the protocol (in this case, that would mean an attribute in the WWW-Authenticate header). The problem is that work on the RFC 2617 protocol probably started before RFC 2277 was finished.

Given that the current system is broken anyway, since client and server have to agree out-of-band on which character set/encoding to use, it is in my opinion best to define a proper solution, which IMO means UTF-8, instead of trying to patch up the broken system. (And remember: even a patch of the current system would have to be deployed in new clients and servers.)

-- 
Sincerely,
Yngve N. Pettersen

********************************************************************
Senior Developer                     Email: yngve@opera.com
Opera Software ASA                   http://www.opera.com/
Phone:  +47 24 16 42 60              Fax:    +47 24 16 40 01
********************************************************************
Received on Monday, 25 September 2006 23:21:27 UTC