Re: RFC 2617 Authentication and character sets revisited from Yngve Nysaeter Pettersen on 2003-11-26 (ietf-http-wg@w3.org from October to December 2003)

From: Yngve Nysaeter Pettersen <yngve@opera.com>
Date: Wed, 26 Nov 2003 19:46:46 +0100
To: Scott Lawrence <scott-http@skrb.org>
Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Message-ID: <opry9c38iyx6onhr@localhost>

On Tue, 25 Nov 2003 22:31:34 -0500, Scott Lawrence <scott-http@skrb.org> 
wrote:
>>  From Paul Leach (my summary+extra info)
>> --------
>> Basic Authentication's username and password attributes are defined as
>> "*TEXT", Digest Authentication's username parameter is a quoted string
>> (essentially *TEXT) and passwd has no real definition, but is probably
>> *TEXT or *OCTET.
>>
>> RFC 2616 does say that if a *TEXT word contains non-iso-8859-1
>> characters, they should be represented using the RFC 2047 rules
>> (e.g. =?charset?Q?text?=).
>
> So 2617 already has a specified (if clunky and unevenly supported)
> solution for Basic.  I don't see sufficient reason to change it, since
> the worst thing about 2047 rules is that they are not human-friendly,
> but this is not for humans anyway.

I may be wrong, but as I mentioned, I've never seen an HTTP server or 
client use the RFC 2047 encoding, which leads me to believe it has never 
been implemented.

If so, there is no implemented means of telling the server which character 
set and encoding are used for the username and password. And AFAICT that is 
the present situation.

> I think that the character set isn't important - only the encoding;
> they both already (presumably) share the secret, they just have to
> agree on a common representation of it.

The server and client must *also* agree about the binary representation 
(character set and encoding) of the username, as the username is used as 
an index into the password database.
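
To illustrate, here is a minimal Python sketch. The username and the toy 
password table are hypothetical and only stand in for whatever a real 
server stores; the point is that the same name yields different octets 
under iso-8859-1 and UTF-8, so a byte-wise lookup can miss.

```python
# Hypothetical username containing a non-ASCII character.
username = "Bjørn"

# The same characters produce different octet sequences per encoding.
latin1_bytes = username.encode("iso-8859-1")  # b'Bj\xf8rn'
utf8_bytes = username.encode("utf-8")         # b'Bj\xc3\xb8rn'

# Toy password database (an assumption), keyed by the octets the
# server happened to store when the account was created.
password_db = {utf8_bytes: "secret"}


def lookup(raw_username: bytes):
    """Return the stored password for the raw octets, or None."""
    return password_db.get(raw_username)
```

A client that sends the iso-8859-1 octets gets no match from lookup(), 
even though the user typed the right name.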

About Basic authentication: If RFC 2047 encoding is to be used when 
preparing the username and password attributes (when they are not in the 
iso-8859-1 character set) before creating the "basic-credentials" 
attribute, I think RFC 2617 should require it explicitly, even if it is 
supposed to follow from the definition of *TEXT in sec. 2.2 of RFC 2616. 
Using RFC 2047 encoding does require that the server be able to map the 
characters to its own local representation, and it requires more 
processing of the credentials.
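
As a rough sketch of that extra processing, the following uses Python's 
standard email.header module to produce and then undo an RFC 2047 
encoded-word. The username is hypothetical, and applying this to HTTP 
credentials is only an illustration of the mechanism, not something any 
implementation is known to do.

```python
from email.header import Header, decode_header

# Hypothetical username outside plain ASCII.
username = "Bjørn"

# Client side: wrap the text as an RFC 2047 encoded-word,
# i.e. something of the form =?charset?encoding?text?=
encoded = Header(username, "utf-8").encode()

# Server side: parse the encoded-word, then map the octets from the
# declared charset into the server's own representation.
raw_bytes, charset = decode_header(encoded)[0]
decoded = raw_bytes.decode(charset)
```

The server must therefore understand every charset a client might 
declare, which is the mapping cost mentioned above.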

As mentioned earlier, my position is that I'd rather require the UTF-8 
character set and encoding for the input (username and password) to the 
credential generation.
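
A minimal sketch of what that could look like for Basic, assuming the 
client simply encodes "username:password" as UTF-8 before the base64 
step. The function name and the charset parameter are my own 
illustration, not anything taken from RFC 2617.

```python
import base64


def basic_credentials(username: str, password: str,
                      charset: str = "utf-8") -> str:
    """Build an Authorization header value for Basic authentication.

    The charset argument is hypothetical; it only exists to show how
    the choice of encoding changes the octets that get base64-encoded.
    """
    userpass = f"{username}:{password}".encode(charset)
    return "Basic " + base64.b64encode(userpass).decode("ascii")


# ASCII-only input is unaffected by the choice (example from RFC 2617).
ascii_header = basic_credentials("Aladdin", "open sesame")

# Non-ASCII input differs, so both sides must agree on one encoding.
utf8_header = basic_credentials("Bjørn", "pass")
latin1_header = basic_credentials("Bjørn", "pass", charset="iso-8859-1")
```

With a single required encoding the server can decode the base64 and 
interpret the octets without guessing.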

-- 
Sincerely,
Yngve N. Pettersen

********************************************************************
Senior Developer                     Email: yngve@opera.com
Opera Software ASA                   http://www.opera.com/
Phone:  +47 24 16 42 60              Fax:    +47 24 16 40 01
********************************************************************

Received on Wednesday, 26 November 2003 13:45:06 UTC