- From: Chris Newman <Chris.Newman@innosoft.com>
- Date: Mon, 28 Sep 1998 19:19:26 +0100 (BST)
- To: "Roy T. Fielding" <fielding@kiwi.ics.uci.edu>
- Cc: http-wg@hplb.hpl.hp.com
On Fri, 25 Sep 1998, Roy T. Fielding wrote:
> Yes, they do. That doesn't change the definition of the protocol.
> The username and password were defined as ISO-8859-1 when the
> authentication fields were invented and deployed. Except for the
> usual charset politics, that definition worked just fine.

In RFC 2068, username and password are defined as TEXT, which may be either ISO-8859-1 or something encoded according to RFC 1522 (the old version of RFC 2047). I suspect it's quite clear that nothing other than ISO-8859-1 is going to work at all reliably in this context. Does anyone actually implement RFC 2047 in this context?

RFC 2069 is a bit different, as the username appears in a quoted-string, which therefore forbids the use of RFC 2047. So RFC 2069 requires ISO-8859-1 for usernames.

Now this may work just fine for Western Europe and America, but it is not international, so it is broken. When something is broken in a protocol, it should be fixed. Then one has to choose whether compatibility must be retained.

I suspect that compatibility with RFC 2047 encoding in usernames and passwords is not worth retaining, as it was probably never significantly deployed and even if it was, it probably doesn't interoperate. RFC 2047 is fine for most headers, but it was never designed for something which requires a canonical form.

I have no opinion on whether compatibility with ISO 8859-1 should be retained in this context, as I'm not aware of the deployment patterns of that use -- that's a judgement call for this working group.

If you want to make this international and retain ISO 8859-1 compatibility, then the right thing to do is use UTF-8 encoding, unless the entire string is made up of 8859-1 characters, in which case 8859-1 encoding is used instead.
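
A minimal sketch of the encoding rule just described, in Python. The function name `encode_credential` is hypothetical and does not come from any specification; the sketch only illustrates the choice between the two encodings (ISO 8859-1 when every character fits, UTF-8 otherwise).

```python
def encode_credential(text: str) -> bytes:
    """Encode a username or password under the rule sketched above.

    Hypothetical helper for illustration only.
    """
    try:
        # Every character is in ISO 8859-1: keep the legacy encoding.
        return text.encode("iso-8859-1")
    except UnicodeEncodeError:
        # At least one character falls outside ISO 8859-1: use UTF-8.
        return text.encode("utf-8")
```

Under this rule a name such as "José" would still go out as ISO 8859-1 bytes, so existing deployments would see no change, while a name such as "Łukasz" would be sent as UTF-8.
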
Now since a draft standard can't reference UTF-8, you'd want to leave the encoding for non-8859-1 characters undefined for now, and define it in an extension, but it'd probably be worth forbidding all encodings other than 8859-1 unless specified in a standards track document -- that would reduce the problem.

If you don't care about retaining compatibility with ISO 8859-1 use and want to make it international, then declare it US-ASCII for now and write an extension to make them UTF-8. The username parameter in digest auth is stuck at ISO 8859-1 by reference, but an encoding for UTF-8 could be added by an extension (e.g., RFC 2231 encoding).

> In email protocols, specifications that contrast with reality have
> traditionally been ignored by almost all developers and resulted in
> interoperability failures when some poor sap actually attempted to comply
> with the RFC.

I disagree. Certainly the use of private agreement charsets is popular in email, as it was the only localization solution available prior to MIME and it was widely deployed.

> HTTP does not allow that.

HTTP has no more power to enforce the specification than email does. I will admit that an interactive protocol is easier to extend and upgrade than a store-and-forward protocol.

> HTTP has a version number
> whose minor number is supposed to change whenever compatible changes
> are introduced, and a major number that is supposed to change whenever
> incompatible changes are introduced.

A new port number provides equivalent functionality to a major version number. Feature announcement provides superior functionality to minor version numbers. SMTP has been extensively modified without the need for version numbers.

> I have no problem with defining a new protocol in the HTTP family that
> cures the hundred-odd problems leftover from the installed base and
> eventually progresses on the standards track.
> I have a huge problem
> with such a protocol masquerading as HTTP/1.x when we have carefully
> designed the protocol for forward compatibility.

If you want 100% compatibility with the interoperable portions of the installed base, that's fine. But where the spec doesn't interoperate, it should be fixed. I suspect it doesn't interoperate for non-8859-1 characters in usernames and passwords.

> The problem is that
> the IETF standards-track process interferes with good protocol design
> by not allowing progress along delineated branches.

Quite the contrary. Version numbers prevent the development of branches. Feature announcement, as used by ESMTP and IMAP, has been repeatedly successful in allowing multiple branches to develop simultaneously.

> It is high time that the IETF started thinking in terms of protocol
> families

I see no problems in this area. Extensions to standard protocols are flourishing in the IETF.

> and planning for evolution rather than making standards
> decrees and hoping the installed base gets sucked into the void.

Sometimes it is better to evolve. Sometimes it is better to start over and create an incompatible version. Sometimes the installed base sort of works but doesn't really interoperate and is best ignored when creating a fully interoperable solution. Which choice is better is an engineering decision which needs to be evaluated carefully.

- Chris
Received on Monday, 28 September 1998 11:19:25 UTC