- From: Mark Nottingham <mnot@mnot.net>
- Date: Tue, 12 Jun 2007 22:43:25 +1000
- To: Andreas Maier <MAIERA@de.ibm.com>
- Cc: ietf-http-wg@w3.org, Brian Carpenter <brian.e.carpenter@gmail.com>, Larry Masinter <masinter@adobe.com>, David Singer <singer@almaden.ibm.com>
http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/issues/#i67

On 23/05/2007, at 9:41 PM, Andreas Maier wrote:

> ... resending after getting subscribed to the list ...
>
> Hi,
> In our group, we encountered a flaw in RFC2616 as part of investigating
> an interoperability problem between a client and a server component that
> use the CIM-XML over HTTP protocol. That protocol is used widely in the
> industry for systems management and is owned by the DMTF standards org
> (www.dmtf.org).
>
> I'd like to bring the flaw in RFC2616 to your attention, with the goal
> of folding this into a possibly upcoming update on that RFC (see Larry's
> mail below). I am not up to date as to where to report such things, so
> you need to help me a little to get this directed the right way (e.g.
> towards the activity Larry was mentioning).
>
> Here is a description of the flaw:
>
> RFC2616 defines the "charset" parameter of the "Content-Type" header
> partly using BNF, and partly in text. When following the BNF dependency
> tree of the "Content-Type" production (defined in section 14.17), it uses
> the "media-type" production (defined in section 3.7), which uses the
> "parameter" production (defined in section 3.6) which allows both an
> unquoted "token" and a "quoted-string". So far, this allows both quoted
> and unquoted forms for the value of the "charset" parameter of
> "Content-Type", as in the following two examples (which are from how the
> CIM-XML over HTTP protocol uses Content-Type):
>
>     Content-type: application/xml; charset="utf-8"
>     Content-type: application/xml; charset=utf-8
>
> However, there is also the following text in section 3.4:
>
> --- begin of text ---
>    HTTP character sets are identified by case-insensitive tokens. The
>    complete set of tokens is defined by the IANA Character Set registry
>    [19].
>
>        charset = token
>
>    Although HTTP allows an arbitrary token to be used as a charset
>    value, any token that has a predefined value within the IANA
>    Character Set registry [19] MUST represent the character set defined
>    by that registry. Applications SHOULD limit their use of character
>    sets to those defined by the IANA registry.
> --- end of text ---
>
> This text adds to the BNF defined syntax rules by defining a
> recommendation to use the IANA defined character sets in an unquoted
> form.
>
> The interoperability problem we encountered was caused by a silent
> agreement in the CIM-XML community that the quoted form is to be used,
> while the one CIM server that ran into the interoperability issue
> obviously has read RFC2616 better than the rest of the CIM-XML community
> and required the form without quotes. We plan to fix that in our CIM-XML
> spec by recommending to use the unquoted form on the sending side, and to
> support both forms on the receiving side.
>
> Back to the flaw in RFC2616. The flaw is that the BNF definition of the
> "Content-Type" production does not utilize the "charset" production
> defined in section 3.4, and therefore an occasional reader of RFC2616 who
> follows the BNF dependencies, does not necessarily notice section 3.4 and
> hence arrives at the conclusion that the quoted and unquoted form are
> both equally ok. Which is what happened to me ;-)
>
> I suggest to fix this by utilizing the "charset" production somewhere in
> the "Content-Type" production. Maybe at the level of "media-type". In
> addition, an explicit reference to section 3.4 could be added to the
> description of Content-Type in section 14.17.
>
> I believe that specs like RFC2616 are not always read top to bottom in
> one flow, but are often used as a reference to answer particular
> questions, and this change would improve the capability of RFC2616 to
> allow for that.
>
> Andy
>
> Andreas Maier
> IBM Senior Technical Staff Member, Systems Management Architecture & Design
> IBM Development Laboratory Boeblingen, Germany
> maiera@de.ibm.com, +49-7031-16-3654
> ________________________________________________________________________
>
> IBM Deutschland Entwicklung GmbH; Geschaeftsfuehrung: Herbert Kircher;
> Vorsitzender des Aufsichtsrats: Martin Jetter, Sitz der Gesellschaft:
> Boeblingen, Registergericht: Amtsgericht Stuttgart, HRB 243294
>
> ----- Forwarded by Andreas Maier/Germany/IBM on 05/23/2007 11:29 -----
>
> Brian Carpenter/Switzerland/IBM@IBMCH
> 05/22/2007 17:22
>
> To: Andreas Maier/Germany/IBM@IBMDE
> cc:
> Subject: Recondite HTTP question
>
> Andreas,
>
> I think you're correct, and the message below answers "is there any
> place to report this to ?"
>
> Regards,
>
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> Brian E Carpenter
> Distinguished Engineer, Internet Standards & Technology, IBM STG
> Based in Switzerland, mobile phone +41 79 302 3262
>
> <bcar@ch.ibm.com> for IBM business
> <brian.e.carpenter@gmail.com> for IETF business
>
> ----- Forwarded by Brian Carpenter/Switzerland/IBM on 2007-05-22 17:20 -----
>
> -------- Original Message --------
> Subject: RE: Recondite HTTP question
> Date: Tue, 22 May 2007 06:18:12 -0700
> From: Larry Masinter <masinter@adobe.com>
> To: Brian E Carpenter <brian.e.carpenter@gmail.com>
>
> There's a live effort to update the http spec and deal with issues like
> this. How about bringing this up ietf-http-wg@w3.org?
>
> -----Original Message-----
> From: Brian E Carpenter [mailto:brian.e.carpenter@gmail.com]
> Sent: Tuesday, May 22, 2007 12:14 AM Pacific Standard Time
> To: Larry Masinter
> Subject: Recondite HTTP question
>
> Larry,
>
> Question from a colleague:
>
> RFC 2616 seems to allow for both of these:
>     Content-type=application/xml; charset="utf-8"
>     Content-type=application/xml; charset=utf-8
> In your view, are both valid? A narrow interpretation suggests
> that only the second one is formally bound to the IANA charset
> registry.
>
> Thanks
>
> Brian
> --
> NEW: Preferred email for non-IBM matters: brian.e.carpenter@gmail.com

--
Mark Nottingham     http://www.mnot.net/
Received on Tuesday, 12 June 2007 12:43:35 UTC
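
The receiving-side behaviour described in the quoted message (accept both the quoted and the unquoted form of the charset value) might look roughly like the sketch below. It is not taken from RFC 2616 or from any CIM-XML implementation. The function name extract_charset and the regular expressions are assumptions made for illustration, and the pattern only approximates the "token" and "quoted-string" productions.

```python
import re

# Approximation of the RFC 2616 "token" production:
# visible ASCII characters other than separators.
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# Approximation of "quoted-string": anything but '"' and '\',
# or a backslash followed by one escaped character.
_QUOTED = r'"(?:[^"\\]|\\.)*"'
# One ";name=value" parameter, where value is a token or a quoted-string.
_PARAM = re.compile(r";\s*(" + _TOKEN + r")=(" + _TOKEN + "|" + _QUOTED + r")")


def extract_charset(content_type):
    """Return the lower-cased charset parameter value, or None if absent.

    Hypothetical helper: accepts both charset=utf-8 and charset="utf-8".
    """
    for name, value in _PARAM.findall(content_type):
        if name.lower() == "charset":
            if value.startswith('"'):
                # Strip the surrounding quotes and undo backslash escapes.
                value = re.sub(r"\\(.)", r"\1", value[1:-1])
            return value.lower()
    return None
```

With this sketch, both example header values from the message yield the same result, so a receiver built this way would not depend on which form the sender chose.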