- From: Brian Smith <brian@briansmith.org>
- Date: Fri, 14 Mar 2008 05:31:22 -0700
- To: "'HTTP Working Group'" <ietf-http-wg@w3.org>

Julian Reschke wrote:
> Brian Smith wrote:
> > ...
> > It is not clear whether or not the RFC 2047 mechanism can be used in
> > quoted-string, because quoted-string is not defined in terms of
> > "*TEXT", but rather a similar construct. Given all the places that
> > quoted-string
>
> ???
>
> <http://greenbytes.de/tech/webdav/rfc2616.html#basic.rules.quoted-string>:
>
> quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
> qdtext = <any TEXT except <">>

<any TEXT except <">> is not equivalent to *TEXT.

> I think this is the intent.

Then you run into the question "How are media-ranges and media-types
compared? Are they to be decoded into Unicode and then compared?" When
the specification specifies that ETags must match exactly, is the
comparison character-by-character or octet-by-octet?

> > Also, the Reason-phrase of the status line is defined as:
> >
> > *<TEXT, excluding CR, LF>
> >
> > But, is the RFC 2047 mechanism allowed in the Reason-phrase?
>
> I would think so.

Again, the grammar for Reason-phrase is not *TEXT; that is why it isn't
clear.

> > And, if it is read liberally, then it is
>
> I disagree.
>
> > allowed in way too many places. And, if it is allowed
> > anywhere, there should be some advice as to what
> > encodings should be supported.
>
> From the headers above, where do you think it shouldn't be allowed?

Consider:

Content-Type: text/plain;charset="=?utf-8?q?utf-8?="
(how do you compare this against 'text/plain;charset="utf-8"'?)

ETag: "=?utf-8?q?asdf?="
(how do you compare this against "asdf"?)

ETag: "=?"
(Is this a lexical error?)

> I do agree that if we rely on RFC2047, we may also have to
> spend some time improving that document.

Keep in mind that RFC2047 has a limit of 75 characters per encoded-word.
And, the grammar seems to allow encoded-words to be mixed with unencoded
words. And, Base64 encoding to be mixed with quoted-printable. And,
multiple encodings (e.g. UTF-8 and UTF-7) to be mixed. All in the same
*TEXT segment.

There is definitely a lot to be improved, but each improvement would be
an incompatible change.

- Brian

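The comparison question raised by the Content-Type and ETag examples can be made concrete with a small Python sketch. It is not part of the original message: the names ENCODED_WORD, decode_2047, equal_octets and equal_decoded are hypothetical, and the decoder is deliberately minimal. It only handles isolated, well-formed encoded-words, and it ignores the 75-character limit and the RFC 2047 rule about whitespace between adjacent encoded-words.

```python
import base64
import binascii
import re

# Assumed shape of an RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def _decode_word(match):
    """Decode one encoded-word match to text (raises on an unknown charset)."""
    charset, encoding, payload = match.group(1), match.group(2).lower(), match.group(3)
    if encoding == "b":
        raw = base64.b64decode(payload)
    else:
        # "Q" encoding: "_" means space, "=XX" is a hex-escaped octet.
        raw = binascii.a2b_qp(payload, header=True)
    return raw.decode(charset)


def decode_2047(value):
    """Replace each well-formed encoded-word in value; leave everything else alone."""
    return ENCODED_WORD.sub(_decode_word, value)


def equal_octets(a, b):
    """Octet-by-octet comparison of the header values as sent on the wire."""
    return a.encode("latin-1") == b.encode("latin-1")


def equal_decoded(a, b):
    """Comparison after RFC 2047 decoding, i.e. character-by-character."""
    return decode_2047(a) == decode_2047(b)


if __name__ == "__main__":
    # The two readings disagree about whether these ETags match.
    print(equal_octets('"=?utf-8?q?asdf?="', '"asdf"'))
    print(equal_decoded('"=?utf-8?q?asdf?="', '"asdf"'))
    # This sketch passes the malformed '"=?"' through unchanged; a stricter
    # parser could just as reasonably reject it.
    print(decode_2047('"=?"'))
```

Under the octet-by-octet reading the two ETags differ, while under the decode-then-compare reading they match, which is exactly the ambiguity described above. The same holds for the charset parameter example.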
Received on Friday, 14 March 2008 12:31:58 UTC