- From: Yutaka OIWA <y.oiwa@aist.go.jp>
- Date: Tue, 06 Sep 2011 11:45:53 +0900
- To: HTTP Working Group <ietf-http-wg@w3.org>
Dear all,

I have a question about the current definition of quoted-string in p1-16.

In section 1.2.2, OWS is defined as

    *( [ obs-fold ] WSP )

which allows a null-string match. The text in the same section specifies that non-null OWS can be transformed to a single SP within field-content.

At the same time, in section 3.2.3, quoted-string is defined as

    DQUOTE *( qdtext / quoted-pair ) DQUOTE

where qdtext contains a single OWS. (Note that quoted-string is to be used inside field-content.)

This has two unwanted consequences:

1) Using the null-allowing OWS inside the infinitely repeating qdtext needlessly gives every quoted-string infinitely many possible parses.

2) As a single OWS can consume two or more spaces at once, and as a side effect of the OWS canonicalization in Sec. 1.2.2, consecutive spaces in a quoted-string may be reduced to any (non-zero) number of spaces not more than the original. For example, five spaces may be parsed as ("  ": OWS) (" ": OWS) ("  ": OWS), which may be reduced to three spaces after the "Sec. 1.2.2 SHOULD rule" is applied.

This introduces unwanted ambiguity and hurts interoperability, especially with hash-based or crypto-based authentication. It also contradicts what people expect of "quoting".

So I propose changing the definition of qdtext in section 3.2.3 as follows:

from

    qdtext = OWS / %x21 / %x23-5B / %x5D-7E / obs-text

to

    qdtext = [ obs-fold ] WSP / %x21 / %x23-5B / %x5D-7E / obs-text

(or add parentheses if clarification is needed).

As Section 3.2.1 separately specifies the reduction of obs-fold to either one or two SPs, this still allows removal of line folding within a quoted-string. I think the interoperability problem with line-folded quoted-strings is negligible (because a line-folded quoted-string is a very bad thing anyway).

P.S.1: The literal reading of the "3.2.1 obs-fold rule" says that a line folding "CR LF SP" should be reduced to either "SP SP" or "SP SP SP" (because the CRLF is replaced by SP and the SP that follows it remains). Is this correct and intended? I guess the intention of the first alternative is "SP" instead of "SP SP". As obs-fold is guaranteed to be followed by WSP, it can simply be removed instead of being replaced with a single SP. (This is a minor issue, however, because in many places it will be further reduced by the OWS/RWS reduction rule in Sec. 1.2.2.)

# I personally prefer obs-fold to be defined as "CRLF 1*WSP" because
# it clearly says that a continued line must start with spaces,
# but changing it at this moment seems inappropriate,
# since adoption requires rewriting many rules that use obs-fold.

P.S.2: The proposed change above affects the handling of HT in a quoted-string. Is it better to reduce it to SP or to keep it as HT?

-- 
Yutaka OIWA, Ph.D.
Research Scientist
Research Center for Information Security (RCIS)
National Institute of Advanced Industrial Science and Technology (AIST)
Mail addresses: <y.oiwa@aist.go.jp>, <yutaka@oiwa.jp>
OpenPGP: id[995DD3E1] fp[3C21 17D0 D953 77D3 02D7 4FEC 4754 40C1 995D D3E1]
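Illustration of consequence 2) in the message above: the following is a minimal Python sketch, not part of the draft or of any HTTP library. It enumerates the ways a run of spaces inside a quoted-string can be split into consecutive non-empty OWS matches, and the space counts that can result if each match is then collapsed to a single SP. The function names (ows_splits, reduced_lengths, proposed_splits) are hypothetical, and the sketch assumes a run made only of SP characters, ignoring empty OWS matches, obs-fold, and HT.

```python
from itertools import combinations


def ows_splits(run):
    """Yield every way to cut `run` into consecutive non-empty pieces.

    Each piece stands for one qdtext that matched a non-empty OWS.
    (Hypothetical helper; empty OWS matches are ignored on purpose,
    since they would make the enumeration infinite.)
    """
    n = len(run)
    for k in range(n):  # k = number of cut points between characters
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [run[a:b] for a, b in zip(bounds, bounds[1:])]


def reduced_lengths(run):
    """Space counts reachable if every OWS piece becomes a single SP."""
    return sorted({len(pieces) for pieces in ows_splits(run)})


def proposed_splits(run):
    """Splits allowed when each qdtext matches exactly one WSP."""
    return [p for p in ows_splits(run) if all(len(x) == 1 for x in p)]


if __name__ == "__main__":
    five = " " * 5
    print(len(list(ows_splits(five))))   # number of distinct parses
    print(reduced_lengths(five))         # possible lengths after reduction
    print(len(proposed_splits(five)))    # parses under the proposed qdtext
```

Under these assumptions, a run of n spaces has 2^(n-1) such splits with the current qdtext, while the proposed qdtext leaves exactly one, so the number of spaces in the quoted-string is preserved.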
Received on Tuesday, 6 September 2011 02:46:27 UTC