- From: Yutaka OIWA <y.oiwa@aist.go.jp>
- Date: Tue, 06 Sep 2011 11:45:53 +0900
- To: HTTP Working Group <ietf-http-wg@w3.org>
Dear all,
I have a question about the current definition of quoted-string in p1-16.
In section 1.2.2, OWS is defined as *( [ obs-fold ] WSP ), which allows it
to match the empty string. The text in the same section specifies that
a non-empty OWS can be replaced with a single SP within field-content.
At the same time, in section 3.2.3, quoted-string is defined as
DQUOTE *( qdtext / quoted-pair ) DQUOTE, where qdtext includes a single OWS
as one of its alternatives.
(Note that quoted-string is used inside field-content.)
This has two unwanted consequences:
1) Using an OWS that can match the empty string inside the unboundedly
repeating qdtext means that every quoted-string can be parsed in
infinitely many ways, needlessly.
2) As a single OWS can consume two or more spaces at once,
and as a side effect of the OWS canonicalization in Sec. 1.2.2,
consecutive spaces in a quoted-string may be reduced to
any non-zero number of spaces not exceeding the original count.
For example, five spaces may be parsed as three consecutive OWS matches
(for instance 2 + 2 + 1 spaces), which may be reduced to
three spaces after the SHOULD rule of Sec. 1.2.2 is applied.
This introduces unwanted ambiguity and poor interoperability,
especially with hash-based or cryptographic authentication schemes.
It also contradicts what people expect of "quoting".
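To make the second consequence concrete, here is a minimal Python sketch
(not part of the draft). It enumerates the ways a run of whitespace can be
split into non-empty OWS matches and collects the lengths that remain if
each match is collapsed to a single SP. The function names are hypothetical,
and empty OWS matches are deliberately ignored, since allowing them makes
the number of parses infinite.

```python
from itertools import combinations


def ows_segmentations(run):
    """Yield every split of `run` into consecutive non-empty pieces.

    Each piece stands for one non-empty OWS match inside qdtext.
    `run` is assumed to be a non-empty string of SP/HT characters.
    """
    n = len(run)
    for k in range(n):  # k = number of cut points between characters
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [run[a:b] for a, b in zip(bounds, bounds[1:])]


def collapsed_lengths(run):
    """Lengths reachable when every OWS match is replaced by one SP."""
    return {len(pieces) for pieces in ows_segmentations(run)}


if __name__ == "__main__":
    five = " " * 5
    print(len(list(ows_segmentations(five))))  # number of distinct parses
    print(sorted(collapsed_lengths(five)))     # possible space counts
```

For a run of five spaces this finds 16 distinct segmentations, and the
collapsed result can contain anywhere from one to five spaces.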
So, I propose changing the definition of qdtext in section 3.2.3 as follows:
from
qdtext = OWS / %x21 / %x23-5B / %x5D-7E / obs-text
to
qdtext = [ obs-fold ] WSP / %x21 / %x23-5B / %x5D-7E / obs-text
(or with added parentheses, if clarification is needed).
As Section 3.2.1 separately specifies the reduction of obs-fold to either
one or two SPs, this still allows line folding to be removed within a
quoted-string.
I think the interoperability problem with line-folded quoted-strings
is negligible (because a line-folded quoted-string is a very bad thing
anyway).
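As a rough illustration, the proposed qdtext rule can be written as a
Python regular expression. This is only a sketch under stated assumptions:
obs-fold is taken to be a bare CRLF, WSP to be SP / HTAB, and obs-text to be
%x80-FF; quoted-pair is left out; and the names QDTEXT_PROPOSED and
split_qdtext are hypothetical.

```python
import re

# Proposed rule: qdtext = [ obs-fold ] WSP / %x21 / %x23-5B / %x5D-7E / obs-text
# Assumptions: obs-fold = CRLF, WSP = SP / HTAB, obs-text = %x80-FF.
QDTEXT_PROPOSED = re.compile(
    r"(?:\r\n)?[ \t]|[\x21\x23-\x5B\x5D-\x7E\x80-\xFF]"
)


def split_qdtext(body):
    """Split the inside of a quoted-string into qdtext tokens.

    Raises ValueError on anything that is not qdtext (for example a
    DQUOTE, a backslash, or a CRLF that is not followed by WSP).
    """
    tokens, pos = [], 0
    while pos < len(body):
        m = QDTEXT_PROPOSED.match(body, pos)
        if m is None:
            raise ValueError(f"not qdtext at offset {pos}")
        tokens.append(m.group())
        pos = m.end()
    return tokens


if __name__ == "__main__":
    print(split_qdtext("a     b"))   # each SP is its own token
    print(split_qdtext("a\r\n b"))   # the fold stays attached to its WSP
```

Because the alternatives begin with disjoint characters, each input has at
most one tokenization: every whitespace character is matched exactly once,
so a run of five spaces always yields five tokens.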
P.S.1:
The literal reading of the obs-fold rule in Section 3.2.1 says that a
line folding "CR LF SP" should be reduced to either "SP SP" or "SP SP SP"
(because CRLF is reduced to a single SP). Is this correct and intended?
I guess the first alternative is intended to be "SP" rather than "SP SP".
As obs-fold is guaranteed to be followed by WSP, it can simply be removed
instead of being replaced with a single SP.
(This is a minor issue, however, because in many places the result will be
further reduced by the OWS/RWS reduction rule in Sec. 1.2.2.)
# I personally prefer obs-fold to be defined as "CRLF 1*WSP" because
# it clearly says that a continuation line must start with whitespace,
# but changing it at this point seems inappropriate,
# since adopting it would require rewriting many rules that use obs-fold.
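The difference between the two readings of the obs-fold rule discussed in
this P.S. can be sketched in a few lines of Python. This assumes obs-fold is
a bare CRLF that is always followed by WSP; both function names are
hypothetical and neither is taken from the draft.

```python
def unfold_literal(value, replacement=" "):
    """Literal reading: replace each obs-fold (CRLF) with `replacement`.

    The WSP that follows the fold is kept, so "CR LF SP" becomes
    "SP SP" (or "SP SP SP" when the replacement is two SPs).
    """
    return value.replace("\r\n", replacement)


def unfold_by_removal(value):
    """Alternative reading: drop the obs-fold and keep only the WSP."""
    return value.replace("\r\n", "")


if __name__ == "__main__":
    folded = "a\r\n b"
    print(repr(unfold_literal(folded)))        # two spaces between a and b
    print(repr(unfold_literal(folded, "  ")))  # three spaces
    print(repr(unfold_by_removal(folded)))     # one space
```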
P.S.2:
The proposed change above affects the handling of HT in a quoted-string.
Is it better for HT to be reduced to SP or kept as HT?
--
Yutaka OIWA, Ph.D. Research Scientist
Research Center for Information Security (RCIS)
National Institute of Advanced Industrial Science and Technology (AIST)
Mail addresses: <y.oiwa@aist.go.jp>, <yutaka@oiwa.jp>
OpenPGP: id[995DD3E1] fp[3C21 17D0 D953 77D3 02D7 4FEC 4754 40C1 995D D3E1]
Received on Tuesday, 6 September 2011 02:46:27 UTC