Re: Semantic meaning of double quotation marks delimiting quoted-string from Henrik Nordstrom on 2007-10-29 (ietf-http-wg@w3.org from October to December 2007)

From: Henrik Nordstrom <henrik@henriknordstrom.net>
Date: Mon, 29 Oct 2007 03:06:42 +0100
To: Geoffrey Sneddon <foolistbar@googlemail.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, ietf-http-wg@w3.org
Message-Id: <1193623602.4150.108.camel@henriknordstrom.net>

On Sun, 2007-10-28 at 21:39 +0000, Geoffrey Sneddon wrote:

> Which really is the question: what are we meant to do with the  
> delimiting quotation marks in quoted-string?

Whatever suits you best for processing the value, I would say, as long as
the semantics of the grammar are preserved.

In the case of entity-tag, a parser that joins W/ and the dequoted etag
value into a single string is not performing a valid operation, as it
loses the distinction between the weak marker and the opaque-tag value,
which are distinct values in the grammar. If the parser removes quoting,
then the weakness indication and the opaque-tag value need to be stored
separately in order to properly process the entity-tag. A parser is not
allowed to remove semantic structure from the parsed data, only
quoting/escaping of parsed elements, repeated LWS, or other redundant
information that has no semantic meaning.
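
As a rough illustration of keeping the two parts separate, here is a
minimal Python sketch. The names EntityTag and parse_entity_tag are
hypothetical, made up for this example, and the regular expression is a
simplified reading of the entity-tag grammar, not a complete validator.

```python
import re
from typing import NamedTuple


class EntityTag(NamedTuple):
    """Hypothetical container: weakness flag and dequoted opaque-tag."""
    weak: bool
    opaque_tag: str


# Optional case-sensitive "W/" prefix, then a quoted-string whose body is
# either ordinary characters or backslash-escaped pairs.
_ETAG_RE = re.compile(r'(W/)?"((?:[^"\\]|\\.)*)"', re.DOTALL)


def parse_entity_tag(text: str) -> EntityTag:
    m = _ETAG_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"not an entity-tag: {text!r}")
    weak = m.group(1) is not None
    # Remove quoting only: each backslash-escaped pair becomes the bare character.
    opaque = re.sub(r"\\(.)", r"\1", m.group(2), flags=re.DOTALL)
    return EntityTag(weak, opaque)
```

With this, W/"abc" parses to (weak=True, "abc") and "W/abc" parses to
(weak=False, "W/abc"), so the two stay distinguishable. Joining W/ onto
the dequoted value would make them collide.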

> If we take UTF-8 as a string, we can escape this as a quoted-string  
> in several ways, including:
> 
> - "UTF-8"
> - "\U\T\F\-\8"

Yes, for use within a quoted-string element only. See the definition of
quoted-string for the meaning of this. And yes, the two are equal.
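
To make the equality concrete, here is a small Python sketch of
dequoting. The function name dequote is just an illustration, and the
error handling is one possible choice, not something the specification
mandates.

```python
def dequote(quoted: str) -> str:
    """Strip the delimiting quotes and resolve backslash escapes."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError(f"not a quoted-string: {quoted!r}")
    out = []
    i, end = 1, len(quoted) - 1
    while i < end:
        ch = quoted[i]
        if ch == "\\":
            # A backslash escapes exactly one following character.
            i += 1
            if i >= end:
                raise ValueError("dangling backslash before closing quote")
            ch = quoted[i]
        elif ch == '"':
            raise ValueError("unescaped quote inside quoted-string")
        out.append(ch)
        i += 1
    return "".join(out)


# Both spellings from the question carry the same value.
same = dequote('"UTF-8"') == dequote(r'"\U\T\F\-\8"')
```

Comparing the dequoted values, not the raw quoted-string text, is what
makes the two forms compare equal.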

And no, neither Accept-Charset nor Content-Type uses quoted-string for
the charset, but yes, the general parser structure of Content-Type
allows for quoted-string in parameter values.

> Now, are we meant to unescape every quoted-string we come across  
> (therefore including entity-tag), or only some?

To compare two quoted-string elements you need to dequote them, including
removing escapes. In practice it doesn't matter much, as people do not
usually escape things within quoted-string unless needed (though they
sometimes forget to when it is needed, partly due to poor specifications
that have since been fixed).

This is quite notable in, for example, Digest authentication, where
proper handling of quoted-string is required for the hashes to compute
properly, as they are based on the value as such and not on the
quoted-string representation (e.g. a login name with " or \ in it).
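
A hedged Python sketch of why this matters for Digest: the A1 input is
built from the unquoted username and realm, so the hash differs if the
escapes are left in. The helper names unq and ha1 are made up for this
example, the credentials are invented, and encoding A1 as UTF-8 is an
assumption, since the Digest specification does not fix an encoding.

```python
import hashlib
import re


def unq(quoted: str) -> str:
    """Drop the delimiting quotes and resolve backslash escapes."""
    return re.sub(r"\\(.)", r"\1", quoted[1:-1], flags=re.DOTALL)


def ha1(username_qs: str, realm_qs: str, password: str) -> str:
    """MD5 over unq(username) ":" unq(realm) ":" password."""
    a1 = f"{unq(username_qs)}:{unq(realm_qs)}:{password}"
    # Assumption: UTF-8 is used to turn A1 into bytes.
    return hashlib.md5(a1.encode("utf-8")).hexdigest()


# On the wire the login name jo"e appears as "jo\"e".
correct = ha1(r'"jo\"e"', '"example"', "secret")
# Hashing the escaped form instead gives a different digest.
wrong = hashlib.md5(rb'jo\"e:example:secret').hexdigest()
```

Here correct and wrong differ, so a client and server that disagree on
dequoting will fail to authenticate such a user.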

In theory it is also needed for ETag processing, but there it is less
noticeable, as the impact on the protocol of getting this wrong is
pretty minimal.

Regards
Henrik

Received on Monday, 29 October 2007 02:07:00 UTC