Re: Semantic meaning of double quotation marks delimiting quoted-string

On Sun, 2007-10-28 at 21:39 +0000, Geoffrey Sneddon wrote:

> Which really is the question: what are we meant to do with the  
> delimiting quotation marks in quoted-string?

Whatever suits you best for processing the value, I would say, as long as
the semantics of the grammar are preserved.

In the case of entity-tag, a parser that joins W/ and the dequoted etag
value into a single string is not performing a valid operation, as it
loses the distinction between the weak marker and the opaque-tag value,
which are distinct values in the grammar. If the parser removes quoting,
then the weakness indication and the opaque-tag value need to be stored
separately in order to be able to properly process the entity-tag. A
parser is not allowed to remove semantic structure of the parsed data,
only quoting/escaping of parsed elements, repeated LWS or other
redundant information having no semantic meaning.
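
As a rough sketch of what "stored separately" could look like (Python;
the names EntityTag and parse_entity_tag are made up for illustration and
come from no specification or library, and validation is kept minimal):

```python
import re
from typing import NamedTuple


class EntityTag(NamedTuple):
    """Hypothetical container keeping the two grammar elements apart."""
    weak: bool        # True if the "W/" weakness marker was present
    opaque_tag: str   # dequoted opaque-tag value


def parse_entity_tag(text: str) -> EntityTag:
    # The weak marker is part of the grammar, not of the quoted value.
    weak = text.startswith("W/")
    quoted = text[2:] if weak else text
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("opaque-tag must be a quoted-string")
    # Strip the delimiting quotes, then turn each quoted-pair (\x) into x.
    value = re.sub(r"\\(.)", r"\1", quoted[1:-1], flags=re.S)
    return EntityTag(weak, value)
```

With this, W/"abc" and "W/abc" stay distinguishable (weak tag "abc"
versus strong tag "W/abc"), whereas joining the marker and the dequoted
value into one string would give "W/abc" for both.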

> If we take UTF-8 as a string, we can escape this as a quoted-string  
> in several ways, including:
> - "UTF-8"
> - "\U\T\F\-\8"

Yes, for use within a quoted-string element only. See quoted-string for
the meaning of this. And yes, the two are equal.
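
To illustrate why the two are equal, here is a minimal dequoting sketch
(Python; unquote is a name I picked, not something from the spec, and it
does not check which characters are allowed inside the string):

```python
def unquote(quoted: str) -> str:
    """Return the value carried by a quoted-string (illustrative only)."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    chars = iter(quoted[1:-1])
    for ch in chars:
        if ch == "\\":
            # quoted-pair: drop the backslash, keep the next char verbatim
            ch = next(chars, None)
            if ch is None:
                raise ValueError("backslash with nothing after it")
        elif ch == '"':
            raise ValueError("unescaped quote inside quoted-string")
        out.append(ch)
    return "".join(out)


# Both representations from the question carry the same value.
assert unquote('"UTF-8"') == unquote(r'"\U\T\F\-\8"') == "UTF-8"
```

Comparison is then done on the returned values, never on the quoted
representations.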

And no, neither Accept-Charset nor Content-Type uses quoted-string
for the charset. But yes, the general parser structure of Content-Type
allows for quoted-string in parameter values.

> Now, are we meant to unescape every quoted-string we come across  
> (therefore including entity-tag), or only some?

To compare two quoted-string elements you need to dequote them, including
removing escapes, but in practice it doesn't matter much, as people do
not usually escape things within quoted-string unless needed (but
sometimes forget when needed, partly due to poor specifications, already

This is quite notable in, for example, Digest authentication, where proper
handling of quoted-string is required for the hashes to compute properly,
as they are based on the value as such and not the quoted-string
representation (e.g. a login name with " or \ in it).
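
A small sketch of the Digest case (Python; unq and ha1 are illustrative
helper names, the realm and password are made-up sample values, and
UTF-8 is my assumption for the byte encoding). In Digest, A1 is built
from the unquoted username and realm, so hashing the escaped form gives
a different result:

```python
import hashlib
import re


def unq(quoted: str) -> str:
    # Strip the delimiting quotes and undo quoted-pair escapes (\x -> x).
    return re.sub(r"\\(.)", r"\1", quoted[1:-1], flags=re.S)


def ha1(username: str, realm: str, password: str) -> str:
    # H(A1) with A1 = username ":" realm ":" password, using MD5.
    a1 = f"{username}:{realm}:{password}"
    return hashlib.md5(a1.encode("utf-8")).hexdigest()


# Login name jo"e as it would appear on the wire: username="jo\"e"
wire_username = r'"jo\"e"'
right = ha1(unq(wire_username), "example", "secret")
wrong = ha1(wire_username[1:-1], "example", "secret")  # escapes left in

assert right == ha1('jo"e', "example", "secret")
assert right != wrong
```

A side that only strips the delimiters computes the second hash, and
authentication fails for exactly those users whose name needs escaping.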

In theory it's also needed for ETag processing, but it's less noticeable
there, as the impact on the protocol of getting this wrong is pretty minimal.


Received on Monday, 29 October 2007 02:07:00 UTC