RFC 2396 revision issue: Query definition

In section 3.4. RFC 2396 says: "The query component is a string of
information to be interpreted by the resource." If the resource is
identified before the query component is interpreted, why is the query a
part of the identifier? [1] I believe the RFC 2396 revision should
redefine the query component of the URI.

I found that Jim Whitehead had the same complaint on the definition four
years ago:

[[ This implies to me that if it is to be interpreted by the resource,
it cannot also be identifying that resource.  My rationale is the
resource needs to be identified first, before the query component can be
passed to it for interpretation, hence the query component cannot be
part of the resource identifier. ]] [2]

Larry Masinter replied:

[[ I can see now how you'd come to that conclusion; it does sound that
way. But I'll claim that we didn't MEAN IT. ]] [3]

More recent posts by Mark Nottingham:

[[ mailto allows you to specify a subject, body, etc. in the query
component, which is defined by 2396 as: "...a string of information to
be interpreted by the resource." Considering other uses of queries, this
seems to fit in nicely. ]] [4]

[[ This touches on something that's been on my mind for a while. If a
query is "a string of information to be interpreted by the resource,"
isn't it the case that a URI with a query refers to a resource, rather
than just identifies one? E.g., <http://www.example.com/foo?bar=baz> is
a reference to the resource <http://www.example.com/foo>. I.e.,
shouldn't the definition of URI-Reference (rather than URI) include not
only fragments, but also queries? ]] [5]

Reply by Martin Duerst: 

[[ Definitions are often chosen on their practical value, rather than on
philosophical considerations. In this case, the URI is what you (e.g.)
send to the server, the URI Reference is what you (e.g.) put into an
attribute. ]] [6]

My ideas on redefinition: query should be "identifying the resource
within the scope of that scheme and authority" just as the path is. The
difference between the components may be in ordering: while the path
segments must be in strict order (defining the path through a
hierarchy), query segments may be in arbitrary order, like "parameters"
or "switches". Information in query segments may also be optional and
generally more detailed than the path segments [1].

As for the troubling "mailto query", no such thing exists. The "mailto"
scheme doesn't comply with the "generic URI" syntax from the section 3
of the RFC 2396. The defining document, RFC 2368, in section 2 defines
"headers" with similar syntax but unrelated to RFC 2396 "query".

Hrvoje Simic
FER, University of Zagreb, Croatia
mailto:hrvoje.simic@fer.hr
mailto:hrvoje.simic@zg.hinet.hr



[1] http://www.tel.fer.hr/users/hsimic/cuc2002
[2]
http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0180.html
[3]
http://lists.w3.org/Archives/Public/w3c-dist-auth/1998OctDec/0201.html
[4] http://lists.w3.org/Archives/Public/uri/2002Apr/0010.html
[5] http://lists.w3.org/Archives/Public/uri/2002Apr/0011.html
[6] http://lists.w3.org/Archives/Public/uri/2002Apr/0014.html

Received on Wednesday, 13 November 2002 12:12:53 UTC