Re: RFC 2396 revision issue: Query definition from Al Gilman on 2002-11-17 (uri@w3.org from November 2002)

From: Al Gilman <asgilman@iamdigex.net>
Date: Sun, 17 Nov 2002 11:17:35 -0500
To: Graham Klyne <GK@ninebynine.org>
Cc: "Mark Nottingham" <mnot@mnot.net>, "Hrvoje Simic" <hrvoje.simic@zg.hinet.hr>, <uri@w3.org>
Message-Id: <5.1.0.14.2.20021114115317.02952c60@pop.iamdigex.net>
At 11:33 AM 2002-11-14, Graham Klyne wrote:
>Al,
>
>I've no argument with what you say.
>
>The URI spec rightly deals with the common ground between client and 
>server, and in that there is no need for issues of "URIs are a language 
>for building service-variant key expressions" -- you said it:  these are 
>service variant.  Also, there's no need to deal with shades of requesters 
>intent "to discover resources by their properties than to recover a 
>resource by name".
>
>I'm not denying the existence of these other issues, just saying they're 
>not needed as a defined part of the protocol between client and server, so 
>to make their semantics part of the URI spec is unnecessary complexity.

Perhaps unnecessary, but not necessarily not beneficial.  All the common
syntax is there as an optimization.  And it was at least in the last pass
decided to put it in there.

How shall I say?  As these functions are widely used at present, it is not
clear that URI semantics without them is sufficient to sustain the
popularity of URIs as the expression of connection between distinct
transactions.

The fact that the public recognizes the pattern www.YourBusinessName.com and
can recall instances in that pattern is part of the enterprise capital of
the Web.  We need to study how this managed to take root in the collective
vernacular, and build on this as a strength, not spurn it as unnecessary.

Similarly, I let this message rot in my drafts heap for a while because I
feared I am just pushing my pet theories.  Then I realized that it matters
to the clients I represent in the WAI, specifically those who don't have the
advantage of a many-bit persistent buffer of display in their experience of
the web.

Webmasters use hierarchies indexed by their path segments to convey a
topical concept tree for the resources in their domain.  This is very
important to blind web users and others who don't get a whole lot of the
page presented to them at once.  The idea that users are only following
hyperlinks built into the page, and not steering by a concept of a topic
tree clearly breaks down in this case.  And teh blind visitor is just acting
as the canary in the mineshaft in this case.  It is a bad idea in general,
that reaches a failure threshold sooner in the reduced context of the screen
reader user.

Part of what I have to campaign webmasters to do is precisely the above, to
make their whole site work as much like an ANSI/NISO Z39.86 digital talking
book as they can, with regard to hierarchical topical organization and
navigation through this conceptual framework.

So hierarchical indexing within edited corpora, that is to say the corpus of
stuff that is maintained and published in a domain by a common volitional
entity (person, corporation, or even a non-legal entity such as the W3C that
intends to tell a coherent story) is a asset, a quality that the Web should
systematically provide tools for.

Hierarchical indices and tuple-space queriable models are commodity classes
of information request, and the institutionalization of these in the URI
language is a strong play by the Web and should be spurned only on very
strong reason.

It is possible to reduce the URI language to a pure point space with a
completely trivial topology.  But the Web loses when we do this.

Al

-- notes from earlier...

Also, you are only looking at the machine processing of URIs.  URIs are
handled by both people and machines, and optimizing their effectiveness in
both domains is worth paying attention to.  Even where they are processed by
machines, the code that constructs the query and the code that responds to
the query have to be writeable by different people in different
organizations or we are not providing the interoperation platform we thought
we were.  So the effectiveness of the URI class-action contract [inter-tier
contract] as harmonizing and normalizing a query language should be on the
table for consideration.

I am afraid at this point we have clearly exposed a fundamental difference
in meta-architectural premises.

You presume that an interface specification should define one level of
interpretation as narrowly as possible and get off.

 From my experience with open systems for software, I believe that the
thinness of the interface specifications has been the recurring failure mode
of attempts to create open frameworks by interface specifications.  In my
model of workable open technology, each interface specification has to reach
into either side of the interface and yoke the processing on the two sides
by some mutual constraints on how the messages crossing the frontier are
processed (here vs. there).  And that this means binding a fixed symbol form
at the point of passage to *a range of* levels of understanding.  Not one.
The optimization criterion is right-sizing the swath that this cuts into the
two sides of the dotten line.  Not minimizing it.

So it's important for the market penetration and sustained use of URIs that
they not only separate processing into two worlds, but that they also
connect the processing in material ways.  In the case of URIs, the "Dublin
Core" of semantics that people are re-using across the preponderance of
sites include access to hierarchy through the path part and access to tuple
spaces through the query part.  Cleaning these up as semantic utilities for
service providers is appropriate sustaining engineering for the URI class of
utterances.

If the HTTP and HTML Web is not simply to become the virtual terminal device
of the SOAP internet, we need to invest in the web service description
functionality of URIs and not act as though it is extraneous.

Al

>#g
>--
>
>At 11:19 AM 11/14/02 -0500, Al Gilman wrote:
>
>>At 09:25 AM 2002-11-14, Graham Klyne wrote:
>>
>>>I assumed that the query was semantically indistinguishable from the 
>>>rest of the path -- part of what Hrvroje said? -- how they are 
>>>interpreted is entirely up to the software the provides access to 
>>>resources for the indicated authority.  Anything more, I think, is just 
>>>a convenient convention, and not bound to anything fundamental in the 
>>>nature of form of URI.
>>>
>>>E.g. I think a web server could legitimately serve *files* with names 
>>>(and corresponding URIs) that just happened to contain substrings that 
>>>look like URI query elements.
>>>
>>>Hence, in the general case, reordering of queries must be regarded as 
>>>significant (i.e. different URIs), even though some severs may choose to 
>>>return values that are invariant under reordering of same.
>>
>>We have to look at URIs from two sides:
>>
>>On the client side, it is a nearly-opaque key string.  It has articulable
>>parts, at least up to scheme + everythingButTheScheme.  The usage of the
>>?query part syntax is consistent [across both http: and mailto: schemes]
>>in creating another consistently modal and modally consistent zone in the 
>>URI.
>>
>>On the server side, however, URIs are a language for building service-variant
>>key expressions.  Our URI thingie has to work as a language for this
>>purpose.  In a sense they're all expressions in a query language.
>>
>> From the server side view, at least, it is important to note the following
>>difference between path segment parameters and the searchPart following the
>>'?'.   A path segment parameter is an aside; you can after this assert 
>>another
>>parameter or come back and assert another path segment.  When you hit the
>>'?' you know that there are no more path segments to come.  You can close
>>positional association and be assured of named association for the factoids
>>that follow.
>>
>><end of that thread/>
>>
>>At a yet higher level, it should be realized that from a semantic
>>perspective, when one uses a search URL it is more to discover resources by
>>their properties than to recover a resource by name.
>>
>>The properties used here are colloquially-based, not controlled by the
>>creator of the units of resource discovered.
>>
>>The sense of a query is invariant, regardless of whether it is presented to
>>Alta Vista or Excite.  Or at least more same than different.  Different
>>services may satisfy the query better and worse, but the question is
>>essentially the same.
>>
>>Al
>>
>>>#g
>>>--
>>>
>>>At 11:34 AM 11/13/02 -0800, Mark Nottingham wrote:
>>>
>>>>It would be nice if this could be addressed in some fashion. Thanks for
>>>>bringing it up, and summarizing the history so nicely (I wasn't aware of
>>>>the previous discussion).
>>>>
>>>>
>>>> > My ideas on redefinition: query should be "identifying the resource
>>>> > within the scope of that scheme and authority" just as the path is. The
>>>> > difference between the components may be in ordering: while the path
>>>> > segments must be in strict order (defining the path through a
>>>> > hierarchy), query segments may be in arbitrary order, like "parameters"
>>>> > or "switches". Information in query segments may also be optional and
>>>> > generally more detailed than the path segments [1].
>>>>
>>>>Those feel like guidelines more than hard semantics; IIRC, the main
>>>>distinction between URI path segments and URI parameters is that
>>>>parameters aren't ordered, so that aspect doesn't distinguish queries.
>>>>
>>>>Perhaps what does distinguish queries is that while they are used in
>>>>identifying the resource, they aren't used directly in
>>>>locating/dereferencing it; just as fragment identifier semantics are
>>>>interpreted on the client side in the scope of the resource's
>>>>representation, so queries are interpreted on the server side in the scope
>>>>of the located resource (which may be a new concept).
>>>>
>>>>I haven't worked through the ramifications of this; just food for thought.
>>>
>>>-------------------
>>>Graham Klyne
>>><GK@NineByNine.org>
>>
>>-------------------
>>Graham Klyne
>><GK@NineByNine.org>
Received on Sunday, 17 November 2002 11:18:32 UTC