Re: ERB call on addressing from Michael Sperberg-McQueen on 1997-03-28 (w3c-sgml-wg@w3.org from March 1997)

From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
Date: Thu, 27 Mar 97 19:09:47 CST
To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
Message-Id: <199703280148.UAA06240@www10.w3.org>
On Thu, 27 Mar 1997 19:39:05 -0500 Gavin Nicol said:
> ... Here, you are actually asking for a standard
>tree query and transformation language to be supported by
>all servers.

This view seems to come up frequently (sometimes in the formulation
"server owners think the syntax of the query segment belongs to
them, so we can't specify it"), and it makes no sense to me, so let
me ask the dumb question.  What do you mean?

If we specify a way of translating TEI extended pointer notation
into a URL, either into the query segment, or into the URL proper,
in what way are we saying this is something to be supported by all
servers?

Why aren't we saying "here is a language which, if the server
supports it, you can use, and which, if your users want it, your
server can be made to support"?

Suppose we were to say (this is not a proposal, though I wouldn't
mind if it made enough sense to become one -- actually, forms a,
b, and c below *are* the forms the ERB proposes to define, if I
understand our decision right).

1 An XML-Link locator can include a TEI Extended Pointer in any of
the following ways:

  a.  in the query section:
      http://www.uic.edu/x/y/z.xml?/tei/id(p23)child(1,emph)
  b.  in the fragment identifier the same way
      http://www.uic.edu/x/y/z.xml#/tei/id(p23)child(1,emph)
  c.  in the 'indeterminate form' this way
      http://www.uic.edu/x/y/z.xml�/tei/id(p23)child(1,emph)
  d.  in the URL-proper form this way
      http://www.uic.edu/x/y/z.xml/teiq/id(p23)/child(1,emph)

2 The query form and the fragment-identifier form are handled in the
customary way; the other two forms require special knowledge on the
part of the client and/or server, and negotiations outside the scope
of this spec:

  * the client sends the query form (a) to the server in
its entirety and gets back exactly what was pointed at(1);
  * for locators of form b, the client strips off the fragment
identifier, sends the part before the '#' to the server, and uses
the rest to navigate in the document sent back
  * when the indeterminate form c is used, the client and the server
negotiate using some method outside the scope of this standard
to decide exactly what the server returns and what the client must
do to it by way of navigation afterwards
  * when the URL-proper form d is used, something else happens ...

(1) a clever client could analyse the query and send part of it to
the server, retaining the rest to guide local navigation after
the document / document fragment is received.  Be careful, implementors:
don't leave yourself holding a query beginning ANCESTOR ...

---

If we said something like that, in what way would we be saying that
*all* servers *have* to support TEI queries?  A server that doesn't
support them would not respond usefully to queries other than
the fragment-identifier form.  Nu?  If I have documents there, I
won't use locators of forms a, c, or d -- any more than I can use
CGI-based URLs on a server where I don't have write permission in a
directory the server will execute CGI scripts from.

I'm also puzzled because I thought, some time ago, that Gavin was
arguing vociferously for a locator format that could, in principle,
be used to trigger server-side tree-traversal (there!  I managed to
avoid saying whether it is a query or an address!).  But you now
seem to be arguing against an ERB decision the main point of which
seems to me to be to support such a locator format.  What is wrong
with formats a, c, or d in this regard?

In the short run, I'd like to be able to exploit existing software as
much as possible; that means, I think, that I'd like to be able to
write locators using '#', for use with XML-savvy browsers, and
other locators using '?', for use when I know the server in question
is XML-savvy.  The '�' form ('�' is an excluded character, by the way;
are we sure we want to use it?) will be great, if and when there are
clients and servers who support XML-Link/TEI queries, and who are
willing to negotiate (in a format yet to be decided!) about who is
going to do what.  I'd rather not wait, however, for that to be
possible -- hence my preference for the ERB's suggestion of defining
all three forms.

I think Gavin has argued in the past for incorporating the whole
thing into the URL pathspec, using something not unlike the fourth
form, if I understood him correctly.  Two questions:  (1) is form d
something along the right lines?  and (2) what are the advantages and
disadvantages of a form like d (or of a form such as you propose, if
not d-like) vis-a-vis forms a, b, and c?

Seems to be my day for asking ignorant questions.

-C. M. Sperberg-McQueen
Received on Thursday, 27 March 1997 20:48:18 UTC