Addressing.... from Gavin Nicol on 1997-01-08 (w3c-sgml-wg@w3.org from January 1997)

From: Gavin Nicol <gtn@ebt.com>
Date: Wed, 8 Jan 1997 02:05:17 -0500
To: dgd@cs.bu.edu
CC: w3c-sgml-wg@www10.w3.org
Message-Id: <199701080705.CAA04417@nathaniel.ebt>
David Durand manipulated electrons to produce:
>    If we want to put sub-resource (ie fragment of URL return value)
>addressing into XML, the only way that can be put into a URL is via the #
>string. This is nice because we have a way to point at IDs, (or arbitrary
>attribute values), or arbitrary XML-dependent substructures.

Quite correct, though the emphasis is on _sub-resource_ addressing,
which is different to _resource_ addressing.
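
As a rough illustration of that distinction, here is a minimal Python sketch. Both URLs are hypothetical, and the path-style form is only an assumed stand-in for a resource address, not the syntax actually proposed. The split mirrors what an HTTP client does before making a request: the #-string stays on the client, so the server only ever sees the base URL.

```python
from urllib.parse import urldefrag

def request_target(url):
    """Split a URL into (part sent to the server, part kept by the client)."""
    base, fragment = urldefrag(url)
    return base, fragment

# Hypothetical examples: the same element addressed two ways.
frag_style = "http://example.org/doc.xml#chapter3"   # sub-resource addressing
path_style = "http://example.org/doc.xml/chapter3"   # resource addressing (assumed form)

for url in (frag_style, path_style):
    sent, kept = request_target(url)
    print("server sees:", sent, "| client keeps:", repr(kept))
```

With the fragment form the server is asked for the whole of doc.xml; with the path form the address of the element itself reaches the server.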

>The kinds of feature that we are talking about (like TEI location
>ladders) will not be useful if they depend on special servers. 

That is quite open to debate.

>I don't think XML addressing formats should require the use of a
>special server,  which Gavin's proposal would require. 

My proposal would require, at a minimum, an XML processor capable of
parsing a well-formed instance, creating a tree from it, and then
traversing/querying the tree. This could easily be done as a CGI
script, and I think that writing the software required to do this
would add very little to the cost of implementing an XML processor. I
could certainly write it in 2 weeks, from scratch, in C/C++/Java.
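
To give a sense of scale, the parse/build-tree/traverse step might look something like the following Python sketch. The addressing form used here (a list of 1-based child positions from the root) is an assumption made purely for illustration, as are the function name `resolve` and the sample document; a CGI script would wrap roughly this much logic.

```python
import xml.etree.ElementTree as ET

def resolve(xml_text, steps):
    """Parse xml_text, then follow 1-based child positions down from the root.

    Returns the serialized element, or None if any step is out of range.
    """
    node = ET.fromstring(xml_text)          # parse a well-formed instance into a tree
    for step in steps:
        children = list(node)               # direct child elements of the current node
        if not 1 <= step <= len(children):
            return None
        node = children[step - 1]
    return ET.tostring(node, encoding="unicode")

# Hypothetical document: second <p> of the first <chap>.
doc = "<book><chap><p>one</p><p>two</p></chap><chap><p>three</p></chap></book>"
print(resolve(doc, [1, 2]))
```

Only the selected element is serialized and returned, so the rest of the entity never has to cross the wire.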

I object strenuously to *requiring* that an entity be retrieved in its
entirety in order to transclude a single element. Points to remember:

  1) You are talking about special code in the client, which would be
     easily comparable to the complexity of the code in a server.
  2) Any instance/entity that is small enough to be transmitted across
     the internet will not incur a great parse/traversal overhead on
     the server: certainly no greater than that required on the client
     side. 
  3) With most relative addressing schemes, only a given entity needs
     to be parsed, not the entire document.
  4) With the scheme I proposed, each URL is unique, and so can be
     cached effectively by caching proxies.
  5) The only scalable solution is to do it server-side, or as
     distributed objects (basically the same thing, just different
     protocol and resolution mechanism).
  6) The mechanism I propose is easily applicable to domains other
     than XML.
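
Point 4 can be sketched with a toy cache keyed on the request URL, which is all a caching proxy sees. This is a simplified Python illustration: the `ProxyCache` class, the `origin` function, and the URLs are all hypothetical, and real proxies also honour expiry and validation headers.

```python
class ProxyCache:
    """Toy cache keyed on the request URL, as a caching proxy would see it."""

    def __init__(self, fetch):
        self.fetch = fetch        # callable standing in for the origin server
        self.store = {}
        self.origin_hits = 0      # how many requests actually reached the origin

    def get(self, url):
        if url not in self.store:
            self.origin_hits += 1
            self.store[url] = self.fetch(url)
        return self.store[url]

def origin(url):
    # Hypothetical origin server: returns just the addressed element.
    return "element for " + url

cache = ProxyCache(origin)
cache.get("http://example.org/doc.xml/1/2")
cache.get("http://example.org/doc.xml/1/2")   # repeat request, served from the cache
cache.get("http://example.org/doc.xml/2/1")   # different element, different cache entry
print(cache.origin_hits)
```

Because each element has its own URL, each one gets its own cache entry, and a repeated request for the same element does not reach the origin again.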
 
>On the other hand, a client could recognize that a particular
>#-string could be resolved by a particular server if it wanted, and
>translate the URL. 

I do not object to fragment specifiers, but this argument is
specious. You could just as easily say that a client could recognise
that it could retrieve the entire entity, and then walk its own parse
tree based on the URLs I propose. 

>So, _if_ we want a URL to be able to encode locations within documents,
>I think we have to use the #-string. 

I do not think this conclusion necessarily follows from the logic
outlined above.

Again, I do not object to fragment specifier use, but I do object
to it being the only thing we can use. It does not scale. Worse, it
would preclude using XML with servers such as DynaWeb/DynaBase that
generate content dynamically, and may not even have the entity
structure left for you to address. In this case, we would be forced to
send the source in its entirety, however many MB that is, or *fake* an entity
structure (easily done for DynaWeb, *much* harder for other types of
databases). 

I seriously hope your objection to "special servers" doesn't mean that
you think my motivation lies in the fact that I wrote DynaWeb, and
wish to promote it... my motivation lies in trying to avoid a solution
that doesn't scale well, and doesn't easily permit use of servers that
do not have XML files lying around on them (such as RDBs, etc.).
Received on Wednesday, 8 January 1997 02:09:32 UTC