- From: David G. Durand <dgd@cs.bu.edu>
- Date: Wed, 8 Jan 1997 13:12:02 -0500
- To: w3c-sgml-wg@www10.w3.org
I spent quite a while rethinking everything to see if I am missing something, and I am still not convinced by Gavin's arguments. First I will try a coherent restatement of what I think I'm saying, in case the blow-by-blow has obscured things, and then I'll answer Gavin's points. I'd also like to say that I'm in fundamental sympathy with Gavin's approach, but I don't believe that it fits the standards environment we are operating in.

1. We need to address sub-parts of XML documents. I don't think there is any disagreement here.

2. URLs are intended to be _entirely_ opaque to clients (except for fragment identifiers -- "#-strings"). I don't agree that this is a necessary decision, but I have been told in other groups that the W3C will categorically oppose any standard that sanctions such "URL-hacking". If a standard format for version identification in URLs is not acceptable, I don't see why fragment identification is so fundamental that it would be acceptable.

3. Fragment IDs are interpreted _any way_ the client wants -- this means that if we want to do URL hacking, the fragment ID is available. So a client _is_ permitted to try constructing a URL based on the fragment ID.

4. Special support on servers (even CGI scripts) is frequently harder to deploy than special client support. Some sites are so large that running even the simplest CGI scripts is subject to extreme scrutiny.

5. If a server supports special URLs such as Gavin wants, it need never generate a fragment ID; it can simply generate a special URL. Since such a server has to parse the whole document anyway, this is even pretty easy to do. So a smart server is not forced to act dumb just because fragment IDs are (even a mandatory) part of XML -- it can simply use its own superior addressing features.

6.
A client will need to be able to address subdocument parts anyway -- if linking to a 1-sentence anchor in a 2K document, it's _much_ easier to do local navigation than to bounce packets off a specialized server. So fragment IDs (or some logical equivalent) will still be needed, even if they are sometimes very sub-optimal (the 10MB document is an example).

At 2:05 AM 1/8/97, Gavin Nicol wrote:
>David Durand manipulated electrons to produce:
>> If we want to put sub-resource (i.e. fragment of URL return value)
>>addressing into XML, the only way that can be put into a URL is via the #
>>string. This is nice because we have a way to point at IDs, (or arbitrary
>>attribute values), or arbitrary XML-dependent substructures.
>
>Quite correct, though the emphasis is on _sub-resource_ addressing,
>which is different from _resource_ addressing.

Sure...

>>The kinds of feature that we are talking about (like TEI location
>>ladders) will not be useful if they depend on special servers.
>
>That is quite open to debate.

We can debate it all you want, but pragmatically, specialized servers are harder to deploy than specialized clients. I agree that in many cases (though not all) this is technically superior. But I just don't believe that such deployment will happen.

>>I don't think XML addressing formats should require the use of a
>>special server, which Gavin's proposal would require.
>
>My proposal would require, at a minimum, an XML processor capable of
>parsing a well-formed instance, creating a tree from it, and then
>traversing/querying the tree. This could easily be done as a CGI
>script, and I think that writing the software required to do this
>would add very little to the cost of implementing an XML processor. I
>could certainly write it in 2 weeks, from scratch, in C/C++/Java.
>
>I object strenuously to *requiring* that an entity be retrieved in its
>entirety in order to transclude a single element.

I believe that this is not a requirement; see points 3 and 5.
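[Editorial illustration.] The processor Gavin describes -- parse a well-formed instance into a tree, then query it for the addressed element -- is the same routine whether it runs client-side after a full retrieval (point 6) or server-side behind a CGI script. A minimal modern sketch, using today's Python standard library for brevity (the "id" attribute name and the tiny sample document are assumptions of the example, not anything from the discussion):

```python
# Sketch: resolve a fragment identifier against a parsed XML tree.
# Server-side, a CGI wrapper would return only the matched subtree;
# client-side, the same lookup drives local navigation.
import xml.etree.ElementTree as ET

def resolve_fragment(xml_text, fragment_id):
    """Return the serialized subtree whose id attribute matches, else None."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():          # walk the whole tree
        if elem.get("id") == fragment_id:
            return ET.tostring(elem, encoding="unicode")
    return None

doc = '<doc><sec id="intro"><p>Hello</p></sec><sec id="body"/></doc>'
print(resolve_fragment(doc, "intro"))  # -> <sec id="intro"><p>Hello</p></sec>
```

Note that the lookup only needs a well-formed instance, not a DTD -- which is why it adds so little to the cost of an XML processor.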
>Points to remember:
>
> 1) You are talking about special code in the client, which would be
> easily comparable to the complexity of the code in a server.

It should be exactly the same, but we are putting the work on the client coder, who is presumably more committed to XML than a server coder, for whom XML is just another data format. I just read Paul Prescod's reply, so I will leave most of the rest of the points to his excellent commentary.

>>On the other hand, a client could recognize that a particular
>>#-string could be resolved by a particular server if it wanted, and
>>translate the URL.
>
>I do not object to fragment specifiers, but this argument is
>specious. You could just as easily say that a client could recognise
>that it could retrieve the entire entity, and then walk its own parse
>tree based on the URLs I propose.

This latter solution is explicitly denigrated by the W3C. I agree that it is technically feasible, but it violates an opaqueness condition stringently held by the HTTP and URL standards people. For a host of _almost purely nontechnical reasons_ I think that clients are the place for us to concentrate our efforts.

>Again, I do not object to fragment specifier use, but I do object
>to it being the only thing we can use.

Special server-side URLs are always available, because a server can serve up anything it wants, under any URL it wants. But I don't think we can pretend to be able to _enforce_ server-based solutions on the web.

>It does not scale. Worse, it
>would preclude using XML with servers such as DynaWeb/DynaBase that
>generate content dynamically, and may not even have the entity
>structure left for you to address.

As long as you address a well-formed fragment, this should not be a problem. Your server can certainly be smart enough to translate address formats, if it is already parsing the whole document.
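[Editorial illustration.] The translation move of points 3 and 5 -- a client that knows a particular server is smart may rewrite a fragment identifier into one of that server's own URLs instead of fetching the whole entity -- can be sketched as follows. The `/cgi-bin/xmlfrag` endpoint and its query parameters are invented for the example; nothing in the discussion defines such an interface:

```python
# Sketch: a client rewriting http://host/doc.xml#frag into a
# hypothetical server-side query URL for a cooperating server.
from urllib.parse import urlsplit, urlencode

def translate_fragment_url(url):
    """Rewrite a fragment URL into a (made-up) server query URL."""
    parts = urlsplit(url)
    if not parts.fragment:
        return url  # no fragment; fetch the resource as-is
    query = urlencode({"doc": parts.path, "id": parts.fragment})
    return f"{parts.scheme}://{parts.netloc}/cgi-bin/xmlfrag?{query}"

print(translate_fragment_url("http://example.org/big.xml#chap2"))
# -> http://example.org/cgi-bin/xmlfrag?doc=%2Fbig.xml&id=chap2
```

Since fragment IDs are interpreted any way the client wants, this rewriting stays on the client's side of the opaqueness line: the URL proper is never parsed, only the #-string.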
>I seriously hope your objection to "special servers" doesn't mean that
>you think my motivation lies in the fact that I wrote DynaWeb, and
>wish to promote it... my motivation lies in trying to avoid a solution
>that doesn't scale well, and doesn't easily permit use of servers that
>do not have XML files lying around on them (like RDBs, etc.).

No, but I do think that life on the high end may make it more difficult to appreciate the limitations imposed by most server admins. We have a lot of work to do just convincing browser manufacturers to use XML -- if we don't need to sign up for more salesmanship, then we shouldn't.

-- David

I am not a number. I am an undefined character.
_________________________________________
David Durand                 dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science          \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/    \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________
Received on Wednesday, 8 January 1997 13:05:03 UTC