- From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
- Date: Fri, 28 Mar 1997 11:59:18 GMT
- To: w3c-sgml-wg@w3.org
The ERB's proposal and the clarification in MS-M's posting meet my needs very well but I am somewhat confused by some other interpretations and concerned about the requirement for server participation. I hope that the simple comments below will represent how a large number of non-rocket XML-users would like the system to behave (at least in the first instance).

One assumption seems to be that an XML file will always be mounted on a server, and that the server will be required to perform some XML-specific actions on it. This is not a requirement for much of what we wish to do - at the simplest level we wish to hold data in structured form. XML-LANG gives an application the chance to navigate this data in much better ways than previously.

These documents are not always held on a server. In fact 5000+ CD-ROMs with 'XML' are being distributed in a few days by a major chemical publisher, along with an early version of JUMBO. Most of these CD-ROMs will never be mounted on a server and I hope that that doesn't disqualify the files from being called XML. (In fact we expect people to view them using Java-enabled Netsplorer and the JUMBO classes are available on the CDROM). The great advantage (to my mind) is that the *browser* provides much of the functionality that the server might otherwise provide. The URL for the documents has the protocol 'file:' (although of course they could also be mounted on a server with protocol http:). If we are saying that XML documents can ONLY have a protocol 'http:', then I have a problem.

Our current model using MIME is often that the server's role is simply to stamp the file with the appropriate MIME type and that it is completely up to the browser/client what action it takes. So, for example, I could reasonably ask a webmaster to mount a set of CML files and to set *.cml to stamp them with text/xml. Anything further than that requires active participation of the webmaster who may not have the time/funding etc.
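
The "set *.cml to stamp them with text/xml" arrangement can be illustrated in a few lines. This is only a sketch: it assumes Python's standard `mimetypes` module as a stand-in for whatever extension table a real server uses, and the file name `molecules/aspirin.cml` is hypothetical.

```python
import mimetypes

# Stand-in for the server's extension-to-type table (an assumption for
# illustration; a real server has its own configuration file for this).
server_types = mimetypes.MimeTypes()

# The only thing asked of the webmaster: map *.cml to text/xml.
server_types.add_type("text/xml", ".cml")


def stamp(filename):
    """Return the MIME type the server would attach to a static file."""
    mime_type, _encoding = server_types.guess_type(filename)
    return mime_type


# Hypothetical file name, used only to exercise the mapping.
cml_type = stamp("molecules/aspirin.cml")
print(cml_type)
```

The server does nothing beyond this lookup; what happens to the file afterwards is entirely up to the client.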

(Note: we set high value on cooperative, unchargeable, services like mounting local mirrors of (static) resources. It is unreasonable to expect webmasters to maintain dynamic resources for free).

If I ask a webmaster to set up something of the sort:

    http://www.foo.ac.uk/bar/blort?impenetrability

this implies some real-life negotiation between me and the W/M and probably a CGI script in the first instance. The '?' implies work for the W/M. If I ask for an address of the form:

    http://www.foo.ac.uk/bar/blort.html#impenetrability

the webmaster's responsibility is simply to locate a given file (blort.html), stamp it with text/html and send it back. It's up to my browser what it does. (Indeed different *browsers* may do different things if there are zero or multiple occurrences of 'impenetrability' in the file.)

In the first instance I am sure that we shall be doing client-side navigation within documents. Anything else requires writing server-side code to do it. (Since XML is not yet a de facto standard, that will take a year or two to get accepted and for support beyond MIME-stamps to become acceptable.)

At present the only client-side navigation is through html:HREF. XML gives potentially much greater power and CML is constructed on the basis that navigation systems are essential. (The DTD is very flexible, and the precise architecture of any document is unlikely to be known beforehand, since it could be converted from any of 20 different legacy types.)

A typical question is 'retrieve all the molecules from this document'. A year ago I was using CoST, but have now added TEI syntax to my application (JUMBO). In TEI this is simply:

    DESCENDANT (ALL MOL)

So whilst file:/foo/bar.xml#DESCENDANT(ALL,MOL) is presumably outside the XML remit, I shall provide that functionality locally.

The key point of this is that the client may not know in advance what facilities a server has for XML. (This can arise when a set of 'static' documents is mirrored somewhere.
The relative addresses are all correct, but the server may lack some functionality). One option has to be that the client can retrieve the whole document and process it itself.

In message <199703280148.UAA06240@www10.w3.org> Michael Sperberg-McQueen writes:
> On Thu, 27 Mar 1997 19:39:05 -0500 Gavin Nicol said:
> > ... Here, you are actually asking for a standard
> >tree query and transformation language to be supported by
> >all servers.
>
> This view seems to come up frequently (sometimes in the formulation
> "server owners think the syntax of the query segment belongs to
> them, so we can't specify it"), and it makes no sense to me, so let
> me ask the dumb question. What do you mean?
>
> If we specify a way of translating TEI extended pointer notation
> into a URL, either into the query segment, or into the URL proper,
> in what way are we saying this is something to be supported by all
> servers?
>
> Why aren't we saying "here is a language which, if the server
> supports it, you can use, and which, if your users want it, your
> server can be made to support"?

This is my view, though not all webmasters can be 'made' to do things :-)

>
> Suppose we were to say (this is not a proposal, though I wouldn't
> mind if it made enough sense to become one -- actually, forms a,
> b, and c below *are* the forms the ERB proposes to define, if I
> understand our decision right).
>
> 1 An XML-Link locator can include a TEI Extended Pointer in any of
> the following ways:
>
>   a. in the query section:
>      http://www.uic.edu/x/y/z.xml?/tei/id(p23)child(1,emph)
>   b. in the fragment identifier the same way
>      http://www.uic.edu/x/y/z.xml#/tei/id(p23)child(1,emph)
>   c. in the 'indeterminate form' this way
>      http://www.uic.edu/x/y/z.xml/tei/id(p23)child(1,emph)
>   d. in the URL-proper form this way
>      http://www.uic.edu/x/y/z.xml/teiq/id(p23)/child(1,emph)

I don't see the difference between c and d unless tei and/or teiq are reserved words in a URL. Could this be expanded, please?
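
As an illustration of the question about c and d, the four example locators can be run through a generic URL parser. This is a sketch only, using Python's standard `urllib.parse`; the helper name `split_locator` and the `LOCATORS` table are made up for the example and are not part of any proposal.

```python
from urllib.parse import urlsplit

# The four example locators quoted above, keyed by their list letters.
LOCATORS = {
    "a": "http://www.uic.edu/x/y/z.xml?/tei/id(p23)child(1,emph)",
    "b": "http://www.uic.edu/x/y/z.xml#/tei/id(p23)child(1,emph)",
    "c": "http://www.uic.edu/x/y/z.xml/tei/id(p23)child(1,emph)",
    "d": "http://www.uic.edu/x/y/z.xml/teiq/id(p23)/child(1,emph)",
}


def split_locator(url):
    """Split a locator into the pieces a generic URL parser can see."""
    parts = urlsplit(url)
    return {
        "path": parts.path,          # what names the resource on the server
        "query": parts.query,        # text after '?', sent to the server
        "fragment": parts.fragment,  # text after '#', kept by the client
    }


for letter, url in LOCATORS.items():
    print(letter, split_locator(url))
```

For a and b the pointer lands in the query and the fragment respectively, and the path is `/x/y/z.xml` in both. For c and d the parser reports nothing but a longer path, so without some convention about `tei` or `teiq` a generic client cannot tell where the document name ends and the pointer begins.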
(BTW - if you are suggesting commas as a replacement for spaces in the Xptr that's fine by me)

>
> 2 The query form and the fragment-identifier form are handled in the
> customary way; the other two forms require special knowledge on the
> part of the client and/or server, and negotiations outside the scope
> of this spec:
>
>   * the client sends the query form (a) to the server in
>     its entirety and gets back exactly what was pointed at(1);

This seems straightforward. The client does not know what magic the server uses to locate the fragment/resource, but the result would have been the same as if the client had had the document locally and applied the Xptr. (We assume that conceptually there is a document 'z.xml')

>   * for locators of form b, the client strips off the fragment
>     identifier, sends the part before the '#' to the server, and uses
>     the rest to navigate in the document sent back

Precisely.

>   * when the indeterminate form c is used, the client and the server
>     negotiate using some method outside the scope of this standard
>     to decide exactly what the server returns and what the client must
>     do to it by way of navigation afterwards

This could handle the indeterminate form above: either the server returns the fragment or it returns the whole document with some indication that it's up to the client...

>   * when the URL-proper form d is used, something else happens ...
>
> (1) a clever client could analyse the query and send part of it to
> the server, retaining the rest to guide local navigation after
> the document / document fragment is received. Be careful, implementors:
> don't leave yourself holding a query beginning ANCESTOR ...
                                       ^^^^^^^^^^^^^^^^^^
(? or even *containing* ANCESTOR or PRECEDING? I had interpreted TEI to mean that it is possible to end up higher up the tree than where you start.)

This is very attractive and I am sure we would use it a lot.

> Seems to be my day for asking ignorant questions.

I'm in good company then... :-)

P.

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
Received on Friday, 28 March 1997 07:41:46 UTC