- From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
- Date: Mon, 28 Apr 1997 16:47:11 GMT
- To: w3c-sgml-wg@w3.org
In message <2.2.32.19970428114343.00eb85ec@jclark.com> James Clark writes:
[...]
>
> It sounds like you're using xpointers as a general-purpose processing tool

Yes I am. They're great! Much better than relational databases. But this use is just a free byproduct.

> not just as a syntax for fragment specifiers in XML simple links and
> extended links. The purpose of XML xpointers is the latter not the former.

Agreed, and I'll stick with that.

> For the former all the features that I've suggested removing are useful --
> in fact I think you need many other additional features for general
> processing -- but for the latter I'm unconvinced. Can you give real
> examples of why these features are needed in the context of fragment
> specifiers in XML xpointers in simple links and extended links in XML documents?

Yes :-). They may seem over complicated, but having been sold (by the SGML community) on the virtues of processing structured documents, it's possible to create documents which take advantage of that flexibility. So I shall be suggesting to the molecular community that we use multi-component documents. If you think of a scientific publication, you can think of it having components (graphs, tables, molecules, figures, citations, etc) and all of these will vary from publisher to publisher in order and amount. It's quite meaningful and possible to ask:

    'please extract all molecules from this paper'
    'extract all molecules which contain spectra'

(notice the word 'all'). In general we shall not know how many molecules there are in the paper or what order they come in.

>
> In fact, the more I think about it, the more xpointers seems hugely
> over-complex for use in simple and extended links. Any element can be
> addressed using an xpointer of the form
>
> ROOT,CHILD(17,A)(21,B)

This assumes that you know that you need to count 17 along. When you have no idea of the structure of the document, other than the components it may contain (but not the level they may occur at or their order), the tools that are currently suggested are ideal.

>
> or
>
> ID(FOO)CHILD(17,A)(21,B)
>
> and such xpointers seem as robust as any. Why do we need more than this? I
> can maybe see the need for HERE and ANCESTOR for links relative to the
> current node, and maybe for DESCENDANT, but I'm having a hard time seeing
> why PRECEDING, FOLLOWING, NEXT, PREVIOUS are part of the "minimum required
> to declare victory".

Sequential information can be very important in technical subjects so that it may make sense to ask questions like 'please find the measurement which refers to an injection, and return the second measurement after that'. Without the last four keywords I can't see how you can refer to a sibling element (if you don't know numbers beforehand).

I hope this doesn't mean that 'ALL' is a poor idea :-) If a document contains an unknown number of instances of items you want, and if these have no unique IDs (highly probable), I can't see how you can retrieve them all whether in order or not. When XML starts to be used for storing LINKs, the sort of question a robot will be asked is: 'retrieve all the links in this document and... do something'. Without 'ALL', the robot will have to keep a counter and resubmit incremental queries

    int link = 0;
    while (true) {
        Element e = Robot.getElement("ROOT,DESCENDANT("+(++link)+",*,XML-LINK,*)");
        if (e == null) break;
        Robot.getLink(e);
    }

and this will be messy and occur in lots of places throughout the code.

P.

--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
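
The contrast drawn in the message, between one 'ALL'-style query and a loop of counted DESCENDANT queries, can be sketched roughly as follows. This is only an illustration over a toy in-memory tree, not a real XML parser or xpointer engine: `Elem`, `LinkLocator`, `allLinks`, `nthLink`, `linksByCounting` and `sampleDocument` are hypothetical names invented here. `nthLink` stands in for a DESCENDANT(n,...) query and `allLinks` for an 'ALL'-style query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parsed element; not part of any XML library.
class Elem {
    final String name;
    final boolean hasXmlLink;   // true if the element carries an XML-LINK attribute
    final List<Elem> children = new ArrayList<>();

    Elem(String name, boolean hasXmlLink, Elem... kids) {
        this.name = name;
        this.hasXmlLink = hasXmlLink;
        for (Elem k : kids) children.add(k);
    }
}

class LinkLocator {
    // 'ALL'-style query: every descendant carrying XML-LINK, in document order.
    static List<Elem> allLinks(Elem start) {
        List<Elem> found = new ArrayList<>();
        collect(start, found);
        return found;
    }

    // Depth-first walk, visiting each child before its own descendants.
    private static void collect(Elem parent, List<Elem> found) {
        for (Elem child : parent.children) {
            if (child.hasXmlLink) found.add(child);
            collect(child, found);
        }
    }

    // Counted query: the n-th (1-based) linking descendant, or null if fewer exist.
    // Each call re-walks the tree, mimicking a freshly submitted query.
    static Elem nthLink(Elem start, int n) {
        List<Elem> found = allLinks(start);
        return (n >= 1 && n <= found.size()) ? found.get(n - 1) : null;
    }

    // The counter pattern from the message: probe until a query comes back empty.
    static List<Elem> linksByCounting(Elem root) {
        List<Elem> result = new ArrayList<>();
        int link = 0;
        while (true) {
            Elem e = nthLink(root, ++link);
            if (e == null) break;
            result.add(e);
        }
        return result;
    }

    // Invented sample: links occur at different depths and carry no IDs.
    static Elem sampleDocument() {
        return new Elem("paper", false,
            new Elem("section", false,
                new Elem("cite", true),
                new Elem("molecule", false,
                    new Elem("spectrum", true))),
            new Elem("cite", true));
    }

    public static void main(String[] args) {
        Elem doc = sampleDocument();
        System.out.println(allLinks(doc).size());
        System.out.println(linksByCounting(doc).size());
    }
}
```

Both routes return the same elements, but `linksByCounting` has to keep a counter and keep querying until it gets null, and that bookkeeping would be repeated wherever a caller wants every match.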
Received on Monday, 28 April 1997 12:17:22 UTC