Re: Another pseudo-element gotcha

From: Peter Murray-Rust <Peter@ursus.demon.co.uk>
Date: Mon, 28 Apr 1997 16:47:11 GMT
Message-Id: <6013@ursus.demon.co.uk>
To: w3c-sgml-wg@w3.org

In message <2.2.32.19970428114343.00eb85ec@jclark.com> James Clark writes:
[...]
> 
> It sounds like you're using xpointers as a general-purpose processing tool

Yes I am.  They're great!  Much better than relational databases.  But this 
use is just a free byproduct.

> not just as a syntax for fragment specifiers in XML simple links and
> extended links.  The purpose of XML xpointers is the latter not the former.

Agreed, and I'll stick with that.

> For the former all the features that I've suggested removing are useful --
> in fact I think you need many other additional features for general
> processing -- but for the latter I'm unconvinced.  Can you give real
> examples of why these features are needed in the context of fragment
> specifiers in XML xpointers in simple links and extended links in XML documents?

Yes :-).  They may seem overcomplicated, but having been sold (by the SGML
community) on the virtues of processing structured documents, I find it's
possible to create documents which take advantage of that flexibility.  So I
shall be suggesting to the molecular community that we use multi-component
documents.

If you think of a scientific publication, you can think of it as having
components (graphs, tables, molecules, figures, citations, etc.), all
of which will vary from publisher to publisher in order and amount.
It's quite meaningful and possible to ask:
	'please extract all molecules from this paper'
	'extract all molecules which contain spectra'
(notice the word 'all').  In general we shall not know how many molecules
there are in the paper or what order they come in.
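
As a minimal sketch of the two 'all' requests above, assuming the paper has been parsed into a DOM tree: the element names `molecule` and `spectrum`, the class `MoleculeExtractor`, and its helper methods are hypothetical and chosen only for illustration. The standard `getElementsByTagName` call returns every matching descendant in document order, whatever its depth, so no counts or positions need to be known in advance.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of any xpointer draft or library.
class MoleculeExtractor {
    // Parse an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // 'extract all molecules': every element with this name, at any depth.
    static List<Element> all(Document doc, String name) {
        NodeList nodes = doc.getElementsByTagName(name);
        List<Element> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            out.add((Element) nodes.item(i));
        }
        return out;
    }

    // 'extract all molecules which contain spectra': keep only those
    // with at least one descendant element named `inner`.
    static List<Element> allContaining(Document doc, String name, String inner) {
        List<Element> out = new ArrayList<>();
        for (Element e : all(doc, name)) {
            if (e.getElementsByTagName(inner).getLength() > 0) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<paper><section><molecule><spectrum/></molecule></section>"
          + "<molecule/>"
          + "<appendix><figure><molecule><data><spectrum/></data></molecule>"
          + "</figure></appendix></paper>");
        System.out.println(all(doc, "molecule").size());                       // 3
        System.out.println(allContaining(doc, "molecule", "spectrum").size()); // 2
    }
}
```

The molecules in the sample sit at three different depths, which is the situation described above: the caller names the component it wants and nothing else.
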
> 
> In fact, the more I think about it, the more xpointers seems hugely
> over-complex for use in simple and extended links. Any element can be
> addressed using an xpointer of the form
> 
> ROOT,CHILD(17,A)(21,B)

This assumes that you know that you need to count 17 along.  When you
have no idea of the structure of the document, other than the components
it may contain (but not the level they may occur at or their order), the
tools that are currently suggested are ideal.

> 
> or
> 
> ID(FOO)CHILD(17,A)(21,B)
> 
> and such xpointers seem as robust as any.  Why do we need more than this?  I
> can maybe see the need for HERE and ANCESTOR for links relative to the
> current node, and maybe for DESCENDANT, but I'm having a hard time seeing
> why PRECEDING, FOLLOWING, NEXT, PREVIOUS are part of the "minimum required
> to declare victory".

Sequential information can be very important in technical subjects so that
it may make sense to ask questions like
	'please find the measurement which refers to an injection, and return
the second measurement after that'.   

Without the last four keywords I can't see how you can refer to a sibling 
element (if you don't know numbers beforehand).
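
A rough sketch of the 'second measurement after the injection' request, written against a DOM tree rather than an xpointer: the element name `measurement`, the attribute `event="injection"`, and the class `SiblingLookup` are assumptions made for illustration only. The point is that the lookup walks following siblings from a found element, so no absolute child number is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are illustrative only.
class SiblingLookup {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // First element named `name` whose attribute `attr` equals `value`, or null.
    static Element firstWithAttribute(Document doc, String name, String attr, String value) {
        NodeList nodes = doc.getElementsByTagName(name);
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            if (value.equals(e.getAttribute(attr))) {
                return e;
            }
        }
        return null;
    }

    // The n-th (1-based) following sibling element named `name`, or null.
    // Text nodes and siblings with other names are skipped.
    static Element followingSibling(Element start, String name, int n) {
        int seen = 0;
        for (Node s = start.getNextSibling(); s != null; s = s.getNextSibling()) {
            if (s instanceof Element && ((Element) s).getTagName().equals(name)) {
                if (++seen == n) {
                    return (Element) s;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<run><measurement v='1'/><measurement v='2' event='injection'/>"
          + "<note/><measurement v='3'/><measurement v='4'/></run>");
        Element injection = firstWithAttribute(doc, "measurement", "event", "injection");
        Element second = followingSibling(injection, "measurement", 2);
        System.out.println(second.getAttribute("v")); // 4
    }
}
```
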

I hope this doesn't mean that 'ALL' is a poor idea :-)  If a document contains
an unknown number of instances of items you want, and if these have no
unique IDs (highly probable), I can't see how you can retrieve them
all whether in order or not.  When XML starts to be used for storing
LINKs, the sort of question a robot will be asked is:
	'retrieve all the links in this document and... do something'.
Without 'ALL', the robot will have to keep a counter and resubmit incremental
queries:

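int link = 0;                       // number of links retrieved so far
while (true) {
    // Ask for the next XML-LINK descendant, one query per link.
    Element e = 
        Robot.getElement("ROOT,DESCENDANT("+(++link)+",*,XML-LINK,*)");
    if (e == null) break;           // no such descendant: all links seen
    Robot.getLink(e);
}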

and this will be messy and occur in lots of places throughout the code.
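
For comparison, a sketch of what a single 'ALL'-style request could look like, again over a DOM tree: the class `LinkCollector` and its methods are hypothetical, and `nthLink` is only a stand-in for a one-element-at-a-time query like the one in the loop above. Links are recognised here by the presence of an `XML-LINK` attribute on any element type, mirroring the `*,XML-LINK,*` part of that query.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are illustrative only.
class LinkCollector {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // 'ALL' style: one pass over the document, returning every element
    // that carries an XML-LINK attribute, in document order.
    static List<Element> allLinks(Document doc) {
        NodeList every = doc.getElementsByTagName("*"); // "*" matches all elements
        List<Element> out = new ArrayList<>();
        for (int i = 0; i < every.getLength(); i++) {
            Element e = (Element) every.item(i);
            if (e.hasAttribute("XML-LINK")) {
                out.add(e);
            }
        }
        return out;
    }

    // Stand-in for a single-item query: the n-th (1-based) link, or null.
    // Each call scans the document again, as a resubmitted query would.
    static Element nthLink(Document doc, int n) {
        List<Element> links = allLinks(doc);
        return (n >= 1 && n <= links.size()) ? links.get(n - 1) : null;
    }

    public static void main(String[] args) {
        Document doc = parse(
            "<doc><a XML-LINK='SIMPLE'/><p><b XML-LINK='EXTENDED'/></p><c/></doc>");

        // Counter style: one query per link, plus a final query that fails.
        int queries = 0;
        for (int n = 1; ; n++) {
            queries++;
            if (nthLink(doc, n) == null) break;
        }
        System.out.println(queries);              // 3 queries for 2 links

        // 'ALL' style: a single request.
        System.out.println(allLinks(doc).size()); // 2
    }
}
```

With a counter, the number of queries grows with the number of links and the termination test has to be repeated wherever links are collected; the single-pass version keeps that logic in one place.
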

	P.

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
Received on Monday, 28 April 1997 12:17:22 EDT
