fragment exchange (was Re: rationales for TEI extended-pointer keywords)

> From: Peter Murray-Rust <Peter@ursus.demon.co.uk>

>... when I demonstrate this facility to people who 
> haven't seen XML before, and tell them it *comes free with the [draft] language*
> they are impressed.  It's a very strong selling point over other approaches.

I just conducted the technical half of a day-seminar on XML. They gave an ovation at
the end (because of excitement about XML, not my fascinating teaching postures :-)

They were most impressed with SHOW and ACTUATE.  I think their view was that these are
simple enough to do lots of wonderful things.  It would make sense, I guess, if XML 2.0 had
a link behaviour specification language, but I think SHOW and ACTUATE provide nice, generic
defaults and they should *definitely* be kept in XML 1.0.

They asked what gets returned by ID(x1)..ID(x2) when these are the IDs of elements in
different branches of the element tree: does just the text get returned, or does a clipped tree
get returned, or what? If text is returned, is it XML? I didn't know.  Any ideas yet?

One thing to do (I guess Eliot "Dr Fragment" Kimber is more on top of the issues) would be:

<?xml rmd="all" ?>
<!doctype document system [...]>
<document id="d1">
<foo>blah blah blah<bar id="x1"/> blah</foo>
<foo>blah blah blah<bar id="x2"/> blah</foo>
</document>

traversing a link with the pointer ID(x1)..ID(x2) would get a document like this:

<?xml rmd="internal"?>
<!doctype document system [...]>
<document xml-role="fragment" id="d1">
<foo><?xml xml-frag="start"?><bar id="x1"/> blah</foo>
<foo>blah blah blah<bar id="x2"/><?xml xml-frag="end"?></foo>
</document>

I don't see how we can have .. between branches without also specifying some fragment
exchange conventions, since otherwise the returned XML would be neither well-formed nor valid.
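
A minimal sketch of how a server might build such a clipped tree, in Python. The
function name clip_range and the marker target "xml-frag" are hypothetical (a separate
target is used here because current parsers reserve the target "xml"); the sketch
assumes the two IDs are unique, that the start element precedes the end element in
document order, and that neither is the root.

```python
import xml.etree.ElementTree as ET

def clip_range(root, start_id, end_id):
    """Copy root, keeping only content from start_id through end_id.

    Ancestors of the kept content survive, so the result stays well-formed.
    """
    phase = "before"  # "before" -> "inside" -> "after", in document order

    def visit(elem):
        nonlocal phase
        out = []
        phase_on_entry = phase
        if phase == "before" and elem.get("id") == start_id:
            phase = "inside"
            out.append(ET.ProcessingInstruction("xml-frag", "start"))
        copy = ET.Element(elem.tag, dict(elem.attrib))
        if phase == "inside":
            copy.text = elem.text          # leading text is in range
        for child in elem:
            copy.extend(visit(child))
            if phase == "inside" and len(copy):
                copy[-1].tail = child.tail  # text after the child is in range
        if phase_on_entry == "after" or phase == "before":
            return []                       # element lies wholly outside the range
        out.append(copy)
        if phase == "inside" and elem.get("id") == end_id:
            phase = "after"
            out.append(ET.ProcessingInstruction("xml-frag", "end"))
        return out

    kept = [n for n in visit(root) if n.tag == root.tag]
    if not kept or phase != "after":
        raise ValueError("start and end IDs not found in document order")
    kept[0].set("xml-role", "fragment")
    return kept[0]

source = ('<document id="d1">'
          '<foo>blah blah blah<bar id="x1"/> blah</foo>'
          '<foo>blah blah blah<bar id="x2"/> blah</foo>'
          '</document>')
fragment = clip_range(ET.fromstring(source), "x1", "x2")
print(ET.tostring(fragment, encoding="unicode"))
```

The point of keeping the ancestors is that the receiver gets something it can parse
as an ordinary document, with the markers showing where the real range begins and ends.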

Another approach would be an index into the actual characters of the document (RS = nominal record
start, roughly the start of a line). Let's assume that fragments are always generated by computer,
that no transcoding goes on that changes the number of characters, and that any
header adjustments get done before the doctype.  The document could then be pruned of
data outside the range, and of any optional elements outside it, if the
implementor felt like it, for efficiency reasons.

<?xml   rmd="all"?>
<?xml   frag-start="line doctype+2, char RS+19"    frag-end="line doctype+3, char RS+33" ?>
<!doctype document system [...]>
<document id="d1">
<foo>blah blah blah<bar id="x1"/> blah</foo>
<foo>blah blah blah<bar id="x2"/></foo>
</document>
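
A rough sketch of how the offsets for such a header could be computed, in Python.
The helper names locate and frag_pi are hypothetical, and the counting convention is
an assumption: lines are counted from the doctype line, characters are zero-based
from the start of the line, the range starts at the "<" of the start element's tag
and ends just after the ">" of the end element's tag. It also assumes both endpoints
are empty elements carrying a double-quoted id attribute.

```python
import re

def locate(text, offset):
    """Map an absolute offset to (lines after the doctype line, chars after RS)."""
    line = text.count("\n", 0, offset)
    line_start = text.rfind("\n", 0, offset) + 1
    doctype_line = text.count("\n", 0, text.lower().index("<!doctype"))
    return line - doctype_line, offset - line_start

def frag_pi(text, start_id, end_id):
    """Build the frag-start/frag-end header for the range start_id..end_id."""
    def tag_span(wanted):
        # Find the tag that carries id="wanted"; naive, but enough for a sketch.
        match = re.search(r'<[^<>]*\bid="%s"[^<>]*>' % re.escape(wanted), text)
        if match is None:
            raise ValueError("no element with id %r" % wanted)
        return match.span()

    start_line, start_char = locate(text, tag_span(start_id)[0])
    end_line, end_char = locate(text, tag_span(end_id)[1])
    return ('<?xml frag-start="line doctype+%d, char RS+%d" '
            'frag-end="line doctype+%d, char RS+%d"?>'
            % (start_line, start_char, end_line, end_char))

doc = ('<?xml rmd="all"?>\n'
       '<!doctype document system [...]>\n'
       '<document id="d1">\n'
       '<foo>blah blah blah<bar id="x1"/> blah</foo>\n'
       '<foo>blah blah blah<bar id="x2"/> blah</foo>\n'
       '</document>\n')
print(frag_pi(doc, "x1", "x2"))
```

Because the offsets are relative to the doctype line, inserting this header above the
doctype does not shift them, which is why the header adjustments have to happen there.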

If we don't define a fragment convention, I think the xml-link spec will have to say that ranges
that span branches are only possible on the client, i.e. with "#" not "?XML-LINK=".

> [Of course they ask about regexp :-)].

No. Mine didn't: they were commercial HTML & SGML suppliers, mainly from the legal & CALS areas,
and they were more interested in users being able to access information only by the elements
the providers had marked up. They generally don't want users to play with the data.
 
Rick Jelliffe

Received on Thursday, 12 June 1997 23:53:43 UTC