Re: Link-3: Sets, Singletons, and Determinism

At 2:29 PM +0000 5/18/97, Peter Murray-Rust wrote:
>In message <3.0.32.19970518111704.00b18a30@pop.intergate.bc.ca> Tim Bray
>writes:
>My understanding of spans (or at least the '..' syntax) is that while
>it may return two *locations* it only returns one *element*.  Here is how
>JUMBO might interpret it:
>
><MOL ID="H2O2">
><ATOM>H</ATOM><ATOM>O</ATOM><ATOM>O</ATOM><ATOM>H</ATOM>
></MOL>
>
>'ID(H2O)CHILD(1,ATOM)..CHILD(3,ATOM)'
>
>would return
>(b)
><ATOM>H</ATOM><ATOM>O</ATOM><ATOM>O</ATOM>
>which might be valid, but is a set of elements from the original document.
>If this is what is required, direct querying would be preferable.

As I said in my response, I think it should return pointers to two
locations in the stream. Spans are only really useful if they can point to
arbitrary points in a document (i.e. not even well-formed element trees).
The mental model shoud be selections in an editor:

So if you had the following:
<MOL ID="H2O2">
<ATOM>H</ATOM><ATOM>O</ATOM><ATOM>O</ATOM><ATOM>H</ATOM>
</MOL>
<MOL ID="H2O">
<ATOM>H</ATOM><ATOM>O</ATOM><ATOM>H</ATOM>
</MOL>

and the span:
'ID(H2O2)CHILD(3,ATOM)..ID(H2O)CHILD(1,ATOM)'

you would be selecting a _region_ that overlaps two molecules.

To process it, you might ask to see the ESIS stream representing the events
between those two points. You would not get an element or a set of
elements, but instead something that might be repesented as 2 arrows
pointing at a tree, or the following string:

<ATOM>H</ATOM>
</MOL><MOL ID="H2O">
<ATOM>H</ATOM>


>'ID(H2O)..CHILD(3,ATOM)'
>
>is more problematic.  It could return
>
>(a) an error, since it could be interpreted as not being well formed.

Yes, there is no stuff linearly between a tree and some part of its contents.

>(b) a WF element created by unstacking the unbalanced closing tags:
><MOL ID="H2O2">
><ATOM>H</ATOM><ATOM>O</ATOM><ATOM>O</ATOM>
></MOL>
>This is dangerous as it creates a new apparently viable element (in the
>above case it's a chemical transformation, which clashes with the 'ID').

Yep... You shouldn't get a guarantee of elements from span.

>For this reason I have not implemented '..' because the syntax is not clear
>and potentially dangerous.

Seems clear to me. An endpoint should not be a sub-part of the starting
point. It just doesn't make sense. We probably do need to state this
explicitly.

>> If we are going to allow spans, and thus an xpointer to return N
>> locations, where N>1, should we consider saying that all xpointers
>> return sets of objects, and sometimes the size of the set is 1?  This
>> would open up a whole bunch of interesting apps.
>
>This is what JUMBO does at present.  (Note that the size of the set could
>be 0 if the search fails).
Yes, but it could be 1/2 if only a partial element is selected. Use of
spans to mark alternative markup will regularly create cases like this.
That's one of it's applications.

The problem is thinking that a span can be treated as an element, or
anything like one. It's not, it's two points, identifying a region of the
document of interest for some process.

>


  -- David

_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
Boston University Computer Science        \  Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/   \  Dynamic Diagrams
--------------------------------------------\  http://dynamicDiagrams.com/
MAPA: mapping for the WWW                    \__________________________

Received on Wednesday, 21 May 1997 14:05:03 UTC