- From: Eve L. Maler <eve.maler@east.sun.com>
- Date: Mon, 09 Oct 2000 08:45:22 -0400
- To: www-xml-linking-comments@w3.org
Michael Dyck asked for this thread to be posted to the comments list, so he could continue it. Thoughts from other commenters are, of course, welcome too.

        Eve

>Date: Wed, 04 Oct 2000 18:06:26 -0400
>To: w3c-xml-linking-ig@w3.org
>From: "Eve L. Maler" <eve.maler@East.Sun.COM>
>Cc: Michael Dyck <MichaelDyck@home.com>
>Subject: XPointer test suites
>
>Steve DeRose and Michael Dyck recently had a very productive email exchange about XPointer testing, and I wanted to bring it to people's attention. (I've gotten permission from Michael to post it here; I haven't heard from Steve yet, but I'm quite sure it will be okay, since he started copying me on the thread in order to make sure it got used somewhere!)
>
>Because the relevant messages were sequenced funny due to some email header problems on Steve's end, I've just assembled a facsimile of the thread here, and I've also tried to avoid duplication. I wasn't made privy to the very first message, so what you see starts out with Steve replying to Michael.
>
>Your thoughts, test suite contributions, and offers to build or maintain a comprehensive suite are welcome. :-)
>
>        Eve
>
> * * *
>
>[Steve DeRose responding to Michael Dyck:]
>
>Steve DeRose wrote:
> >>
> >> To go to PR you have to fulfill a set of exit requirements; we're trying to verify them now, so hopefully soon. The biggest factor is showing complete or near-complete implementations; some places don't want to talk about theirs....
> >
> >How do you know if an implementation is (near-)complete? Is there a test suite?
>
>Yes, that's just the snag; we haven't built one yet. If an implementation works on all the examples in the spec, that's considered pretty good evidence; there's also a nice set of examples in XPath, most of which would also apply (all, actually, unless there are some examples using extension functions, variables, etc.).
>
>I think there's also a specific requirement for testing at least one XPointer that returns a discontiguous set of ranges... Something like
>
>  id("foo1 bar1")/range-to(id("foo2 bar2"))
>
>or
>
>  //footnote/range-to(following::p)
>
>I'd also like to see an overlapping case (remember, range-to is defined as going from the start of the context to the end of the result -- so this would go from the start of chapter 1 to the end of its third p):
>
>  id("chap1")/range-to(p[3])
>
>Other cases that pop to mind (I got motivated, so am making slight progress toward a test suite):
>
>Multiple IDs:
>  id("foo bar baz")
>
>Multiples with further steps in simple lockstep (3 results):
>  id("foo bar baz")/p[2]
>
>Same, but where some random one drops out, as when bar has only 3 p children:
>  id("foo bar baz")/p[4]
>
>Multiples with further steps that expand the result set (here assuming at least one of foo/bar/baz contains more than one p child):
>  id("foo bar baz")/p
>
>At least one test of each axis that succeeds, and one that fails; preferably a non-trivial one.
>
>Ideally, at least one test of each ordered pair of axes:
>
>  id("foo")/following::*[10]
>  id("foo")/preceding::*[1]
>
>...and so on for all combinations.
>
>Not all pairs are allowed (like attribute::type/namespace::bar...), but they should still appear so that error-catching is tested too.
>
>A thorough test could try that same set of pairs several times: once with each rigged to fail, once with each returning a singleton, and once with each returning multiples.
>
>There should be a test for redundancy-removal, making sure that this doesn't return extra copies:
>
>  /p | /p
>
>A test should include returning points, nodes, and ranges.
>
>A test should include trying out each function and operator.
>
>A lot of these, of course, are basically XPath tests, though I don't know of a good XPath test suite yet either. If you have the time and inclination to start writing a set of cases and a test document to run them on (with something that states the expected results...), I'm sure a lot of people would be happy to help suggest new cases to add, try it out, etc.... Plus you'd make the XSL and Linking Working Groups feel indebted, and contribute significantly to progress...
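As a sketch of how the "each ordered pair of axes" matrix above could be generated mechanically, here is a small Python fragment. It is not part of the original thread; the axis list is XPath 1.0's, and the node() tests, predicate indices, and "foo" ID are illustrative assumptions only.

    from itertools import product

    # The thirteen XPath 1.0 axes. Some ordered pairs (for example an
    # attribute step followed by a namespace step) are invalid on
    # purpose, so that error-catching gets exercised too.
    AXES = [
        "ancestor", "ancestor-or-self", "attribute", "child",
        "descendant", "descendant-or-self", "following",
        "following-sibling", "namespace", "parent", "preceding",
        "preceding-sibling", "self",
    ]

    def axis_pair_cases(start_id="foo"):
        """Yield one candidate XPointer body per ordered pair of axes."""
        for first, second in product(AXES, repeat=2):
            yield f'id("{start_id}")/{first}::node()[1]/{second}::node()[1]'

    for case in axis_pair_cases():
        print(case)  # 13 x 13 = 169 candidate tests

Each generated pair would then be instantiated three times, as suggested above: rigged to fail, returning a singleton, and returning multiples.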
> * * *
>
>[Steve DeRose responding to Michael Dyck:]
>
>At 6:06 AM +0000 9/19/00, Michael Dyck wrote:
> >One problem is: how do you know whether the XPath/XPointer implementation got the right answer? (How do you distinguish the right answer from a wrong answer? What is the form of the answers?)
>
>Well, the test designer could come up with doc on what the right answer is. A reasonable output would be to give a simply-formatted list of nodes and offsets, perhaps expressed via child sequences, that makes up the resulting location-set:
>
>  <results time="...">
>    <test>
>      <orig-query>xpointer(id(foo))</orig-query>
>      <loc-set>
>        <node path="/1/5/4/7"/>
>      </loc-set>
>    </test>
>  </results>
>
>something like that, with node and test of course repeatable. For ranges you'd want an alternative to node, such as
>
>  <range start-node="/1/4/2" start-offset="37"
>         end-node="/1/4/5" end-offset="1"/>
>
>And one could add the content of the node or range as literal content to make for more readability. A DTD like this would enable remote testing over the Web, as the results could be displayed with a little CSS in browsers, and the links could be made active to take you to the actual places.
>
> >-- For instance, if the implementation just prints out the results (their string value or the document chunk that they arise from), there may be wrong answers that look the same as the right answer. (E.g., the third occurrence of "foo" will look the same as the negative-third occurrence of it.) Plus, there won't be anything printed for points and collapsed ranges.
>
>Yes, it definitely has to give more than just the content.
>
> >-- Similarly if the implementation highlights the results in a visual display of the document. (Points which are distinct in the data model map to the same position in the document. How do you distinguish a non-range location from a range that happens to have the same extent?) Moreover, the testing couldn't be automated (although perhaps that isn't a big concern).
>
>Right; as in most things, looking at the rendered result is not enough to tell what you've got. So the test harness must report more than that. Reducing to child sequences and offsets seems the easiest way; this would not automatically catch the case where the implementation messes up child sequences in a way precisely like it also messes up something else, but that's pretty unlikely.
>
> >-- I think the only foolproof testing scheme would involve using a programmatic interface to the implementation. Unfortunately, you then have to either set up a different test harness for each implementation, or mandate an API that each implementation must fulfill [perhaps among others] for conformance-testing purposes, which is probably not within your terms of reference.
>
>Well, by definition there is no truly foolproof testing scheme, but ones like I've sketched above have worked pretty well in practice. Actually, a slight tweak to the approach above makes it easy to automate: you put the test to be executed in there (as it is), but then follow it with the *expected* result. Then a simple harness (or even diff) checks it for you.
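The "follow the test with the expected result" tweak is straightforward to mechanize. The following sketch is not from the thread: it assumes Python, and it assumes a hypothetical <expected> wrapper holding the reference <loc-set> inside each <test>, since the format sketched above records only actual results.

    import sys
    import xml.etree.ElementTree as ET

    def locations(loc_set):
        """Reduce a <loc-set> to comparable (tag, attributes) tuples,
        kept in document order."""
        return [(loc.tag, tuple(sorted(loc.attrib.items())))
                for loc in loc_set]

    def check(results_file):
        """Compare expected and actual loc-sets; report each mismatch."""
        failures = 0
        for test in ET.parse(results_file).getroot().iter("test"):
            query = (test.findtext("orig-query") or "").strip()
            expected = test.find("expected/loc-set")
            actual = test.find("loc-set")
            if expected is None or actual is None:
                continue  # skip tests that lack either loc-set
            if locations(expected) != locations(actual):
                failures += 1
                print(f"FAIL {query}: expected {locations(expected)}, "
                      f"got {locations(actual)}")
        return failures

    if __name__ == "__main__":
        sys.exit(1 if check(sys.argv[1]) else 0)

With the expected results embedded this way, a plain diff over two serialized results files would work just as well, which is the point of the tweak.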
> * * *
>
>[Michael Dyck responding to Steve DeRose:]
>
>Steve DeRose wrote:
> >
> > the test designer could come up with doc on what the right answer is. A reasonable output would be to give a simply-formatted list of nodes and offsets, perhaps expressed via child sequences, that makes up the resulting location-set:
> >
> >   <results time="...">
> >     <test>
> >       <orig-query>xpointer(id(foo))</orig-query>
> >       <loc-set>
> >         <node path="/1/5/4/7"/>
> >       </loc-set>
> >     </test>
> >   </results>
> >
> > something like that, with node and test of course repeatable. For ranges you'd want an alternative to node, such as
> >
> >   <range start-node="/1/4/2" start-offset="37"
> >          end-node="/1/4/5" end-offset="1"/>
> >
> > ...
> > Reducing to child sequences and offsets seems the easiest way; this would not automatically catch the case where the implementation messes up child sequences in a way precisely like it also messes up something else, but that's pretty unlikely.
>
>Yup, that's a pretty good idea. The only problem I can see is that a child sequence can only locate an element, so you'd have to use a different mechanism if you need to locate a non-element node. I suppose you could specify a little extension to child sequences for this. (Note that this extension would only be recognized/generated by the test harness, not the XPointer processor proper.)
>
> > >-- I think the only foolproof testing scheme would involve using a programmatic interface to the implementation. Unfortunately, you then have to either set up a different test harness for each implementation, or mandate an API that each implementation must fulfill [perhaps among others] for conformance-testing purposes, which is probably not within your terms of reference.
> >
> > Well, by definition there is no truly foolproof testing scheme;
>
>Quite so. What I *meant* was "the only foolproof way to know whether the XPointer processor has produced the correct result for a test", which I still think would require a programmatic interface. But your suggestion is pretty close on that score, and better on most other counts.
>
>One other thing: I'm wondering about the "subject document" (the document that provides the initial context for evaluating the XPointer). Would it be implicit, would the test-suite document specify it once, or might there be a different subject for each <test>? (Many could use the same one, of course, but there might not be a single subject that could serve for all tests.) If each <test> needs to specify a subject document and an XPointer, maybe the <orig-query> should be a URI reference. But what if someone wants to test their XPointer processor offline?
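For reference, a child sequence of the kind used in the <node path="/1/5/4/7"/> sketches above can be computed with a simple recursive walk. This sketch is not from the thread; it assumes Python, and it counts only element children, which is exactly the element-only limitation noted above for non-element nodes.

    import xml.etree.ElementTree as ET

    def child_sequence(root, target):
        """Return the child sequence locating `target` under `root`,
        or None if it is not in the tree. Only element children are
        counted, so text nodes, comments, and the like cannot be
        addressed -- the gap a test-harness extension would have to fill."""
        def walk(elem, path):
            if elem is target:
                return path
            for i, child in enumerate(elem, start=1):
                found = walk(child, path + f"/{i}")
                if found is not None:
                    return found
            return None
        return walk(root, "/1")  # the document element is step 1

    doc = ET.fromstring("<doc><a/><b><c/><c/></b></doc>")
    second_c = doc.find("b")[1]
    print(child_sequence(doc, second_c))  # prints /1/2/2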
Received on Monday, 9 October 2000 09:59:49 UTC