Re: Testable assertion tagging for W3C specifications from David Marston/Cambridge/IBM on 2002-05-29 (www-qa@w3.org from May 2002)

From: David Marston/Cambridge/IBM <david_marston@us.ibm.com>
Date: Wed, 29 May 2002 17:02:06 -0400
To: www-qa@w3.org
Cc: "Scott Boag/Cambridge/IBM" <scott_boag@us.ibm.com>
Message-ID: <OF772CF381.3228364A-ON85256BC8.00701252@lotus.com>

In the dialog below, > is Alex Rousskov.

>If the addressing scheme can isolate/address any sequential piece of
>text (or XML fragment?), then it can isolate/address those productions,
>assertions, etc.

I want it to address them in a way that recognizes that they *are*
productions, assertions, or whatever. Based on your verbiage above,
I gather that you are referring to a text range, delimited at both
start and end. Now, let's look at typical normative source under the
current practice. Quoting section 7.7 of XSLT:

<li>
<p>When <code>level="any"</code>, it constructs a list of length
one containing the number of nodes that match the <code>count</code>
pattern and belong to the set containing the current node and all
nodes at any level of the document that are before the current node in
document order, excluding any namespace and attribute nodes (in other
words the union of the members of the <code>preceding</code> and
<code>ancestor-or-self</code> axes). If the <code>from</code>
attribute is specified, then only nodes after the first node before
the current node that match the <code>from</code> pattern are
considered.</p>
</li>

You have to go outside the enclosing <ul> and read prose to realize
that all the above is only applicable *if* there is no "value"
attribute set, so no sentence in the above can be quoted standalone.
But you can't even quote sentences without resorting to substring()
and relying on non-varying character counts. What you can isolate
is a whole text node or a <code> element, so you can get ungainly
sequences with dangling sentence fragments. For example, to cite
the interaction of level="any" with a specified "from" attribute,
without just quoting the entire <p>, you get:
" axes). If the <code>from</code>
attribute is specified, then only nodes after the first node before
the current node that match the <code>from</code> pattern are
considered."
The above is 5 sequential children of the <p>: text()[5]
through text()[7], plus the two <code> elements between them.
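
To make the node counting concrete, here is a minimal sketch in Python
using the standard xml.dom.minidom module. It parses the <p> quoted
above and pulls out everything from the 5th through the 7th text node.
The helper name text_range is hypothetical, invented for this
illustration; it is not part of any addressing scheme proposed in this
thread.

```python
from xml.dom import minidom

# The <p> from XSLT section 7.7, as quoted earlier in this message.
SOURCE = (
    '<p>When <code>level="any"</code>, it constructs a list of length\n'
    'one containing the number of nodes that match the <code>count</code>\n'
    'pattern and belong to the set containing the current node and all\n'
    'nodes at any level of the document that are before the current node in\n'
    'document order, excluding any namespace and attribute nodes (in other\n'
    'words the union of the members of the <code>preceding</code> and\n'
    '<code>ancestor-or-self</code> axes). If the <code>from</code>\n'
    'attribute is specified, then only nodes after the first node before\n'
    'the current node that match the <code>from</code> pattern are\n'
    'considered.</p>'
)


def text_range(parent, first, last):
    """Return the children of `parent` from its first-th text node through
    its last-th text node (1-based, like XPath text()[n]), inclusive.

    Hypothetical helper for illustration only.
    """
    children = list(parent.childNodes)
    # Positions, within the child list, of the text-node children.
    text_positions = [
        i for i, node in enumerate(children)
        if node.nodeType == node.TEXT_NODE
    ]
    start = text_positions[first - 1]
    end = text_positions[last - 1]
    return children[start:end + 1]


p = minidom.parseString(SOURCE).documentElement
nodes = text_range(p, 5, 7)
fragment = "".join(node.toxml() for node in nodes)

print(len(nodes))   # number of sibling nodes in the range
print(fragment)     # the dangling fragment, starting mid-sentence
```

The range comes back as a run of sibling nodes that begins in the
middle of a sentence, and nothing in the result says that it is a
statement about the "from" attribute.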

>However, the addressing scheme in question does not need to know
>what those [non-universal] markers are in advance.

As I mentioned earlier in this thread, this is about more than just
a one-way link. We should try to identify the passage as something
higher-level than just text()[5] through text()[7]. We should be
able to say that we are pointing at the sentence that describes
the interaction of two attributes. It is simply not worth the
effort to copiously tag the tests if the tags say nothing about the
substance of the target text.

I mentioned earlier the idea of discerning that two tests address the
same interaction of attributes. If that is too adventurous for now, would
you at least agree that we ought to be able to take a collection of
tests, follow their citation links, and see how much of the verbiage
is covered by the tests? (Picture a Recommendation with all "covered"
sentences having a light green background. You could eye-scan the Rec,
applying human intelligence, to see which of the uncovered portions
look like testable sentences.)
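
As a rough sketch of that coverage check, assume each citable sentence
in a Rec carries an identifier and each test lists the identifiers it
cites. Every name below (the sentence ids, the test names, and the
coverage function) is hypothetical and made up for illustration; none
of it comes from an existing test suite or markup scheme.

```python
from collections import defaultdict

# Hypothetical ids for citable sentences in a Rec, in document order.
SENTENCES = [
    "s7.7-level-single",
    "s7.7-level-multiple",
    "s7.7-level-any",
    "s7.7-level-any-from",
]

# Hypothetical test cases, each mapped to the sentence ids it cites.
CITATIONS = {
    "numbering01": ["s7.7-level-any"],
    "numbering02": ["s7.7-level-any-from"],
    "numbering03": ["s7.7-level-any-from"],
}


def coverage(sentences, citations):
    """Follow each test's citations and report which sentences are
    covered, which are not, and which tests point at each sentence."""
    tests_by_sentence = defaultdict(list)
    for test in sorted(citations):
        for sentence_id in citations[test]:
            tests_by_sentence[sentence_id].append(test)
    covered = [s for s in sentences if s in tests_by_sentence]
    uncovered = [s for s in sentences if s not in tests_by_sentence]
    return covered, uncovered, dict(tests_by_sentence)


covered, uncovered, tests_by_sentence = coverage(SENTENCES, CITATIONS)

print("covered:  ", covered)     # candidates for the green background
print("uncovered:", uncovered)   # left for a human to eye-scan
for sentence_id, tests in tests_by_sentence.items():
    if len(tests) > 1:
        print(sentence_id, "is cited by more than one test:", tests)
```

The same table also answers the more adventurous question: two tests
that cite the same sentence id are, by construction, addressing the
same passage.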

>If QAWG is inclined to develop and support a few nice DTDs with
>convenient markers, that's fine,...

<rant>DTDs are history! No new document architecture should use them!
</rant>

>My understanding is that no tag taxonomy is needed to address pieces
>of text or XML. We are not talking about the process of automatically
>locating assertions in a random Recommendation, are we? While fun,
>that task seems to have little utility.

This is the pivot of our disagreement. To me, "addressing pieces of
text" is the activity that has little utility. We *may* want to locate
assertions in any Rec, but (as you say) we are more likely to want to
locate assertions in designated Recs. It's more than just test suites
in the overall plan, though. We already have some Recs citing other
Recs down to the level of individual productions, which is beneficial.
Recs should be citable at the granularity of any sentence, equation,
or similar construct that normatively expresses a complete thought.
Test cases would make use of that capability, both so that you can
understand what a test case is doing and so that you can assess
coverage.

Scott Boag and I are discussing ways to advance to a better state of
citability, which has several uses in developing the Rec as well as
the test cases.
.................David Marston
Received on Wednesday, 29 May 2002 17:25:18 UTC