Re: [Testdev] Testing the Data Model from Shane McCarron on 2016-04-22 (public-annotation@w3.org from April 2016)

From: Shane McCarron <shane@spec-ops.io>
Date: Fri, 22 Apr 2016 13:55:31 -0500
To: Discussions about Test Development <testdev@lists.spec-ops.io>, W3C Public Annotation List <public-annotation@w3.org>
Message-ID: <CAJdbnOCSibKCmsb=8gmNvZyZ_8xqqCJKzzeSEa0WyOr+Nh4wtw@mail.gmail.com>
Yes.  And hopefully the Schema / JSON-LD context already enforces these
prose assertions anyway.  The prose is mostly about data types and which
attributes should be present when this or that other attribute is present.


Note: There is another approach that is commonly used.  It feels a little
circular, but you generate examples by walking the grammar - then you feed
all the examples into validation tools.  In a sufficiently complex grammar
(such as yours where annotations can nest) the combinatoric explosion from
this is fairly dramatic.  You get A LOT of "tests".  I personally find this
to give a false sense of confidence. Just because millions of tests pass
doesn't mean that your implementation is solid.  It just means you didn't
think very hard about the edge cases.
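
To make the explosion concrete, here is a minimal sketch in Python.  It
assumes a toy grammar invented for this note, not the real Data Model: a
body is "text", "uri", or a nested annotation, and a target is "uri" or a
nested annotation.  The names LEAF_BODIES, LEAF_TARGETS, and annotations()
are hypothetical.

```python
from itertools import product

# Hypothetical leaf values; the real Data Model has far more variety.
LEAF_BODIES = ["text", "uri"]
LEAF_TARGETS = ["uri"]


def annotations(max_nesting):
    """Return every annotation of the toy grammar nested at most max_nesting deep."""
    if max_nesting < 0:
        return []
    # Anything generated one level down may itself appear as a body or target.
    nested = annotations(max_nesting - 1)
    bodies = LEAF_BODIES + nested
    targets = LEAF_TARGETS + nested
    return [{"body": b, "target": t} for b, t in product(bodies, targets)]


if __name__ == "__main__":
    for depth in range(4):
        print(depth, len(annotations(depth)))
```

With only three leaf values this generates 2, 12, 182, and 33672 examples
for nesting depths 0 through 3, and one more level is already over a
billion.  None of those counts says anything about whether the interesting
edge cases were covered.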

Anyway, I think identifying the assertions in the document and then
ensuring there are declarative test files for each of them will go a long
way toward exercising the Data Model and its implementation(s) when run
against JSON Schema and JSON-LD validators.
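
As a rough illustration of what "declarative" could mean here, the sketch
below keeps one assertion as data and runs it with a tiny checker.  Every
detail is an assumption for illustration: the test file layout, the
assertion wording, and the run_test() helper are invented, and the checker
is a hand-rolled stand-in that understands only "required" and "types",
not a real JSON Schema or JSON-LD validator.

```python
import json

# Hypothetical declarative test file: one prose assertion, expressed as data.
TEST_FILE = json.loads("""
{
  "assertion": "An annotation has a target (illustrative wording)",
  "required": ["target"],
  "types": {"target": ["string", "object", "array"]}
}
""")

# Map the JSON type names used in the test file to Python types.
JSON_TYPES = {"string": str, "object": dict, "array": list}


def run_test(test, document):
    """Return a list of failure messages; an empty list means the assertion holds."""
    failures = []
    for key in test.get("required", []):
        if key not in document:
            failures.append(f"missing required key: {key}")
    for key, allowed in test.get("types", {}).items():
        if key in document:
            allowed_types = tuple(JSON_TYPES[name] for name in allowed)
            if not isinstance(document[key], allowed_types):
                failures.append(f"wrong type for key: {key}")
    return failures


if __name__ == "__main__":
    good = {"id": "http://example.org/anno1", "target": "http://example.org/page1"}
    bad = {"id": "http://example.org/anno2"}
    print(run_test(TEST_FILE, good))
    print(run_test(TEST_FILE, bad))
```

In practice the test files would point at the real schema and be fed to
existing validators; the point is only that each assertion lives in its
own small file that a human can read and a tool can run.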

P.S. It is notionally possible to annotate your spec source so that these
assertions are automatically extractable.  This has some obvious benefits,
but the hour is late and I am not suggesting we try to get that level of
infrastructure in place for the Web Annotation Data Model.  We don't have
time to do it right.


On Fri, Apr 22, 2016 at 1:39 PM, Randall Leeds <randall@bleeds.info> wrote:

> This is very clarifying, thank you.
>
> I wasn't sure what was meant by testing a data model, but it makes sense.
> If I understand you correctly, calling out the testable assertions in the
> prose (we discussed on the call today) and then describing those with schema
> definitions, we should then be able to assert that our examples are
> valid, and therefore at least that the spec is internally consistent.
>
> At least, that's what I'm getting from this.
>
> On Fri, Apr 22, 2016, 10:06 Shane McCarron <shane@spec-ops.io> wrote:
>
>> (CCing the Spec-Ops testdev mailing list)
>>
>> In the Web Annotation meeting today Doug touched on something important.
>> Apologies that I didn't follow up on it at the time.  Doug mentioned that
>> there are other ways of testing Data Models / grammars.  This is actually
>> pretty important, and might help us focus this effort.
>>
>> A little background.  The W3C develops a number of different types of
>> Recommendations.  You might divide these into "protocol", "grammar", and
>> "user agent".  These things are all part of the Web Platform.  When you
>> become a Candidate Recommendation at the W3C, the criteria for exiting
>> Candidate status include having the features of your Recommendation
>> supported by at least two implementations.  In that context, testing
>> protocols is well understood.  Testing user agent behavior is also well
>> understood.  Testing grammars?  Not so much.
>>
>> (For purposes of this email, let's pretend that a data model is just a
>> special case of a grammar.)
>>
>> What does it mean to have an "implementation" of a grammar? Arguably, the
>> "implementation" of a grammar is its expression in a meta grammar.  And it
>> is "implemented" by the working group.  In this case, the real test then is
>> whether that "implementation" is correct, and whether it can be consumed by
>> tools that process such a meta grammar.
>>
>> So, in the case of Web Annotation, you have a data model that is
>> expressed in prose, with a context defined in JSON-LD and (potentially) a
>> definition in JSON Schema.  So, one thing we could consider for CR exit
>> criteria is to have tests that verify the implementation of the grammar
>> adheres to the constraints in the prose PLUS verification that a set of
>> sample data files (the examples from the spec) were able to be validated
>> using the implementation by multiple tools that support JSON-LD / JSON
>> Schema validation.
>>
>> This, I think, is what Doug was trying to get at. We don't NEED to take
>> the output of real clients and ensure that they generate output that
>> conforms to the Data Model (unless we define user agent conformance
>> criteria).  We need to prove that the Data Model is complete, that its
>> definition is well formed (compiles/is parseable), and that it works.
>>
>> So, we should consider whether there is any value in going through the
>> effort of instrumenting the tests so that it is even possible to collect
>> output from clients and evaluate it.  It *should* be sufficient to
>> demonstrate that the Data Model works and that all of the types of
>> client-generated output can be validated against it.  And we can absolutely
>> do this sort of testing within the context of the Web Platform Tests (WPT).
>>
>> FWIW this is exactly what we did with XHTML Modularization many years
>> ago.  It was implemented in XML DTD and XML Schema.  We ensured that those
>> implementations were consumable by popular commercial and free tools that
>> did validation using DTD and Schema.  We also showed that there were
>> multiple independent markup languages that were developed by groups within
>> and outside of the W3C that used the modules.  That was sufficient to
>> satisfy the Director and exit CR.
>>
>> --
>> Shane McCarron
>> Projects Manager, Spec-Ops
>> _______________________________________________
>> Testdev mailing list
>> Testdev@lists.spec-ops.io
>> http://lists.spec-ops.io/listinfo.cgi/testdev-spec-ops.io
>>
>


-- 
Shane McCarron
Projects Manager, Spec-Ops
Received on Friday, 22 April 2016 18:56:26 UTC