RE: Testing the API from Rotan Hanrahan on 2008-01-31 (public-ddwg@w3.org from January 2008)

From: Rotan Hanrahan <rotan.hanrahan@mobileaware.com>
Date: Thu, 31 Jan 2008 06:35:58 -0500
To: "Jo Rabin" <jrabin@mtld.mobi>, <public-ddwg@w3.org>
Message-ID: <D5306DC72D165F488F56A9E43F2045D3017848C0@FTO.mobileaware.com>
(Migrating to the public list)

I am thinking along the lines of common functional and unit tests that the various implementers can employ along the path towards creating their respective implementations. Whether such an approach is too much or too little for a conformance test remains to be seen, though the distinction you make is interesting. The question is whether it is enough for the implementation's signature to match the specification, or if conformance includes the behaviour when such an implementation is exercised.

A good test for any implementation would be robust, exercising each behaviour and expectation described within the specification. I would contend, however, that conformance should also cover the relationships (where practical) between the queries that are made and the data that is known to be available. In this sense, I would say that an implementation of the DDR API that exposed all of the listed normative methods as specified, yet returned nonsense data when it is known that specific data was placed in the back-end corpus, is not a conforming implementation. It may exhibit the correct signature characteristics, but it certainly doesn't exhibit the expected behaviour.

So, if I place the value 240 into a data collection to represent the screen width of a device whose evidence is also predetermined, and I subsequently retrieve the screen width using that very same evidence, I expect not only that an integer is returned, but also that the value of that integer is 240.
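
As a rough sketch of that expectation (not the DDR API itself): the class names, the method names and the "screenWidth" property name below are all invented for illustration, and the repository is just an in-memory dictionary. The point is only to separate the signature check (an integer comes back) from the behavioural check (the integer is the one that was stored).

```python
class InMemoryDDR:
    """Hypothetical stand-in for a DDR implementation (assumption, not the real API)."""

    def __init__(self):
        self._corpus = {}

    def store(self, evidence, property_name, value):
        # Seed the back-end corpus with a known value for known evidence.
        self._corpus[(evidence, property_name)] = value

    def get_property_value(self, evidence, property_name):
        return self._corpus[(evidence, property_name)]


class NonsenseDDR(InMemoryDDR):
    """Right signature, wrong behaviour: ignores the corpus entirely."""

    def get_property_value(self, evidence, property_name):
        return 99999


def check_screen_width(ddr, evidence, expected):
    """Return (signature_ok, behaviour_ok) for one lookup."""
    value = ddr.get_property_value(evidence, "screenWidth")
    signature_ok = isinstance(value, int)
    behaviour_ok = signature_ok and value == expected
    return signature_ok, behaviour_ok


# Predetermined (fake) evidence for an invented device.
EVIDENCE = "User-Agent: VirtualDevice/1.0"

good = InMemoryDDR()
good.store(EVIDENCE, "screenWidth", 240)

bad = NonsenseDDR()
bad.store(EVIDENCE, "screenWidth", 240)

good_result = check_screen_width(good, EVIDENCE, 240)
bad_result = check_screen_width(bad, EVIDENCE, 240)
```

Here the second implementation would pass a signature-only check and fail the behavioural one, which is the distinction under discussion.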

However, if we determine that the conformance requirements of the DDR API do not have to cover the above case (i.e. behavioural aspects), then I believe we need to separately describe the sensible behaviour somewhere, so that an implementation can be declared both "conformant" and "sensible". (I use the term 'sensible' in a loose fashion, and would be happy for someone else to formalise it.)

I have already requested some guidance via the Team to identify the appropriate manner in which to capture conformance within the specification, and will also be looking to existing specifications for examples. The ones that you identify are high on my list.

To take an example from DCCI as you suggested: In DCCI, the "value" attribute in a provider of the DCCIProperty interface holds the value of a property whose type is given by the "propertyType" attribute, which itself denotes the type of client property being referenced. A "presentation property" (e.g. some information associated with the screen) is one such property that may be supported in an implementation. Furthermore, one may search for such properties using the hasProperty() method by providing the namespace and name (via namespaceURI and propertyName parameters) and the method is required to return information as to whether the property was present.

These characteristics and behaviour of the DCCI interface (and many more such characteristics and behaviours) are all normatively defined in the DCCI specification. Furthermore, the *first* conformance requirement specified in section 7.1 of the DCCI specification says: "A conforming DCCI implementation must implement all the normative sections of this document." It follows that the conformance requirements of DCCI include both characteristics and behaviours of the constituent normative interfaces.

I approve of this approach, and would be happy to support the same approach in the conformance requirements of the DDR API, and in this regard I believe (some of) the functional/behavioural tests I proposed could be appropriate to support the conformance requirements of the DDR API, where they can reference normative characteristics/behaviours.

Regarding item (2) on your list, I fully agree. I do not believe we can say anything about the completeness or accuracy of the data behind the interface. For this reason, I proposed that tests of behaviour should be based on "virtual devices" and their "invented characteristics" and corresponding "predetermined yet fake evidence". At least that way the tests are independent of any futile attempts to have complete and accurate data.
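
To make the "virtual devices" idea a little more concrete, here is a minimal sketch of what such a fixture and checker might look like. Everything in it is an assumption made for illustration: the device names, the evidence strings, the property names and the `lookup(evidence, property_name)` calling convention are invented, and a missing property is simply represented as `None`.

```python
from typing import Callable, List, Optional

# Invented devices with invented characteristics and fake evidence.
# One property is deliberately left unknown (None) to mimic an
# incomplete repository.
VIRTUAL_DEVICES = [
    {
        "evidence": "User-Agent: VirtualPhone/1.0",
        "properties": {"screenWidth": 240, "screenHeight": 320},
    },
    {
        "evidence": "User-Agent: VirtualTablet/2.0",
        "properties": {"screenWidth": 800, "screenHeight": None},
    },
]

Lookup = Callable[[str, str], Optional[int]]


def run_behaviour_tests(lookup: Lookup) -> List[str]:
    """Compare every expected value with what the implementation returns."""
    failures = []
    for device in VIRTUAL_DEVICES:
        for name, expected in device["properties"].items():
            actual = lookup(device["evidence"], name)
            if actual != expected:
                failures.append(
                    f"{device['evidence']} {name}: expected {expected!r}, got {actual!r}"
                )
    return failures


def faithful_lookup(evidence: str, property_name: str) -> Optional[int]:
    """Hypothetical implementation that was loaded with the fixture data."""
    for device in VIRTUAL_DEVICES:
        if device["evidence"] == evidence:
            return device["properties"].get(property_name)
    return None


def nonsense_lookup(evidence: str, property_name: str) -> Optional[int]:
    """Hypothetical implementation that ignores its data."""
    return -1


faithful_failures = run_behaviour_tests(faithful_lookup)
nonsense_failures = run_behaviour_tests(nonsense_lookup)
```

Because the expected values come from the fixture and not from any real device database, the outcome says nothing about the completeness or accuracy of production data.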

---Rotan.



-----Original Message-----
From: member-ddwg-request@w3.org [mailto:member-ddwg-request@w3.org] On Behalf Of Jo Rabin
Sent: 30 January 2008 19:13
To: DDWG
Subject: RE: Testing the API


Note for tracker - this is ACTION-82.

Rotan, my concern is that this is a functional test of the API, not a conformance statement.

We need to be careful to distinguish the following:

1. conformance to the API spec
2. completeness and accuracy of the database, which in practice is known to be impossible to make 100% complete and accurate
3. accuracy of the operation of the code (measured by precision and recall) - i.e. given a known corpus of data, how likely is the interface to return the correct data and only the correct data given a query containing sufficient information in principle to find the relevant data and return it correctly.

This last is what you are proposing and doesn't to my mind constitute a conformance test - especially since what it is primarily testing is the power of the recognition algorithm.

This is a bit like testing a search engine by demanding 100% precision and recall for a particular query that contains known keywords, known to be contained in a couple of documents with which a test corpus has been seeded. That is not a test of a search engine's conformant implementation of a search API.
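
For reference, precision and recall for a single query can be computed as below. This is a generic sketch, not anything taken from the DDR work: the function name and the example identifiers are invented, and treating an empty set as scoring 1.0 is just one possible convention.

```python
from typing import Set, Tuple


def precision_recall(returned: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision: share of returned items that are correct.
    Recall: share of correct items that were returned."""
    true_positives = len(returned & relevant)
    # Assumed convention: an empty denominator scores 1.0.
    precision = true_positives / len(returned) if returned else 1.0
    recall = true_positives / len(relevant) if relevant else 1.0
    return precision, recall


# Invented example: the corpus was seeded with four matching records,
# and the query returned three records, two of which are correct.
relevant = {"rec-a", "rec-b", "rec-d", "rec-e"}
returned = {"rec-a", "rec-b", "rec-c"}

precision, recall = precision_recall(returned, relevant)
```

In this example precision is 2/3 and recall is 1/2; a requirement of 100% on both would mean the returned set is exactly the relevant set.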

I'd prefer to see as the conformance criterion something a bit more along the lines of what is suggested in DCCI:

http://www.w3.org/TR/DPF/#iddiv296456272

which (correctly in my view) describes what has been implemented but says nothing at all about whether or not the implementation returns complete nonsense.

In terms of a conformance test suite, we might publish a piece of code that exercises all the available methods and determines that exceptions are/are not thrown in certain cases where a priori they are defined as being thrown. 

So for example:

1. Initialise DDR with empty database
2. Request the property "foo" and check that the EXCEPTION_COMPONENT_NOT_FOUND is thrown

This is tricky though, as if you load an empty database then it is an implementation issue as to whether the EXCEPTION_COMPONENT_NOT_FOUND or the UNKNOWN_PROPERTY_NAME exception is thrown.

(I use these exception names for illustration only - I didn't cross-refer to what we actually decided they were called, and actually, do we have an exception identifying that recognition has failed?)
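
A checker for the two steps above might look roughly like this. It is a sketch only: the exception classes reuse the illustrative names from this message, not names from the specification, and the `get_property_value` method and the two toy implementations are invented. Since either exception may legitimately be thrown for an empty database, the check accepts both.

```python
class ComponentNotFound(Exception):
    """Illustrative stand-in for EXCEPTION_COMPONENT_NOT_FOUND."""


class UnknownPropertyName(Exception):
    """Illustrative stand-in for UNKNOWN_PROPERTY_NAME."""


class EmptyDDR:
    """Hypothetical implementation initialised with an empty database."""

    def get_property_value(self, evidence, property_name):
        raise UnknownPropertyName(property_name)


class SilentDDR:
    """Hypothetical implementation that returns a value when it should throw."""

    def get_property_value(self, evidence, property_name):
        return 0


def throws_expected_exception(ddr) -> bool:
    """Request the property 'foo' and report whether an accepted exception was raised."""
    try:
        ddr.get_property_value("any evidence", "foo")
    except (ComponentNotFound, UnknownPropertyName):
        return True
    return False


empty_ok = throws_expected_exception(EmptyDDR())
silent_ok = throws_expected_exception(SilentDDR())
```

Accepting either exception keeps the check independent of that implementation choice, at the cost of making it a weaker test.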

Hence, I am not sure that this is all that worthwhile, but I wouldn't want to stop members trying to develop it if they felt moved to do so. I would not want the document to say this was the conformance criterion. 

Any such checker would be the implementation of a conformance statement rather than a requirement of the conformance statement. 

(and PS this correspondence should be on the public list, I think)

Cheers
Jo



> -----Original Message-----
> From: member-ddwg-request@w3.org [mailto:member-ddwg-request@w3.org] On
> Behalf Of José Manuel Cantera Fonseca
> Sent: 30 January 2008 15:26
> To: Sullivan, Bryan
> Cc: DDWG
> Subject: Re: Testing the API
> 
> 
> +1 too!!
> 
> Sullivan, Bryan wrote:
> > Rotan,
> > Sounds good.
> >
> > *Bryan*
> >
> > ------------------------------------------------------------------------
> > *From:* member-ddwg-request@w3.org [mailto:member-ddwg-request@w3.org]
> > *On Behalf Of *Rotan Hanrahan
> > *Sent:* Wednesday, January 30, 2008 3:48 AM
> > *To:* DDWG
> > *Subject:* Testing the API
> >
> > One of the things we mentioned in the API-fest was how to test if
> > implementations were conforming to the specification. Here is an idea
> > I put on the table:
> >
> > We invent a virtual device or two, and give them properties according
> > to the DDR Core Vocabulary. We leave one or two properties blank, to
> > represent the reality of real-life repositories which are naturally
> > incomplete.
> >
> > We then devise a set of content adaptations that exploit the various
> > known properties, defaulting in the case of unavailable data, and
> > which exercise every one of the methods in the API.
> >
> > Finally we apply these adaptations to specified input content, and
> > note our expected output on the virtual device(s).
> >
> > This is now a test of an implementation, which we make normative.
> >
> > The implementation under test must include in its repository the
> > information for our virtual device(s). We would then present the
> > implementation with the evidence corresponding to our virtual
> > device(s) and confirm that the adapted results correspond with our
> > expected results.
> >
> > Does this sound like a workable approach, or is something more
> > sophisticated required?
> >
> > ---Rotan.
> >
> 
>
Received on Thursday, 31 January 2008 11:38:04 UTC