- From: Alistair Miles <alistair.miles@zoo.ox.ac.uk>
- Date: Tue, 6 Jan 2009 07:43:27 +0000
- To: Bernhard Schandl <bernhard.schandl@univie.ac.at>
- Cc: Yves Raimond <yves.raimond@gmail.com>, Richard Cyganiak <richard@cyganiak.de>, Aldo Bucchi <aldo.bucchi@gmail.com>, "public-lod@w3.org" <public-lod@w3.org>
This is really a side note to this very interesting discussion, but I thought I'd mention that in the FlyWeb Project, for each SPARQL endpoint that we publish, we also set up a test harness which is essentially a set of SPARQL ASK queries and an expectation for each query (either true or false, obviously). We then use the test harness to "validate" whether the dataset provided via the endpoint meets our expectations.

However, the tests (ASK queries plus expected results) effectively define the "application profile" of vocabularies used by the dataset. I don't know how this could help automate the discovery of appropriate services and/or additional documents to dereference, but thought it might be interesting.

Our flyweb test harnesses are actually implemented in javascript, but of course you could do it in any language. For an example, see [1] which defines tests over the flyatlas dataset [2].

Cheers,

Alistair

[1] http://code.google.com/p/flyui/source/browse/trunk/data/test-flyatlas.js
[2] http://code.google.com/p/openflydata/wiki/Flyatlas

On Mon, Jan 05, 2009 at 12:19:15PM +0100, Bernhard Schandl wrote:
>
> Hi Yves,
>
>> Indeed, that's a bad example - replace it by "find here persons born
>> in NYC and their birth date". It is easy enough to find examples that
>> involve more than just one property in the target document, e.g. "Find
>> here female scientists born in NYC", "Find here the phone numbers of
>> the Tabulator's developers", "Find the start time of chords on that
>> audio signal", "Find here my latitude and longitude and the time at
>> which they were captured"...
>
> What is the advantage of publishing examples instead of just pointing to
> the vocabularies used in the data sets? I think it might be difficult to
> find representative examples for, let's say, dbpedia data: chances are
> high that you miss some aspects.
>
> Also what is the point of providing explicit examples instead of just
> ASKing the endpoint if it returns useful data?
>
> I think it might be sufficient to just publish which vocabularies are
> used by a certain endpoint. Even dbpedia uses a restricted set of
> vocabularies, so if a client would know in advance which vocabularies
> are used, it could decide if the data returned from this endpoint is
> useful. This could be even more restricted to publishing "application
> profiles" of vocabularies; i.e., subsets of the vocabularies that are
> actually used within a dataset.
>
> Best regards,
> Bernhard
>
>

--
Alistair Miles
Senior Computing Officer
Image Bioinformatics Research Group
Department of Zoology
The Tinbergen Building
University of Oxford
South Parks Road
Oxford
OX1 3PS
United Kingdom
Web: http://purl.org/net/aliman
Email: alistair.miles@zoo.ox.ac.uk
Tel: +44 (0)1865 281993
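
For readers who want a concrete picture of the kind of harness described in the message above (a set of SPARQL ASK queries, each paired with an expected true or false), here is a minimal sketch. It is not the FlyWeb code linked at [1]: the `ex:` vocabulary, the two queries, and the names `tests`, `runAskTests` and `stubAsk` are all hypothetical, and the `ask` function is passed in so that the sketch runs without a live endpoint. A real harness would presumably send each query to the SPARQL endpoint and read back the boolean result, most likely asynchronously.

```javascript
// Hypothetical test cases: each pairs a SPARQL ASK query with an expected boolean.
// The ex: vocabulary and both queries are invented for illustration only.
const tests = [
  {
    name: "dataset contains at least one gene",
    query: "PREFIX ex: <http://example.org/vocab#> ASK { ?g a ex:Gene }",
    expected: true,
  },
  {
    name: "no gene is missing a label",
    query:
      "PREFIX ex: <http://example.org/vocab#> " +
      "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
      "ASK { ?g a ex:Gene . OPTIONAL { ?g rdfs:label ?l } FILTER (!bound(?l)) }",
    expected: false,
  },
];

// Run every test through the supplied `ask` function (query string -> boolean)
// and record whether the actual answer matches the expectation.
function runAskTests(tests, ask) {
  return tests.map((t) => {
    const actual = ask(t.query);
    return {
      name: t.name,
      expected: t.expected,
      actual: actual,
      passed: actual === t.expected,
    };
  });
}

// Stand-in for a real endpoint (an assumption made so the sketch is runnable):
// it answers false to any query containing "!bound" and true to everything else.
const stubAsk = (query) => !query.includes("!bound");

const results = runAskTests(tests, stubAsk);
for (const r of results) {
  console.log((r.passed ? "PASS" : "FAIL") + ": " + r.name);
}
```

Read together, the list of queries and expectations doubles as a rough statement of which classes and properties a client can rely on in the dataset, which is the sense in which the message says the tests define an "application profile".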
Received on Tuesday, 6 January 2009 07:44:08 UTC