Re: Granular dereferencing ( prop by prop ) using REST + LinkedData; Ideas? from Alistair Miles on 2009-01-06 (public-lod@w3.org from January 2009)

From: Alistair Miles <alistair.miles@zoo.ox.ac.uk>
Date: Tue, 6 Jan 2009 07:43:27 +0000
To: Bernhard Schandl <bernhard.schandl@univie.ac.at>
Cc: Yves Raimond <yves.raimond@gmail.com>, Richard Cyganiak <richard@cyganiak.de>, Aldo Bucchi <aldo.bucchi@gmail.com>, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <20090106074327.GB8567@skiathos>

This is really a side note to this very interesting discussion, but I
thought I'd mention that in the FlyWeb Project, for each SPARQL
endpoint that we publish, we also set up a test harness which is
essentially a set of SPARQL ASK queries and an expectation for each
query (either true or false, obviously). 

We then use the test harness to "validate" whether the dataset
provided via the endpoint meets our expectations. As a side effect,
the tests (ASK queries plus expected results) also effectively define
the "application profile" of vocabularies used by the dataset.
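
To make the idea concrete, here is a minimal sketch of such a harness
in Python. It is not the FlyWeb code; the names AskTest, ask_over_http
and run_harness, and the example.org vocabulary in the sample tests,
are assumptions made up for illustration. It also assumes the endpoint
speaks the standard SPARQL protocol and can return ASK results in the
SPARQL JSON results format (a top-level "boolean" field).

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, List, NamedTuple


class AskTest(NamedTuple):
    """One harness entry: an ASK query and the answer we expect."""
    name: str
    query: str
    expected: bool


def ask_over_http(endpoint: str, query: str) -> bool:
    """Send an ASK query via HTTP GET and return its boolean answer.

    Assumes the endpoint supports application/sparql-results+json.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        return bool(json.load(response)["boolean"])


def run_harness(tests: List[AskTest], ask: Callable[[str], bool]) -> List[str]:
    """Run every test and return the names of those that failed."""
    failures = []
    for test in tests:
        if ask(test.query) != test.expected:
            failures.append(test.name)
    return failures


# Hypothetical tests over a made-up vocabulary (not the flyatlas tests).
EXAMPLE_TESTS = [
    AskTest(
        "dataset contains genes",
        "ASK { ?g a <http://example.org/vocab#Gene> }",
        True,
    ),
    AskTest(
        "no gene label is a non-literal",
        "ASK { ?g <http://example.org/vocab#label> ?l . "
        "FILTER(!isLiteral(?l)) }",
        False,
    ),
]
```

Against a live endpoint this could be called as
run_harness(EXAMPLE_TESTS, lambda q: ask_over_http(endpoint_url, q)),
where endpoint_url is whatever endpoint you want to check. Passing the
"ask" function in separately keeps the harness testable without a
network connection.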

I don't know how this could help automate the discovery of appropriate
services and/or additional documents to dereference, but thought it
might be interesting.

Our FlyWeb test harnesses are actually implemented in JavaScript, but
of course you could do it in any language. For an example, see [1],
which defines tests over the flyatlas dataset [2].

Cheers,

Alistair

[1] http://code.google.com/p/flyui/source/browse/trunk/data/test-flyatlas.js
[2] http://code.google.com/p/openflydata/wiki/Flyatlas

On Mon, Jan 05, 2009 at 12:19:15PM +0100, Bernhard Schandl wrote:
>
> Hi Yves,
>
>> Indeed, that's a bad example - replace it by "find here persons born
>> in NYC and their birth date". It is easy enough to find examples that
>> involve more than just one property in the target document, e.g. "Find
>> here female scientists born in NYC", "Find here the phone numbers of
>> the Tabulator's developers", "Find the start time of chords on that
>> audio signal", "Find here my latitude and longitude and the time at
>> which they were captured"...
>
> What is the advantage of publishing examples instead of just pointing to 
> the vocabularies used in the data sets? I think it might be difficult to 
> find representative examples for, let's say, dbpedia data: chances are 
> high that you miss some aspects.
>
> Also what is the point of providing explicit examples instead of just  
> ASKing the endpoint if it returns useful data?
>
> I think it might be sufficient to just publish which vocabularies are  
> used by a certain endpoint. Even DBpedia uses a restricted set of
> vocabularies, so if a client knew in advance which vocabularies
> are used, it could decide if the data returned from this endpoint is
> useful. This could be even more restricted to publishing "application  
> profiles" of vocabularies; i.e., subsets of the vocabularies that are  
> actually used within a dataset.
>
> Best regards, Bernhard
>
>

-- 
Alistair Miles
Senior Computing Officer
Image Bioinformatics Research Group
Department of Zoology
The Tinbergen Building
University of Oxford
South Parks Road
Oxford
OX1 3PS
United Kingdom
Web: http://purl.org/net/aliman
Email: alistair.miles@zoo.ox.ac.uk
Tel: +44 (0)1865 281993

Received on Tuesday, 6 January 2009 07:44:08 UTC