Re: SPARQL Query Results XML Format from Dave Beckett on 2005-06-15 (public-rdf-dawg-comments@w3.org from June 2005)

From: Dave Beckett <dave.beckett@bristol.ac.uk>
Date: Wed, 15 Jun 2005 09:53:10 +0100
To: Ian Harrison <harrison@AI.SRI.COM>
Cc: public-rdf-dawg-comments@w3.org
Message-ID: <20050615095310.60cae59b@hoth.ilrt.bris.ac.uk>

On Tue, 14 Jun 2005 11:59:03 -0400, Ian Harrison <harrison@AI.SRI.COM> wrote:

> 
> Have you considered having more meta-data in the head element, e.g. to 
> record the query (or ref to query), plus the dataset (or ref to dataset) 
> queried? Otherwise there doesn't seem to be any information identifying 
> where the result bindings came from.

The WG hasn't considered this, as far as I recall.  Do you mean identifying
the query and dataset (say by URIs) or including it, such as a literal
string for the query, something else for the dataset?

> The motivation for this comes from work we're doing. In that work we do 
> graph matching over a dataset, which (although currently stored in a 
> relational database)  is a set of triples, such as (memberOfGroup 
> Person1 Group1). We'd like to be able to store the query results for 
> later retrieval and want to have provenance of the results, for several 
> reasons
> 
> 1) To know which graph matching application made the match, so we can 
> compare results between different pattern matchers

That's recording the application, not the graph (dataset) or query - is
that a different use case?

> 2) To know which dataset we matched against, so that if the dataset 
> changes then we'd be able to understand why results might differ 
> (ideally we'd want provenance not just at the high-level name/id of 
> dataset, but at the individual data triple level too).

From what you said earlier, to you a dataset is a reference to a database
which is a single graph.  SPARQL's dataset is a slightly different and
wider concept.

> 3) To know what pattern matching parameters were set. These include 
> obvious things like max number of results to return, or a time point/no 
> of cpu cycles to expend on the query. This also includes, for inexact or 
> approximate graph pattern matchers, things like maximum cost, which are 
> used if you have something like an ontological edit distance matching 
> algorithm.

This sounds like a richer description requirement.  Could this be
met by external metadata, pointed to from the query results by a URI?
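
A minimal sketch of that idea, under stated assumptions: the link element
in the head, the namespace, and the example.org URIs below are all
hypothetical and are not taken from this thread. The results document
carries only a URI, and a consumer pulls that URI out of the head to fetch
the richer description (application, dataset, matcher parameters)
separately.

```python
import xml.etree.ElementTree as ET

# Assumed namespace; treat it as a placeholder for whatever the format uses.
NS = "http://www.w3.org/2005/sparql-results#"

# Hypothetical results document whose head points at external metadata.
RESULTS = f"""<sparql xmlns="{NS}">
  <head>
    <variable name="person"/>
    <variable name="group"/>
    <!-- hypothetical pointer to external provenance metadata -->
    <link href="http://example.org/runs/42/metadata.rdf"/>
  </head>
  <results>
    <result>
      <binding name="person"><uri>http://example.org/Person1</uri></binding>
      <binding name="group"><uri>http://example.org/Group1</uri></binding>
    </result>
  </results>
</sparql>
"""


def metadata_links(xml_text):
    """Return the href of every link element found in the head."""
    root = ET.fromstring(xml_text)
    path = f"{{{NS}}}head/{{{NS}}}link"
    return [el.get("href") for el in root.findall(path)]


print(metadata_links(RESULTS))
```

Keeping the description external means the results format itself stays
small, while the metadata document can be as rich as the application needs.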

> 4) Results from one application might be used as (part of) a query to 
> another application. This could occur if you had a serial workflow, when 
> each application had a specialised capability (e.g. group finding), 
> whereas the next application was an event matcher (with temporal 
> constraint checking). Alternatively you might have a parallel workflow, 
> where you divide the problem in parts (sub-graphs), task to separate 
> applications, and then merge the results all back together.

Thanks for your feedback.

Dave

Received on Wednesday, 15 June 2005 08:54:24 UTC