RE: RDF query testcase requirements: IRC chat? from Geoff Chappell on 2003-02-27 (www-rdf-rules@w3.org from February 2003)

From: Geoff Chappell <geoff@sover.net>
Date: Thu, 27 Feb 2003 07:19:46 -0500
To: "'Libby Miller'" <Libby.Miller@bristol.ac.uk>
Cc: <www-rdf-rules@w3.org>
Message-ID: <006101c2de5f$71dcf280$835ec6d1@GSCLAPTOP>
Hi Libby,

I'm not going to make it to the meeting but I thought I'd put my two
cents in anyway. Comments below.

Rgds,

Geoff

> -----Original Message-----
> From: www-rdf-rules-request@w3.org [mailto:www-rdf-rules-request@w3.org]
> On Behalf Of Libby Miller
> Sent: Tuesday, February 25, 2003 9:08 AM
> To: Libby Miller
> Cc: www-rdf-rules@w3.org
> Subject: Re: RDF query testcase requirements: IRC chat?
> 
> 
> 
 [...]
> 
> -----
> Aim of the meeting
> 
> To start to draft a manifest format for RDF query testcases which can
> be used with multiple query syntaxes and perhaps multiple resultset
> formats. I propose that we use the following manifest format as
> something to start with, although I'm well aware that it's a bit
> rough:
> 
> http://swordfish.rdfweb.org/rdfquery/tests/query-results-manifest.rdf
> It is likely that the manifest format will not be suitable for all RDF
> query languages, although we will try to be as inclusive as possible.
> 
> -----
> Proposed agenda
> Please contact me or the list if you would like to add anything.
> 
> 
> 1. Requirements for an RDF query testcase format.
> 
> 1.1 What sorts of things do we need in an RDF query testcase manifest
> format? Does
> 
> http://swordfish.rdfweb.org/rdfquery/tests/query-results-manifest.rdf
> 
> sufficiently cover everyone's needs?


It's hard to say much about the format without first answering 1.2,
but...

- shouldn't the various URLs be literal values (a la rss:link)?
- it seems awkward to have the result partially specified (numberRows) in
the manifest and partially in another file. Either there's a single
result set format that could include a row count if desired, or there
are many different result set formats, some of which may not even have
rows (e.g. graph/triple output). Or are you trying to identify some
common characteristics that all result sets must have?

 
> 1.2 What do we intend to do with testcases expressed in this format?

Possibilities include:
- cataloging how different query languages perform similar queries as a
means of gathering structured use cases for a future common rdf query
language or query mechanism
- automated testing
- querying RDF data and retrieving results in a standard format (this
assumes the testcase format specifies a common query and result format,
and would follow from those aspects rather than from the testcase as a
whole)
- ...?
 
> 1.3 Is RDF the best syntax?

Seems like a reasonable place to start, considering the crowd and
available tools, but further requirements definition could make it
awkward (e.g. if multiple RDF sources need to be stored and
distinguished in the same file).

> 
> 2. Describing the query itself in the manifest format.
> 
> 2.1 Is linking to a file or url for the query sufficient? or do we
> also need to provide for in-line queries?

Inline queries seem more useful if a common query description is used.
 
> 2.2 How can we best describe the different syntactic formats we will
> have for the queries?

I'd guess that they're either just named opaque blobs (with names/uris
provided by the query language owners), or there's enough commonality
found to use a common description (with some languages possibly not
supporting all vocab items). I guess there could be some middle ground -
e.g. come up with a vocabulary to describe a query language.

Maybe we could specify the native query syntax by reference (by url) and
have a common vocab description of the query inline. It wouldn't be too
hard to put together a vocab for a large subset of the query languages
out there. Something along the lines of:

Query has:
Variable selection (projection)
Source(s)
Condition

Condition has:
1 or more logical terms (disjunctive terms)

logical term has:
1 or more logical factors (conjunctive factors)

logical factor has:
triple pattern or operator expression or condition

We could just stop there and handle the majority of query languages. 
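
To make the outline above concrete, here is a rough sketch of that
vocabulary as Python data classes. All class and field names
(TriplePattern, OperatorExpression, render, and so on) are made up for
illustration, as are the example source URL and the regex operator; none
of this comes from an existing query language or from the manifest
format under discussion.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TriplePattern:
    # One subject/predicate/object pattern; "?x" style strings are variables.
    subject: str
    predicate: str
    object: str


@dataclass
class OperatorExpression:
    # A named operator applied to operands, e.g. a comparison or regex test.
    operator: str
    operands: List[str]


@dataclass
class LogicalTerm:
    # Conjunction: every factor in the term must hold.
    factors: List["LogicalFactor"]


@dataclass
class Condition:
    # Disjunction: at least one term must hold.
    terms: List[LogicalTerm]


# A factor is a triple pattern, an operator expression, or a nested condition.
LogicalFactor = Union[TriplePattern, OperatorExpression, Condition]


@dataclass
class Query:
    select: List[str]      # variable selection (projection)
    sources: List[str]     # source(s) to query
    condition: Condition


def render(node) -> str:
    """Render a condition tree as a readable string (illustration only)."""
    if isinstance(node, TriplePattern):
        return f"({node.subject} {node.predicate} {node.object})"
    if isinstance(node, OperatorExpression):
        return f"{node.operator}({', '.join(node.operands)})"
    if isinstance(node, LogicalTerm):
        return " AND ".join(render(f) for f in node.factors)
    if isinstance(node, Condition):
        return "(" + " OR ".join(render(t) for t in node.terms) + ")"
    raise TypeError(f"unexpected node type: {type(node).__name__}")


# Hypothetical example query: names matching a pattern.
example = Query(
    select=["?x", "?name"],
    sources=["http://example.org/data.rdf"],
    condition=Condition(
        terms=[
            LogicalTerm(
                factors=[
                    TriplePattern("?x", "foaf:name", "?name"),
                    OperatorExpression("regex", ["?name", "'Libby'"]),
                ]
            )
        ]
    ),
)
```

Because a logical factor can itself be a condition, nested AND/OR
structures fall out of the same four definitions without any extra
vocabulary.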


> 3. Describing the resultset formats.
> 
> 3.1 Again, there may be different resultset formats, for example Andy
> Seaborne's:
> 
> http://lists.w3.org/Archives/Public/public-esw/2003Jan/0046.html

A few comments on the format:
- it would be good to be able to preserve order when a query language is
capable of ordering results (e.g. SELECT ?x, ?y .... ORDER BY ?x)
- it would be good to make clear the meaning of the predicates used
(variables, value), so that it is clear when we're talking about a thing
and when we're talking about its name.
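
As a rough illustration of both points, here is a sketch of a result set
that keeps variable names separate from the values bound to them, and
that only treats row order as significant when the query asked for an
ordering. The ResultSet class, its fields, and same_results are
hypothetical names of my own; this is not Andy's format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ResultSet:
    variables: List[str]         # variable *names*, e.g. "x"
    rows: List[Dict[str, str]]   # each row maps a variable name to its *value*
    ordered: bool = False        # True if the query imposed an order


def _row_key(row: Dict[str, str]) -> Tuple[Tuple[str, str], ...]:
    # Canonical form of a row, independent of dict insertion order.
    return tuple(sorted(row.items()))


def same_results(expected: ResultSet, actual: ResultSet) -> bool:
    """Compare two result sets, respecting row order only when required."""
    if sorted(expected.variables) != sorted(actual.variables):
        return False
    expected_keys = [_row_key(r) for r in expected.rows]
    actual_keys = [_row_key(r) for r in actual.rows]
    if expected.ordered:
        # Order matters: rows must match position by position.
        return expected_keys == actual_keys
    # Order does not matter: compare as multisets of rows.
    return sorted(expected_keys) == sorted(actual_keys)


rows_ab = [{"x": "a", "y": "1"}, {"x": "b", "y": "2"}]
rows_ba = [{"x": "b", "y": "2"}, {"x": "a", "y": "1"}]

unordered = ResultSet(variables=["x", "y"], rows=rows_ab)
ordered = ResultSet(variables=["x", "y"], rows=rows_ab, ordered=True)
shuffled = ResultSet(variables=["x", "y"], rows=rows_ba)
```

With this shape, a test for an ORDER BY query would fail on shuffled
rows, while a test for an unordered query would accept them.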
 
> do we need to be able to distinguish between these?
> 
> 3.2 Relatedly, how much about the results should go into the resultset
> and how much in the manifest, for example the variable names, number
> of rows returned?
> 
> 
> 4. Source files.
> 
> 4.1 These may be in different formats (RDF/XML, N-triples), do we need
> to distinguish them?
> 
> 4.2 Should we allow more than one source file?
> 
> ------
> 
> This is too much for one meeting, but I've put it all here so people
> can see the sort of approach we might take.
> 
> cheers
> 
> Libby
Received on Thursday, 27 February 2003 07:55:27 UTC