Re: Specifying RDF Datasets

On Wed, Jul 27, 2005 at 01:32:53PM -0400, Ron Alford wrote:
> There seems to be a discrepancy between specifying RDF Datasets in the
> query and in the protocol.

[snip]

> It appears you can specify multiple uris to be merged into the default
> graph in the query, but can only specify one default graph uri in the
> protocol parameters.

This has been addressed in the latest editor's draft of the protocol spec:

http://www.w3.org/2001/sw/DataAccess/proto-wd/
$Revision: 1.53 $ of $Date: 2005/07/27 19:42:14 $

$Log: Overview.html,v $
Revision 1.53  2005/07/27 19:42:14  kclark
...
- synch'd rdf dataset with rq23
...

> 1) Should the protocol allow multiple default-graph-uri parameters?

Yes, and now it does. Thanks for your comment.
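
For illustration only, here is a minimal sketch of how a client might
serialize a request that carries more than one default-graph-uri parameter.
The endpoint and graph URLs are hypothetical, and the sketch assumes the
query text travels in a parameter named "query"; it is not taken from the
draft itself.

```python
from urllib.parse import urlencode


def build_query_url(endpoint, query, default_graphs=()):
    """Build a GET URL that repeats default-graph-uri once per graph.

    `endpoint` and the graph URLs are hypothetical placeholders.
    """
    # A list of pairs (rather than a dict) lets the same key appear repeatedly.
    params = [("query", query)]
    params += [("default-graph-uri", graph) for graph in default_graphs]
    return endpoint + "?" + urlencode(params)


url = build_query_url(
    "http://example.org/sparql",
    "SELECT * WHERE { ?s ?p ?o }",
    ["http://example.org/a.rdf", "http://example.org/b.rdf"],
)
print(url)
```

Each graph gets its own default-graph-uri key in the query string, in the
order given, so the service sees every graph that is to be merged into the
default graph.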

> 2) Where, if anywhere, should the protocol document be referring to IRI
> instead of URI?

The protocol document never refers to URI; that string doesn't appear in the
document at all. In general, however, I believe the ideal is to follow WSDL
2 in this regard, since it's the means by which the protocol is being
specified. 

If you have a more specific concern or issue, I'd be happy to learn of it.

> 3) (Mostly disjoint from the above) Would it be appropriate for an
> implementation to provide a default dataset for clients to query (this
> would answer "(can there be zero datasets?)")?

Yes, I believe so. In general, in my view, there are at least two kinds of
SPARQL query processing services, which we can call "publishers" and
"processors" as a kind of shorthand:

1. "publishers" expose a SPARQL query interface in order to communicate
information to clients, and they may well want to process every query with
some graph or graphs "in the background". (The analogy is to Web publishers
with a point of view, communicating some information in a machine-readable
way.)

2. "processors" expose a SPARQL query interface in order to process SPARQL
queries for some client; that is, they are like application service
providers in that their motivation is to process SPARQL queries, rather than
to expose some knowledge base to queries. (The model here is ASPs that
lease, rent, or loan CPU, disk, or bandwidth resources to clients to perform
some computation. Think of Amazon's Web Services Simple Queue or an XSLT
service.)

There can be, of course, different blends of (1) and (2), and there may be
other types as well.

I think -- again, personally -- it's safe to assume that an implementation
of (1) might provide a default or, better, an implicit dataset against which
bare queries (queries that do not specify a dataset) or bare query requests
(requests that contain a query with no dataset and which do not, at the
protocol level, specify a dataset) are evaluated.

It seems equally likely that (2)-type services might not provide an implicit
dataset.

All that being said, the query language spec has language that implies some
freedom on the part of implementations in this regard:

  If a query provides such a dataset description, then it is used in place
  of any dataset that the query service would use if no dataset description
  is provided in a query.
  
That is, for bare queries or query requests, a query service may use an
implicit dataset. Presumably it may also return the QueryRequestRefused
fault message, perhaps with an (optional) explanation that it does not
process requests without an explicit dataset.
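
As a rough sketch of that choice (my own illustration, not anything either
spec prescribes), a service might treat bare requests as follows. The
function name, its arguments, and the exception class are all hypothetical;
the exception merely stands in for the QueryRequestRefused fault. The sketch
deliberately says nothing about what happens when a dataset is specified.

```python
class QueryRequestRefused(Exception):
    """Hypothetical stand-in for the protocol's QueryRequestRefused fault."""


def dataset_for_bare_request(query_graphs, protocol_graphs, implicit_dataset=None):
    """Return the dataset to use for a bare request.

    Returns None when the request is not bare, i.e. when the query or the
    protocol parameters name a dataset; handling that case is out of scope.
    """
    if query_graphs or protocol_graphs:
        return None  # not a bare request
    if implicit_dataset is None:
        # A "processor"-style service with nothing in the background.
        raise QueryRequestRefused(
            "this service does not process requests without an explicit dataset"
        )
    # A "publisher"-style service falls back to its own graphs.
    return list(implicit_dataset)


# Publisher-style service: a bare request is answered from the implicit dataset.
print(dataset_for_bare_request([], [], ["http://example.org/background.rdf"]))

# Processor-style service: a bare request is refused.
try:
    dataset_for_bare_request([], [])
except QueryRequestRefused as fault:
    print("refused:", fault)
```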

Thanks, again, for your comments.

Kendall Clark

Received on Thursday, 28 July 2005 14:53:15 UTC