RE: Requirement: queries written as RDF from Dirk Colaert on 2004-04-08 (public-rdf-dawg@w3.org from April to June 2004)

From: Dirk Colaert <Dirk.Colaert@quadrat.be>
Date: Thu, 8 Apr 2004 14:21:19 +0200
To: 'Rob Shearer' <Rob.Shearer@networkinference.com>, Patrick Stickler <patrick.stickler@nokia.com>
Cc: public-rdf-dawg@w3.org
Message-ID: <8C62F6881FC8D511BC52009027DC836E0200D819@SKYWALKER>
Don't we have to distinguish between the expressiveness and completeness of
the language on the one hand and its formal syntax on the other?

If we take the example of the one-to-one mapping between RDF and N3, then we
can say that our query language must ultimately be convertible to true RDF.
This would allow us to define more human-readable dialects of our query
language.

Another possibility is to use a subset of RDF like OWL. This would allow
humans to express their queries on a more semantic level.

I sympathize with both approaches mentioned by Patrick and Rob. I feel that
it would be an advantage to stick to pure RDF, but I would hate to write RDF
queries. So, by making our query language convertible to RDF, or a subset of
it, we can profit from the advantages of both approaches.
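
As a rough illustration of the convertibility idea, here is a minimal sketch
in Python. It assumes a readable dialect whose queries are lists of triple
patterns, and it describes each pattern with the standard RDF reification
properties. The `Q` namespace, the `ex:` terms, and the function name
`pattern_to_rdf` are hypothetical and were not proposed on this list.

```python
# Hypothetical namespace for query terms; an assumption for illustration only.
Q = "http://example.org/query#"

def pattern_to_rdf(query_id, patterns):
    """Describe each triple pattern of a readable query as RDF-style triples."""
    triples = [(query_id, "rdf:type", Q + "Query")]
    for i, (s, p, o) in enumerate(patterns):
        node = f"_:p{i}"  # blank node standing for one triple pattern
        triples.append((query_id, Q + "pattern", node))
        triples.append((node, "rdf:subject", s))
        triples.append((node, "rdf:predicate", p))
        triples.append((node, "rdf:object", o))
    return triples

# A readable query: names of the authors of any book.
readable = [
    ("?author", "ex:name", "?name"),
    ("?book", "ex:author", "?author"),
]

for triple in pattern_to_rdf("_:q", readable):
    print(triple)
```

The readable form stays short, while the generated triples are what a pure
RDF tool would store or exchange.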

Compare with SQL:
SELECT name, birthday FROM authors, books WHERE authors.ID = books.author

What could be simpler? My hope is to arrive at a query language with the
simplicity of SQL and the expressiveness of RDF...

Dirk

-----Original Message-----
From: Rob Shearer [mailto:Rob.Shearer@networkinference.com] 
Sent: Monday, 5 April 2004 18:55
To: Patrick Stickler
Cc: public-rdf-dawg@w3.org
Subject: RE: Requirement: queries written as RDF


> 1. Our target is an RDF graph. The formal semantics for such a target
> graph is provided by the RDF MT. If our queries are also expressed
> as RDF graphs, we are able to (a) maximize the intersection of target
> and query semantics, and (b) get a head start in defining the complete
> semantics of a query.

I'm afraid I don't buy this one at all. RDF is nothing but a syntax for
encoding graphs. Believe it or not, just about *every* language encodes
graphs; they are simply less expressive than RDF, so they can't encode
every possible graph.
Unless we're talking about entailment, I don't think we're really
suggesting that *any* RDF graph is a query, so we're limiting the
possible graphs anyway.

If you're looking for semantics on RDF graphs, I'm afraid you've got to
turn to RDFS and OWL. 
Queries encoded as OWL really do have useful semantics: e.g. an instance
representing a variable which is a member of some class and inherits
from some existential to a nominal of another instance, etc., would then
have well-defined meaning in terms of what could fill that variable.

No semantics for RDF means no advantage to using it.

> 2. Clients submitting DAWG queries and processing DAWG results will
> (almost without exception) be capable of constructing, manipulating,
> and utilizing RDF graphs -- typically via a software API. By encoding
> queries (and results, BTW) as RDF, this results in reduced
> implementational requirements for DAWG clients, as well as a more
> consistent solution environment (i.e. fewer parsers, serializations,
> APIs, logical operations, etc.).

Whoa, giant, giant leap here.
Just how do you see people getting this abstract "RDF processing"
functionality?
I was rather thinking that engineers armed with an RDF query language
wouldn't need any other RDF components to get at the data they want. If
you've got to be able to process RDF in order to use our language to
query RDF, then what's the bloody point?

> 3. By expressing queries in RDF, the queries themselves fall within
> the target scope of DAWG, which (barring careless syndication of
> query graphs with other graphs) will offer a lot of utility.

This is the only argument I really understand, and I understand interest
in storing queries in RDF, which is the cool new way to store anything.
However,

1) This concern is academic; in real life users are unlikely to care
very deeply. Witness the world's complete lack of interest in XQueryX.

2) We still don't have any use cases at all demonstrating this user
need, and I think we should at the very least come up with a couple of
use cases for any requirement.

> 4. Clients which wish to submit a query to a DAWG repository which is
> known to not support any inference could employ a reasoner to pre-expand
> the query into a set of alternates and submit each to the repository,
> syndicating the results (or this could be provided by a specialized
> proxy-based query broker). Granted, one could do this if queries were
> expressed in other ways, but one would first have to map the query to
> RDF, reason about it, and then remap it back to the other form. Much
> better to use RDF from the start.

I have architected both inference engines and query brokers to perform
the kinds of transformations you describe, and I have never been at all
tempted to use RDF internally as an intermediary.
Any query broker is going to have to be able to process the query
language, that language is going to be a lot more specific than RDF, and
you're going to need more custom logic than generic RDF processing.
Using RDF as syntax just means that you've got to parse RDF instead of
parsing something else, and to be honest RDF is one of the most awkward
languages to parse at this point in history. (Okay, Perl, but you get
the idea...)
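
The pre-expansion described in point 4 can be sketched directly on a query's
own structure. This is a minimal sketch, not any actual broker: the subclass
table, the `ex:` class names, the toy repository, and the function names are
all hypothetical, and the "reasoner" is reduced to a transitive subclass
lookup.

```python
# Hypothetical subclass table that the client-side reasoner is assumed to know.
SUBCLASSES = {
    "ex:Person": ["ex:Student", "ex:Employee"],
    "ex:Student": ["ex:Undergraduate"],
}

def expand_class(cls):
    """Return cls together with all of its transitive subclasses."""
    seen, stack = [], [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(SUBCLASSES.get(current, []))
    return seen

def brokered_query(cls, repository):
    """Submit one type query per alternate class and merge the answers."""
    results = set()
    for alternate in expand_class(cls):
        # Each alternate is a plain lookup the non-inferencing store can answer.
        results |= {s for (s, p, o) in repository
                    if p == "rdf:type" and o == alternate}
    return results

# Toy repository with no inference support: it only matches stated triples.
repo = [
    ("ex:alice", "rdf:type", "ex:Undergraduate"),
    ("ex:bob", "rdf:type", "ex:Employee"),
    ("ex:carol", "rdf:type", "ex:Robot"),
]

print(sorted(brokered_query("ex:Person", repo)))  # ['ex:alice', 'ex:bob']
```

Here the expansion works on the class named in the query, and the query
itself is never serialized as RDF along the way.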

> 5. Since RDF already provides for arbitrary datatypes, that requirement
> is automatically met by using RDF to express queries.

That's not an excuse for a new requirement; it's a solution for a
different requirement. Don't confuse the two.

> Some use cases (already submitted):
> 
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0056.html

The words "let us assume that the input query is expressed in RDF" don't
make this use case demonstrate the requirement. Again, query brokering
can be performed quite independently of RDF. I can meet this use case
without encoding queries as RDF.

> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0073.html

Again, I don't see how this demonstrates any requirement for queries in
RDF. Someone wants to know about people between 16 and 18 years of age.
There are a ridiculous number of ways to meet this use case which don't
encode queries in RDF.

> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0074.html

This use case certainly seems to dictate that queries are RDF, but
that's the premise of the use case. It's like writing a use case saying
"a user writes a query in natural language French and gets some answers"
and then saying "see? we need a natural language processor for French!"

I actually don't really consider this a use case at all; it's just a
little proto-requirement. There's never any mention of just what the
user is trying to do; just the presumption that the query language will
work in a particular way. We might as well have use cases along the
lines of "a user writes a query in natural language French and gets the
answers she wants".

A use case is a real problem a user has *now*, in the absence of a query
language, that a query language could help solve. No user is using an
RDF API to write a query in our language, because there is no language.
Received on Thursday, 8 April 2004 08:14:38 UTC