Re: Requirement: queries written as RDF from Patrick Stickler on 2004-04-06 (public-rdf-dawg@w3.org from April to June 2004)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Tue, 6 Apr 2004 10:02:59 +0300
To: "ext Rob Shearer" <Rob.Shearer@networkinference.com>
Cc: <public-rdf-dawg@w3.org>
Message-Id: <70773A58-8798-11D8-9CDC-000A95EAFCEA@nokia.com>
On Apr 05, 2004, at 19:55, ext Rob Shearer wrote:

>
>
>> 1. Our target is an RDF graph. The formal semantics for such a target
>> graph is provided by the RDF MT. If our queries are also expressed
>> as RDF graphs, we are able to (a) maximize the intersection of target
>> and query semantics, and (b) get a head start in defining the complete
>> semantics of a query.
>
> I'm afraid I don't buy this one at all. RDF is nothing but a syntax for
> encoding graphs.

???

>
> If you're looking for semantics on RDF graphs, I'm afraid you've got to
> turn to RDFS and OWL.

Yes. Nothing I've proposed was intended to preclude this.

I think you have misunderstood my proposal as simply using the RDF syntax
(which I didn't mean) rather than RDF "in its fullness" (which I did mean).

> Queries encoded as OWL really do have useful semantics: e.g. an instance
> representing a variable which is a member of some class and inherits
> from some existential to a nominal of another instance, etc., would then
> have well-defined meaning in terms of what could fill that variable.

Great.

>
> No semantics for RDF means no advantage to using it.

It puzzles me that you would think I was proposing using RDF syntax
without semantics since I explicitly mentioned the relevance of the
RDF MT...

???


>
>> 2. Clients submitting DAWG queries and processing DAWG results will
>> (almost without exception) be capable of constructing, manipulating,
>> and utilizing RDF graphs -- typically via a software API. By encoding
>> queries (and results, BTW) as RDF, this results in reduced implementational
>> requirements for DAWG clients, as well as a more consistent solution
>> environment (i.e. fewer parsers, serializations, APIs, logical operations,
>> etc.).
>
> Whoah- giant giant leap here.

I have a feeling that the only giant leap is your misunderstanding
of what I am saying here. Sorry. Please try re-reading it. Maybe
it will be clearer the second time around.

> Just how do you see people getting this abstract "RDF processing"
> functionality?

Eh? Abstract? I tend to find software APIs to be quite "real" and
concrete tools.

> I was rather thinking that engineers armed with an RDF query language
> wouldn't need any other RDF components to get at the data they want. If
> you've got to be able to process RDF in order to use our language to
> query RDF, then what's the bloody point?

If you have to construct a query, using a software API that allows
you to work with a well-understood form of expression (an RDF graph)
is a benefit, not a burden.

This is especially so because you run less risk of creating a syntactically
invalid query expression: the query is constructed via API functions/methods
and serialization is handled by the API, rather than by directly constructing
the query expression string using string operations.

Your question "what's the bloody point" seems as strange to ask in
this case as in the case of someone using a DOM API to construct an
XML instance -- and thus not having to worry about the arcana of
the XML serialization, but only the logical structure of the instance.
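
To make the contrast concrete, here is a minimal Python sketch of building
a query as a graph through an API instead of assembling its serialized form
by hand. Everything in it is an assumption made for illustration: the
QueryGraph class, the http://example.org/query# vocabulary and its "select"
property are hypothetical and are not part of any DAWG proposal or existing
RDF library. Only URI and blank-node terms are handled.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical query vocabulary, invented for this sketch only.
Q = "http://example.org/query#"
FOAF = "http://xmlns.com/foaf/0.1/"


class QueryGraph:
    """A query held as a set of triples; blank nodes stand in for variables."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> "QueryGraph":
        # Terms are checked as they are added, so a malformed query is
        # rejected at construction time rather than by the remote parser.
        for term in (subject, predicate, obj):
            if not term or any(c in term for c in "<> \n"):
                raise ValueError(f"malformed term: {term!r}")
        self.triples.add(Triple(subject, predicate, obj))
        return self

    def serialize(self) -> str:
        # Serialization is the API's job; callers never build this string.
        def fmt(term: str) -> str:
            return term if term.startswith("_:") else f"<{term}>"

        lines = [f"{fmt(s)} {fmt(p)} {fmt(o)} ." for s, p, o in sorted(self.triples)]
        return "\n".join(lines)


# Usage: the caller works with the logical structure only.
query = (
    QueryGraph()
    .add("_:q", Q + "select", "_:person")
    .add("_:person", FOAF + "knows", "http://example.org/people#patrick")
)
print(query.serialize())
```

The design choice mirrors the DOM analogy: the caller states which nodes and
arcs the query contains, and the one place that knows the serialization
rules is the serialize() method.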

Sorry, but I continue to be puzzled by your comments to my posting. They
appear to be comments to some other posting, not to what I actually wrote.

>
>> 3. By expressing queries in RDF, the queries themselves fall within
>> the target scope of DAWG, which (barring careless syndication of
>> query graphs with other graphs) will offer a lot of utility.
>
> This is the only argument I really understand, and I understand interest
> in storing queries in RDF, which is the cool new way to store anything.
> However,
>
> 1) This concern is academic; in real life users are unlikely to care
> very deeply.

I think this remains to be seen. Your assertion does not make it true.

> Witness the world's complete lack of interest in XQueryX.

I don't think you can make your conclusion based on the specific
case of XQueryX. Also, I expect the key motivations for XQueryX
had far more to do with using general XML tools (other than, or
with no special focus on, XQuery) rather than querying queries.

Still, this is not the greatest benefit/argument for having queries
expressed using RDF, so even if this particular argument is discarded
that doesn't mean the proposal to express queries in RDF is without
significant merit.

>
> 2) We still don't have any use cases at all demonstrating this user
> need, and I think we should at the very least come up with a couple of
> use cases for any requirement.

This was an argument for the requirement, not a requirement itself.

>
>> 4. Clients which wish to submit a query to a DAWG repository which is
>> known to not support any inference could employ a reasoner to pre-expand
>> the query into a set of alternates and submit each to the repository,
>> syndicating the results (or this could be provided by a specialized
>> proxy-based query broker). Granted, one could do this if queries were
>> expressed in other ways, but one would first have to map the query to
>> RDF, reason about it, and then remap it back to the other form. Much
>> better to use RDF from the start.
>
> I have architected both inference engines and query brokers to perform
> the kinds of transformations you describe, and I have never been at all
> tempted to use RDF internally as an intermediary.

Were your solutions intended to be completely system and implementation
independent, allowing for the interaction of arbitrary clients and
knowledge stores operating in (potentially significantly) heterogeneous
environments?

I think it's good to keep in mind that just because some particular approach
has worked in the past for specific implementations of specific solutions
in a controlled environment, that does not mean such approaches will be
suitable/optimal as a global standard aimed at platform and implementation
independent solutions.

So just because you have never been tempted to use RDF in the past (and
I fully appreciate why you wouldn't be so tempted) that doesn't mean using
RDF for DAWG is not an optimal approach, given the goals typically
associated with standards (as opposed to closed systems).
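
For reference, the pre-expansion described in point 4 can be sketched
roughly as follows in Python. This is only an illustration under stated
assumptions: the query is reduced to a single triple pattern, the "reasoner"
is a transitive walk over a subclass table, and the names expand, match and
broker, along with the ex: terms, are hypothetical rather than taken from
any proposal or implementation.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]
RDF_TYPE = "rdf:type"


def subclasses(cls: str, direct_subclasses: Dict[str, Set[str]]) -> Set[str]:
    """Return cls together with all of its transitive subclasses."""
    found = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for child in direct_subclasses.get(current, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def expand(query: Triple, direct_subclasses: Dict[str, Set[str]]) -> List[Triple]:
    """Pre-expand a type query into one alternate per known subclass."""
    s, p, o = query
    if p != RDF_TYPE:
        return [query]
    return [(s, p, c) for c in sorted(subclasses(o, direct_subclasses))]


def match(pattern: Triple, store: Iterable[Triple]) -> Set[str]:
    """Stand-in for a repository with no inference: exact matching only."""
    _, p, o = pattern
    return {s for (s, sp, so) in store if sp == p and so == o}


def broker(query: Triple, direct_subclasses: Dict[str, Set[str]],
           store: Iterable[Triple]) -> Set[str]:
    """Submit each alternate to the store and merge the results."""
    results: Set[str] = set()
    for alternate in expand(query, direct_subclasses):
        results |= match(alternate, store)
    return results


# Usage with made-up data.
hierarchy = {"ex:Animal": {"ex:Dog", "ex:Cat"}, "ex:Dog": {"ex:Puppy"}}
store = [
    ("ex:rex", RDF_TYPE, "ex:Dog"),
    ("ex:tom", RDF_TYPE, "ex:Cat"),
    ("ex:bo", RDF_TYPE, "ex:Puppy"),
    ("ex:oak", RDF_TYPE, "ex:Tree"),
]
query = ("?x", RDF_TYPE, "ex:Animal")
print(sorted(broker(query, hierarchy, store)))
```

Submitted directly to the non-inferencing store, the same query matches
nothing, since no resource is typed as ex:Animal itself; the expansion step
is what recovers the subclass instances.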

> Any query broker is going to have to be able to process the query
> language, that language is going to be a lot more specific than RDF, and
> you're going to need more custom logic than generic RDF processing.

Examples?

> Using RDF as syntax just means that you've got to parse RDF instead of
> parsing something else, and to be honest RDF is one of the most awkward
> languages to parse at this point in history. (Okay, Perl, but you get
> the idea...)

Firstly, I'm not proposing using RDF just as syntax. And in fact, the
serialization employed is *irrelevant* insofar as the semantics and
evaluation of the query are concerned. What matters is the query *graph*.

Secondly, there are numerous APIs available in numerous programming
languages for numerous platforms which provide standards compliant
RDF parsers, so why would you ever bother to parse the RDF yourself?

A DAWG implementor would no sooner write their own RDF parser than
an XML implementor would write their own XML parser. True, in some
special cases some folks choose to write their own parsers, but that
is definitely the rare exception rather than the rule. Sorry, but
this argument against RDF insofar as syntactic/parsing issues is
concerned just doesn't hold water. Again, the serialization is of
secondary importance. Serialization is relevant only insofar as
interchange and (for manual input) human usability is concerned.

Thirdly, DAWG implementations are going to have to be RDF savvy
in the first place -- since, er, the target is RDF and if they
don't understand the nature of the target how will they be able
to interpret the results? -- so having queries (and results) also
expressed in RDF constitutes a *shallower* learning curve for
DAWG developers since once they grok the RDF graph and how
RDF/OWL vocabularies/ontologies work, they then grok how to
construct queries and process query results.

Having a non-RDF form of expression for queries (or results) forces
them to both understand and find/create methods to construct (or
process) those other forms of expression -- thus increasing both
the learning curve and the implementational burden (i.e. more
parsers/APIs).

>
>> 5. Since RDF already provides for arbitrary datatypes, that requirement
>> is automatically met by using RDF to express queries.
>
> That's not an excuse for a new requirement; it's a solution for a
> different requirement. Don't confuse the two.

If the satisfaction of one requirement automatically results in the
satisfaction of another requirement, then that is most certainly
relevant when considering both requirements, particularly if the
other requirement has already been accepted and must be satisfied.

No, by itself alone, it doesn't justify acceptance of the requirement,
yet it's not offered by itself alone, but along with a number of other
(stronger) arguments in favor of the requirement in question.

>
>> Some use cases (already submitted):
>>
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0056.html
>
> The words "let us assume that the input query is expressed in RDF" don't
> make this use case demonstrate the requirement. Again, query brokering
> can be performed quite independently of RDF. I can meet this use case
> without encoding queries as RDF.

I think that all of the use cases could be met in some other fashion than
the approach suggested/understood as optimal, as illustrated in the given
use case.

The point is not whether one could do it differently, but whether doing
it differently would be a *better* way of doing it.

I think the use cases clearly illustrate the utility of having the queries
(and query results) expressed in RDF.

>
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0073.html
>
> Again, I don't see how this demonstrates any requirement for queries in
> RDF. Someone wants to know about people between 16 and 18 years of age.
> There are a ridiculous number of ways to meet this use case which don't
> encode queries in RDF.

As above.

>
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0074.html
>
> This use case certainly seems to dictate that queries are RDF, but
> that's the premise of the use case. It's like writing a use case saying
> "a user writes a query in natural language French and gets some answers"
> and then saying "see? we need a natural language processor for French!"
>
> I actually don't really consider this a use case at all; it's just a
> little proto-requirement. There's never any mention of just what the
> user is trying to do; just the presumption that the query language will
> work in a particular way. We might as well have use cases along the
> lines of "a user writes a query in natural language French and gets the
> answers she wants".
>
> A use case is a real problem a user has *now*, in the absence of a query
> language, that a query language could help solve. No user is using an
> RDF API to write a query in our language, because there is no language.

Gee. And here I thought we were creating "stories", not legal briefs...

Perhaps you would like to present a use case which clearly demonstrates
the requirement that queries *not* be expressed in RDF, or that queries
*must* be expressed in some Squish-like language? I think you will find
it just as difficult, or impossible, as it is to create a use case that
clearly shows that queries *must* be written in RDF. You are expecting
something that cannot be done with a (typical) use case.

Use cases reflect *optimal* ways of doing things, and requirements
that fall out of those use cases are intended to achieve *optimal* solutions
which allow people to do things in that *optimal* manner.

The use cases I have presented IMO reflect the optimality of using RDF to
express queries (and results).

Feel free to submit alternate use cases which reflect some other optimality.

Patrick


>
>

--

Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
Received on Tuesday, 6 April 2004 03:04:35 UTC