Re: notes from our working meeting today on query and rules stuff

This follows a discussion between Benjamin Grosof and Eric
Prud'hommeaux on the content of a paper [1], potentially for refereed
publication. I am cc'ing to www-rdf-rules and joint-committee@daml.org
to solicit input from others. In particular, I note parallel taxonomy
development by Andy Seaborne [2]. We should try to synch up at some
point; soon? or after poking around in the space for a bit?

On Thu, Mar 13, 2003 at 02:25:02PM -0500, Benjamin Grosof wrote:
> % notes on RDF Query vs. Rules with Eric Prud'Hommeaux 3/13/03

In the following reply, "..." means not yet integrated. The "..." may be
followed by a {name} that groups the issue.

> agenda:  
> 
> refine EricP's doc "RDF Query and Rules Status",
> esp. to use more standard KR/Querying/Rules vocabulary and concepts 
> then taxonomize the existing RDF Query systems and extract additional
> requirements and issues as we go along
> 
> start with "characteristics"
> 
> "language" characteristics:  we're renaming to be "message" characteristics
> within the language

done

> "once", "query", ... :  rename as "hypotheticals", i.e., 
> have a concept of a querying session which may extend over several queries
> and/or message exchanges, where asserted hypothetical facts or 
> query-definitions/views or rules will be only perishably kept in the
> source's knowledgebase, i.e., only for the duration of the session

kept "once" and "query" as forms of "hypothetical". Let me know if the
text is sufficient.

  #mesgChar_scope_hypothetical
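
A minimal sketch of the "perishable" behaviour described above, assuming
a toy triple store. The names Source, Session and assert_hypothetical are
hypothetical and are not taken from [1] or from any existing system.

```python
class Source:
    """A toy knowledgebase holding durable (subject, predicate, object) facts."""

    def __init__(self, facts):
        self.facts = set(facts)

    def session(self):
        return Session(self)


class Session:
    """A querying session; hypothetical facts live only as long as it does."""

    def __init__(self, source):
        self.source = source
        self.hypotheticals = set()

    def assert_hypothetical(self, triple):
        # Kept in the session only, never written to the source.
        self.hypotheticals.add(triple)

    def ask(self, triple):
        # A ground query sees durable facts plus this session's hypotheticals.
        return triple in self.source.facts or triple in self.hypotheticals

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        # End of session: the hypotheticals perish.
        self.hypotheticals.clear()
        return False
```

A session opened with "with source.session() as s:" that makes no
assertions behaves like an ordinary query against the source.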

> a new issue:  error checking:  need an exception tree similar to Java,
> with standardized vocabulary and message types
> o expressiveness of query exceeds what the source can handle

changed "expressiveness". The fault arises only when the language is
more expressive than the service can handle and the client actually
uses some of that extra expressiveness. I think citing expressiveness
alone leads the reader to think that the expressiveness of the lang
caused the fault, not the client's use thereof.

  #mesgChar_exceedSrvcCapabilites

> - detailed location and construct and explanation details

... {faultExplaination}
not sure how much detail to specify for the contents of a fault.
excessive structure definition may make it hard to describe
existing systems.

> - want this to rely on meta-knowledge about expressiveness lattice of
> query / knowledge-base sublanguages, probably best to do similarly to RuleML

... {*log expressivity}
Benjamin, I believe I've seen the datalog..prolog... with/without URIs
chart in RuleML slides. Do you have a pointer?

> o timeout 
> - user limit

  #mesgChar_error_userTimeout

> . possibly with explanation of how to raise it

...

> - service limit
> . always this impatient, vs. 
> . try again, I'm particularly busy now

  #mesgChar_error_systemTimeout

> - network problems apparent
> - source having its own internal problems, try again another time
> . e.g., other sources it uses

...
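
To make the "exception tree similar to Java" idea concrete, here is a
rough sketch covering the faults listed so far. Every class name and
attribute below is an assumption made for illustration; none of this is
a standardized vocabulary.

```python
class QueryFault(Exception):
    """Root of the (hypothetical) fault tree."""


class ExceedsServiceCapabilities(QueryFault):
    """The client used a construct the service cannot handle."""

    def __init__(self, construct, location=None):
        super().__init__("unsupported construct: " + construct)
        self.construct = construct  # which construct was refused
        self.location = location    # optional position within the query


class Timeout(QueryFault):
    """Some time limit was reached before the query completed."""


class UserTimeout(Timeout):
    """The limit set by the requester was reached."""


class ServiceTimeout(Timeout):
    """The limit imposed by the service was reached."""

    def __init__(self, retry_later=False):
        super().__init__("service time limit reached")
        # False: always this impatient. True: try again, busy right now.
        self.retry_later = retry_later


class NetworkFault(QueryFault):
    """Network problems are apparent."""


class SourceInternalFault(QueryFault):
    """The source, or another source it uses, has internal problems."""
```

A client that only cares about the broad category can catch Timeout or
QueryFault; one that wants detail can catch the leaf classes.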

> new issue/principle:  meta-knowledge and descriptions in source and query
> of expressiveness,
  #mesgChar_srvcCapabilites
>                    completeness, soundness,

I made these implementation characteristics.
  #soundness
  #completeness
Are there ways that the language limits these beyond those described in
  #mesgChar_srvcCapabilites

>                                             resource-bounds/scale-limits,
> other characteristics 

...
is that the size of integers and that sort of thing?

> - need standard conceptualization then automated ontology and 
> messages/portions for these immediately above
> - can be inspired by RuleML approach in this regard

... {*log expressivity}

> expressiveness of querying:
> - via hypotheticals can define views/queries/sub-queries

@@@

> - hypotheticals enable one to boost effective expressiveness of querying,
> e.g., from single-atom to conjunction, or from those to querying a 
> universal implication (rule) (via skolemization)

... {hypoBoost}

not sure this needs mentioning. It is an interesting fact, but may not
be relevant, as all the langs that allow conjunctive rule bodies also
allow conjunctive queries.

> design philosophy:  
> talk about expressiveness of the source and of the query, then match
> as a first step in the querying session and indeed selection of the source
> as well as formulation of the query

...

Is this aimed at services whose language can express some
queries/rules that the service itself can't execute? If so, I think
  #mesgChar_srvcCapabilites

  #mesgChar_error_exceedSrvcCapabilites
    addresses this by listing it as a potential error.
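
One way to picture the "match as a first step" idea: compare the
features a query uses against what the service advertises, before
sending anything. This is only a sketch; the feature names and the
check_capabilities function are invented for illustration.

```python
class ExceedsServiceCapabilities(Exception):
    """Raised when a query uses features the service does not advertise."""

    def __init__(self, missing):
        super().__init__("service cannot handle: " + ", ".join(sorted(missing)))
        self.missing = missing


def check_capabilities(query_features, service_capabilities):
    """Raise if the query needs any feature outside the advertised set."""
    missing = set(query_features) - set(service_capabilities)
    if missing:
        raise ExceedsServiceCapabilities(missing)
```

The fault is raised only when the client actually uses a missing
feature, not merely because the language could express it.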

> streaming characteristics:  at querying session, is it one-shot,

... {streaming}
  #mesgChar_scope_durable
describes the opposite of this. Editorial work required to express
that the hypothetical branch
  #mesgChar_scope_hypothetical
without any assertions is the standard query case (if it is).

> a max number of query answers,

put into the context of a table answer
  #langChar_numRows

>                                be ready to give more answers, ...;

  #langChar_cursor

> or forward inferencing with notification upon incremental forward 
> inferencing triggered by updates at source, subscription and standing
> queries, 

i believe this is in, or should go in, the subcategories of
  #mesgChar_scope
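
The "max number of query answers" and "be ready to give more answers"
points above could be pictured as a cursor over a table answer. A small
sketch, assuming the answers arrive as an iterable of rows; Cursor and
fetch are hypothetical names.

```python
from itertools import islice


class Cursor:
    """Hands out query answers in batches of at most max_rows."""

    def __init__(self, rows):
        self._rows = iter(rows)

    def fetch(self, max_rows):
        # Returns up to max_rows further answers; an empty list means done.
        return list(islice(self._rows, max_rows))
```

Calling fetch again picks up where the previous call stopped, so the
requester can ask for more answers without re-running the query.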

> 
> EricP first stab at characteristics:    
> 
> o language
> o match
> o variable
> o binding
> o API
> 
> now we're thinking:
> 
> o session 
> - one-shot vs. more extended session

  #mesgChar_scope

> - kinds of messages that must/can exchange, e.g., incl. 

... didn't get this. is this like SOAP mandatory headers?

> . error checking,
  #mesgChar_error 
>                   streaming,
... {streaming}
>                              explanations,
... {faultExplaination}
>                                            hypotheticals
  #mesgChar_scope_hypothetical

> o expressiveness meta-knowledge
> - expressiveness of source KR
> - expressiveness of what queries source can handle

... {sourceKRexpr}

is there a difference between these two? i.e., for our purposes, don't
we define the expressiveness of source KR by which queries it can
handle?

these are described in terms of the requestor demands
  #mesgChar_srvcCapabilites
and what the implementation promises
  #implChar

> - expressiveness of the query
  #mesgChar_srvcCapabilites
> - completeness vs. soundness
  #soundness
  #completeness

> o error checking capabilities, 
> - expressiveness problems:  relative to expressiveness of 
>     query and of source
  #mesgChar_srvcCapabilites
  #implChar
  #mesgChar_error_exceedSrvcCapabilites
> - resource problems in computational cycles or storage
> - other, e.g., network

...

> o streaming mechanics
> - max number of queries vs. no limit

  #max_queries

> - suggest to source to keep intermediate results/work

...

We may wish to describe whether a service has the ability to
communicate this notion. We aren't, technically speaking, designing
the protocol, but instead characterizing the existing protocols and
implementations.

> o hypotheticals
> - expressiveness

  #mesgChar_scope_hypothetical

> - request to assert

...
related to {hypoBoost} ? 

> o proof and explanations: source capabilities, querier requests
> - source identification
> - ditto for any delegated or imported sources
> - derived vs. directly premised/asserted

  #proofs

> under expressiveness of the query:
> 
> we can discuss in terms of issues of:
> 
> o goal expression (alias "match expression")

  #langChar
  #goalChar

> - single arc/atom vs. subgraph/conjunction
> - ground vs. open (here "open" means with variables)

  #graphOrAtom
Currently punting on this domain of query langs with
[[
At this point, all single arc query languages are outside the scope of
this survey.
]]
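
For readers less familiar with the distinction, a small sketch of
subgraph/conjunction matching with named variables. It assumes triples
are plain tuples and that variables are strings starting with "?"; the
function names and the sample data are made up for illustration.

```python
def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if isinstance(p, str) and p.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            # A ground term that does not match.
            return None
    return new


def query(patterns, triples):
    """Match a conjunction of arc patterns, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match_pattern(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return bindings


TRIPLES = [
    ("bob", "hasLawyer", "carol"),
    ("carol", "memberOf", "MIT"),
    ("dave", "memberOf", "MIT"),
]
```

A single-arc language corresponds to calling query with one pattern; the
conjunctive case passes several patterns that share a variable such as
"?l". A variable may also sit in the predicate position.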

> - can arc-label/predicate be a variable

  #goalChar_variblePredicate
incidentally, i think this is also a soundness dimension for
implementations
  #soundness
and, consequently, a message processing requirement
  #mesgChar_srvcCapabilites

> - explicit variable names vs. not (in which case are implicit and distinctly
> named so that cannot join(/match) on them (to each other)
> . ex. of subtlety:  tell me if there exists someone who is their own lawyer,
> but don't return a list of bindings just tell me yes or no; this is hard
> to represent in a query language that uses the same symbol 
> (e.g., "NULL" or "?blank")

  #graphOrAtom

> for all anonymous-upon-return variables
> - disjunction (i.e., disjunctive collection of subgraphs)
> . enum -ish
> . more general, e.g., arbitrary and nested
> . ex.:  (x member_of MIT) and (x has_attraction {smooth | goodlooking})
> - existential quantifiers
> - universal quantifiers; vs. not
> - implications (one- or bi- directional) -- often one can do this only via
> hypotheticals, e.g., (forall x. friendof(Eric,x) => livesin(x,USA) )
> - variable binding 
> . must-bind vs. may-bind vs. don't-bind vs. don't report 
> roughly cf. DQL; this is related to 
> existential quantification
> 
> wrt what is binding:  
> actually:  
> . binding of single var, 
> . binding tuple = binding of (conjunct of) tuple(/collection) of var's
> . binding tuple list = list of such binding tuples

... @@@ meeting with benjamin now. will finish later.
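
In the meantime, a sketch of the three notions of binding listed above,
plus the "don't report" and yes/no cases. It assumes a binding tuple is
a dict from variable name to value; project and ask are hypothetical
helper names.

```python
from typing import Dict, List

Binding = Dict[str, str]          # binding tuple: each var maps to one value
BindingList = List[Binding]       # binding tuple list: one entry per answer


def project(answers: BindingList, reported: List[str]) -> BindingList:
    """Keep only the reported variables, dropping duplicate rows."""
    seen = set()
    out = []
    for row in answers:
        kept = {v: row[v] for v in reported if v in row}
        key = tuple(sorted(kept.items()))
        if key not in seen:
            seen.add(key)
            out.append(kept)
    return out


def ask(answers: BindingList) -> bool:
    """Yes/no form: is there at least one answer, without returning bindings."""
    return len(answers) > 0
```

The "is anyone their own lawyer" example needs the variable to be named
inside the query so the two positions can join, while ask keeps the
bindings from being returned.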

> observation [EricP]:  the current RDF Query languages seem to cluster such
> that some of the above dimensions are correlated, e.g., the languages
> that do not permit existential quantifiers also do not even permit distinct 
> variable names

...
probably covered in spirit in
  #graphOrAtom
not sure of meaning of example. what would 

> issue:  outer join

  #goalChar_outer

> open problem:
> need to avoid circularity of dependence of sites in delegated/sub- querying, 
> despite opaqueness of these sites as services,
> e.g., in semantics of situated LP
> - "querylock"
> 
> maybe SOAP people have dealt with this in a way that will help us,
> e.g., enveloping and failures (and ? circularity)

research indicates they have not (research being asking Yves Lafon,
XMLP team contact). It is the sort of thing that an upcoming
orchestration group is likely to work on.

i'll look around some more.

[1] http://www.w3.org/2001/11/13-RDF-Query-Rules/
[2] http://www.w3.org/2003/03/rdfqr-tests/recording-query-results.html
-- 
-eric

office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Tuesday, 18 March 2003 12:59:50 UTC