RE: Proposed XQuery requirement and/or objective from Rob Shearer on 2004-07-20 (public-rdf-dawg@w3.org from July to September 2004)

From: Rob Shearer <Rob.Shearer@networkinference.com>
Date: Mon, 19 Jul 2004 17:44:03 -0700
To: "Jim Hendler" <hendler@cs.umd.edu>, "Jeff Pollock" <Jeff.Pollock@networkinference.com>, <public-rdf-dawg@w3.org>
Message-ID: <CFE388CECDDB1E43AB1F60136BEB4973028105@rome.ad.networkinference.com>
While I completely agree that a mature strawman spec would help out a lot, my contention from the start has been that XQuery is a valid strawman, and that a few simple extensions for RDF can be defined which meet all the requirements we need.

The simplest addition is to just add an external function (http://www.w3.org/TR/xquery/#id-function-calls) which tests for the existence of an RDF triple. Within Network Inference this is called 'related', but Simon suggested the name "asserted". A perfectly valid query asking whether Rob worked for Network Inference would be:

asserted("http://foo/Rob", "http://foo#worksFor", "http://foo#NetworkInference")

which would return an xs:boolean. The larger goal of using XQuery is that you could then embed this query within a larger XQuery application:

for $member in doc("http://foo/people.xml")/Person
where asserted($member/URI, "http://foo#worksFor", "http://foo#NetworkInference")
return $member/name

which is meant to iterate across all "Person" elements in some XML document and figure out whether they work for NI or not based on some RDF (which is assumed to be implicit in the query's context). You might even be able to encode the whole query within an XPath expression and avoid the 'for' construct; the point is that you can build very complex queries based on the features provided by the full XQuery language.
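To make the intended semantics concrete, here is a minimal Python sketch of the same filter. It is a model of the behaviour, not XQuery and not Network Inference's implementation. The triple set, the sample XML (wrapped in a hypothetical <people> root), and the helper name ni_employee_names are all assumptions for illustration, and asserted is modelled as a plain lookup with no inference.

```python
import xml.etree.ElementTree as ET

# Hypothetical RDF context: a set of (subject, predicate, object) triples.
TRIPLES = {
    ("http://foo/Rob", "http://foo#worksFor", "http://foo#NetworkInference"),
    ("http://foo/Jim", "http://foo#worksFor", "http://foo#UMD"),
}

WORKS_FOR = "http://foo#worksFor"
NETWORK_INFERENCE = "http://foo#NetworkInference"


def asserted(subject, predicate, obj, triples=TRIPLES):
    """Return True if the exact triple is present (no inference is modelled)."""
    return (subject, predicate, obj) in triples


# Hypothetical stand-in for the document at http://foo/people.xml.
PEOPLE_XML = """
<people>
  <Person><URI>http://foo/Rob</URI><name>Rob</name></Person>
  <Person><URI>http://foo/Jim</URI><name>Jim</name></Person>
</people>
"""


def ni_employee_names(xml_text):
    """Mirror the for/where/return query: names of Persons who work for NI."""
    root = ET.fromstring(xml_text.strip())
    return [
        person.findtext("name")
        for person in root.iter("Person")
        if asserted(person.findtext("URI"), WORKS_FOR, NETWORK_INFERENCE)
    ]


print(ni_employee_names(PEOPLE_XML))
```

The XML side and the RDF side stay separate here: the loop walks XML elements, and asserted is the only point of contact with the triples, which is the division of labour the external-function approach relies on.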

The real value comes if you also provide a function, perhaps called "nodes", which returns every URI used as a subject or object anywhere in the RDF. Now you can use XQuery FLWOR expressions to get SQL-like functionality:

<membersAndSpouses>{
for $member in nodes()
for $spouse in nodes()
where asserted($member, "http://foo#memberOf", "http://foo#DAWG")
   and asserted($member, "http://foo#marriedTo", $spouse)
return (<member>{$member}</member>, <spouse>{$spouse}</spouse>)
}</membersAndSpouses>

Which generates some spiffy XML based on RDF. Of course, since RDF has an XML serialization, you can also generate RDF as output. (Or you could extend XQuery further to return a sequence of native RDF triples instead of XML nodes.)
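As a rough illustration of how "nodes" plus "asserted" yields a join, the following Python sketch models the nested iteration over a small in-memory triple set. The data (including the spouse URI http://foo/Pat), the sorted ordering of nodes, and the helper name members_and_spouses are assumptions made for the example; a real engine would presumably evaluate this far more cleverly than a quadratic scan.

```python
import xml.etree.ElementTree as ET

# Hypothetical RDF context for the example.
TRIPLES = {
    ("http://foo/Rob", "http://foo#memberOf", "http://foo#DAWG"),
    ("http://foo/Rob", "http://foo#marriedTo", "http://foo/Pat"),
    ("http://foo/Jim", "http://foo#memberOf", "http://foo#DAWG"),
}

MEMBER_OF = "http://foo#memberOf"
MARRIED_TO = "http://foo#marriedTo"
DAWG = "http://foo#DAWG"


def asserted(subject, predicate, obj, triples=TRIPLES):
    """Return True if the exact triple is present (no inference is modelled)."""
    return (subject, predicate, obj) in triples


def nodes(triples=TRIPLES):
    """Every URI used as a subject or object, sorted for a stable order."""
    return sorted({s for s, _, _ in triples} | {o for _, _, o in triples})


def members_and_spouses(triples=TRIPLES):
    """Model the two nested 'for' clauses and the 'where' filter."""
    root = ET.Element("membersAndSpouses")
    for member in nodes(triples):
        for spouse in nodes(triples):
            if (asserted(member, MEMBER_OF, DAWG, triples)
                    and asserted(member, MARRIED_TO, spouse, triples)):
                ET.SubElement(root, "member").text = member
                ET.SubElement(root, "spouse").text = spouse
    return ET.tostring(root, encoding="unicode")


print(members_and_spouses())
```

Note that Jim is a DAWG member in this data but has no marriedTo triple, so he does not appear in the output: the where clause behaves like an inner join.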

Now that the group has at least drawn a line in the sand by starting with BRQL, I'm hoping to go through that spec and map each of the constructs in that language back to fragments of XQuery, since it seems likely that BRQL is subsumed by an XQuery extended with only a few external functions.

> -----Original Message-----
> From: public-rdf-dawg-request@w3.org 
> [mailto:public-rdf-dawg-request@w3.org] On Behalf Of Jim Hendler
> Sent: Monday, July 19, 2004 4:02 PM
> To: Jeff Pollock; public-rdf-dawg@w3.org
> Subject: RE: Proposed XQuery requirement and/or objective
> 
> Jeff-
>  again, let me register my neutrality on this issue - I can 
> see many ways it could possibly go that we could both live 
> with given what we've written so far.  However, my real 
> problem with taking it further is that since Kendall has 
> walked into my office and drawn on my white board a bunch of 
> examples (from the use cases)  I have a comfort level there, 
> but I don't have examples from you showing what the 
> mappings might look like.  Maybe it is because I joined late, 
> but I don't have such examples for the Xquery type syntax, so 
> it is hard for me to really make informed judgements -- in 
> light of your number 1 below "happy medium of Xquery 
> syntax... and simple semantics" I wonder if you and/or Rob 
> could just draft an example or two of what this might look 
> like for those of us who weren't at the f2f - I don't mean a 
> strawman proposal, just an outline of how you might do one or 
> two of the first use cases in the document, so I and the 
> others who haven't seen it can get a feel -- if you have done 
> this before or presented at the f2f, just a pointer or quick 
> recap would be good to help those of us who missed it. 
>  I'd feel a lot more comfortable exploring the below with an 
> example or two in hand
>  thanks
>  JH
> 
> At 9:36 -0700 7/19/04, Jeff Pollock wrote:
> Jim-
>  
> No, I don't think we are too far off here. If you are in 
> favor of a [surface layer/grammar/concrete syntax] for the 
> DAWG proposed query language that builds upon a 
> widely-adopted format, then we are agreed in principle.
>  
> Tactics on the other hand... ;-)
>  
> We think XQuery is a better basis for a surface layer for 
> several reasons:
>  
> (1)     XQuery is more modular than SQL. It lends itself to a 
> richer use of its grammar and simple query semantics, without 
> adopting the data model, than does ANSI SQL. We are _not_ 
> proposing a simple "looks like XQuery" surface layer here, 
> _nor_ are we proposing a wholesale adoption of the XQuery 
> data model and XPath. Instead, we feel like a happy medium of 
> XQuery syntax (language constructs) and grammar (keywords) 
> and simple semantics (meanings to keywords) should be the 
> basis of the DAWG surface layer.  For this point, I wholly 
> defer to Rob Shearer - who is an expert on implementing 
> XQuery over inference engines and SemWeb data structures - he 
> will correct and augment my language as needed.
>  
> (2)     XQuery is a W3C spec. We believe in supporting the 
> W3C initiatives and making use of the layered architecture 
> principles advocated by this organization. Barring any 
> technical barriers that are insurmountable, we think that 
> XQuery should be the natural starting point for this surface 
> syntax. (apparently the authors of the DAWG charter did as well)
>  
> (3)     XQuery is more "general purpose" than SQL. We believe 
> that the Semantic Web (engines and language specs) will be 
> about far more than databases or data access. Our customers 
> use our XQuery-based inference engine for many non-database 
> use cases. For instance, the use of OWL/RDF for encapsulating 
> business rules that can be deduced at runtime by enterprise 
> applications. Likewise, many of our customers are using OWL 
> as a query mediation schema for heterogeneous data access to 
> web services. Neither of these cases is database-like in its 
> implementation. We foresee a future where the Semantic Web 
> does far more than provide "federated databases" or "data 
> integration" style applications. We think that business 
> rules, business logic, web services interface management, 
> process management and so forth are important aspects of the 
> long-term development of the Semantic Web vision that require 
> a more general purpose query language than SQL.
>  
> (4)     XQuery has a native XML context. Regardless of all 
> the political infighting that occurred, the output of RDF and 
> OWL (and most likely SWRL) specifications was solidly 
> grounded upon XML inside the SemWeb layer cake. As the 
> foundation of the layer cake, XML serves as a common syntax 
> for all SemWeb languages, it makes sense to ground the query 
> layer in a similar syntactic (or surface layer) basis. 
> Further, since the commitment was made to XML in this 
> capacity, we think it a natural fit to choose a unified query 
> concrete syntax that is grounded in the native data 
> representation syntax for semantic web specifications.
>  
> I feel like there are many other good supporting arguments 
> and rationale for XQuery, so I reserve the right to add to 
> this list later.  ;-)
>  
> But for now, these are some of the important reasons why 
> XQuery would be a better surface syntax than SQL for the DAWG 
> query output.
>  
> Regards,
>  
> -Jeff-
>  
>  
>  
> ________________________________
> 
> From: Jim Hendler [mailto:hendler@cs.umd.edu]
> Sent: Thursday, July 15, 2004 3:07 PM
> To: Jeff Pollock; public-rdf-dawg@w3.org
> Subject: RE: Proposed XQuery requirement and/or objective
>  
> At 14:47 -0700 7/15/04, Jeff Pollock wrote:
> Jim-
>  
> Points taken, and no hostility inferred.
>  
> Your counterpoints regarding the adoption of SQL are a great 
> debate to have.
>  
> In broad brush-strokes, we are committed to a query concrete 
> syntax which is grounded in a widely-adopted (and preferably 
> W3C recommended) representation.
>  
> Further, in no means do I intend to imply that XQuery would 
> make things easier on the vendor implementations for 
> RDFS/OWL/Rule components of the SemWeb - quite the opposite, 
> the implementations may even be more difficult.  Our point is 
> intended to speak towards our opinion that a known query 
> representation would speed user adoption rates for semantic 
> web languages.
>  
> If early adopters of large commercial organizations were 
> faced with learning and implementing a wholly new syntax for 
> queries - on top of what they already have to pay for in 
> human resource expertise - we suspect, and have encountered, 
> resistance.
>  
> Anecdotally, we would likely be supportive of the OWL "two 
> surface realizations" model, as long as one of them was a 
> widely-adopted standard format.
>  
> -Jeff-
>  
> sounds like we're near the same page -- guess what I'm having 
> trouble w/is the "widely-adopted standard format" -- since I 
> haven't seen the Xquery proposal, I've been assuming it is 
> some sort of specialization of Xquery much as RDQL is a 
> "SQL-like" language -- guess I'm thinking that most large 
> commercial orgs have lots of people who speak SQL and could 
> learn RDQL-like language without thinking of it as different 
> (I speak from experience, I've met a lot of govt folks who 
> have used RDQL with RDF DBs because "they didn't need any 
> training" - which is more or less a direct  quote from 
> someone telling me why he didn't take a SemWeb training 
> course some colleagues were teaching) where Xquery is not yet 
> on their todo list.  On the other hand, it is clear more 
> people will move to Xquery as XML DBs slowly get accepted and 
> steal market share from traditional RDBMS DBs (although right 
> now it is pretty clear which one is David and which is 
> Goliath) .. so I think I would agree with you that "as long 
> as one of them was a widely-adopted standard format", 
> although I'm less sure we would agree which is which :->
>  -JH
>  
>  
> --
> Professor James Hendler                   
> http://www.cs.umd.edu/users/hendler 
> Director, Semantic Web and Agent Technologies       301-405-2696
> Maryland Information and Network Dynamics Lab.      301-405-6707 (Fax)
> Univ of Maryland, College Park, MD 20742      240-277-3388 (Cell)
>  
> 
> 
>    
>
Received on Monday, 19 July 2004 20:46:25 UTC