RE: list change for discussion of "RDF query for SQL data"

Eric - cool.

One question: where does the "productName_name" in the SQL query come from?
Greping through [3] I can't see where it decides the term to use in the AS
clause.

Do you introspect on the database to get the types of the columns in a
table? Or do you have to hint the SQL production in some way?  Or have an
implicit transformation from a term like ":p Products:name :productName ."
gives productName_name.  Or something else?

	Andy

> -----Original Message-----
> From: Eric Prud'hommeaux [mailto:eric@w3.org] 
> Sent: 20 January 2003 17:51
> To: www-rdf-rules@w3.org
> Cc: www-rdf-interest@w3.org
> Subject: list change for discussion of "RDF query for SQL data"
> 
> 
> 
> I originally sent this to www-rdf-interest. I reallized after
> posting the message that it would be more appropriate on
> www-rdf-rules. www-rdf-interest folks, please post followups
> to www-rdf-rules.
> 
> 
> algae [1] and cwm [2] now both have the ability to query a relational
> database with an application-specific schema. An example OrderTracking
> database is queryable via MySQL at swada.w3.org with the username
> rdftest and no password. The n3 (see example [3])
> 
>  :o Orders:customer :c .
>  :o Orders:product :p .
>  :o Orders:orderDate :d .
> 
>  :p Products:name :productName .
> 
>  :c Customers:familyName :first .
>  :c Customers:givenName :last .
>  :c Customers:billingAddress :billAddr .
> 
>  :billAddr Addresses:street :billStreet .
>  :billAddr Addresses:city :billCity .
>  :billAddr Addresses:state :billState .
> 
> is translated into an SQL query
>  SELECT Products_0.id AS p_id,
>           Products_0.name AS productName_name,
>         Addresses_0.id AS billAddr_id,
>           Addresses_0.city AS billCity_city,
>           Addresses_0.state AS billState_state,
>           Addresses_0.street AS billStreet_street,
>         Customers_0.id AS c_id,
>           Customers_0.familyName AS first_familyName,
>           Customers_0.givenName AS last_givenName,
>         Orders_0.id AS o_id,
>           Orders_0.orderDate AS d_orderDate
>  FROM Products AS Products_0, Addresses AS Addresses_0, 
> Customers AS Customers_0, Orders AS Orders_0
>  WHERE Customers_0.billingAddress=Addresses_0.id
>    AND Orders_0.customer=Customers_0.id
>    AND Orders_0.product=Products_0.id
> 
> and executed. The results are mapped back into RDF statements,
> available for report or furthur inferencing. The algae equivilent (see
> example [4]) is
>        (OT::Orders.customer	?o	?c)
>        (OT::Orders.product	?o	?p)
>        (OT::Orders.orderDate	?o	?d)
>        (OT::Products.name	?p	?productName)
>        (OT::Customers.givenName	?c	?first)
>        (OT::Customers.familyName	?c	?last)
>        (OT::Customers.billingAddress	?c	?billAddr)
>        (OT::Addresses.street	?billAddr ?billStreet)
>        (OT::Addresses.city	?billAddr ?billCity)
>        (OT::Addresses.state	?billAddr ?billState)
> 
> One implications of this is that an awful lot of rdb data could be
> easily made available to RDF tools. Another is that mapping such data
> to RDF can be a way to join disparate rdbs. And a third is that RDF
> data can be made to scale on the same order as rdbs, which is
> generally acknowledged as very well.
> 
> Anyways, it would be fun to get some feedback if folks want to try out
> either the cwm or algae implementations. If you experiment with cwm,
> you will need to install mysql drivers for python (debian package
> python2.2-mysqldb) and for algae you will need mysql drivers for perl
> (debian package libdbd-mysql-perl). The perl version has the ability
> to do ORs [5] NOTs, [5] and outer joins [6]. I haven't worked out the
> notation for that in n3 but it should be feasible to port to cwm as
> well. As an example, the algae query
>        (OT::Orders.customer		?o	?c)
>        (OT::Orders.product		?o	?p)
>        (OT::Orders.orderDate	?o	?d)
>        (OT::Products.name		?p	?productName)
>        (OT::Customers.givenName	?c	?first)
>        (OT::Customers.familyName	?c	?last)
>        (OT::Customers.billingAddress ?c	?billAddr)
>        (OT::Addresses.contact	?billAddr ?biller)
>        (OT::Customers.givenName	?biller	?bFirst)
>        (OT::Customers.familyName	?biller	?bLast)
>        ~(OT::Orders.shippingAddress	?o	?shipAddr)
>        ~(OT::Addresses.contact	?shipAddr ?signer)
>        (OT::Customers.givenName	?signer	?sFirst)
>        (OT::Customers.familyName	?signer	?sLast)
> 
> results in the SQL query
>  SELECT Orders_0.id AS o_id,
>  Orders_0.orderDate AS d_orderDate,
>         Products_0.id AS p_id,
>  Products_0.name AS productName_name,
>         Customers_0.id AS c_id,
>  Customers_0.givenName AS first_givenName,
>  Customers_0.familyName AS last_familyName,
>         Addresses_0.id AS billAddr_id,
>  Addresses_0.street AS billStreet_street,
>  Addresses_0.city AS billCity_city,
>  Addresses_0.state AS billState_state,
>         Addresses_1.id AS shipAddr_id,
>  Addresses_1.street AS shipStreet_street,
>  Addresses_1.city AS shipCity_city,
>  Addresses_1.state AS shipState_state
>  FROM Orders AS Orders_0
>       INNER JOIN Customers AS Customers_0 ON 
> Orders_0.customer=Customers_0.id
>       INNER JOIN Products AS Products_0 ON 
> Orders_0.product=Products_0.id
>       INNER JOIN Addresses AS Addresses_0 ON 
> Customers_0.billingAddress=Addresses_0.id
>       LEFT OUTER JOIN Addresses AS Addresses_1 ON 
> (Orders_0.shippingAddress=Addresses_1.id)
> 
> and the data
> +------+----------+------------+------+----------+--------+--------+
> | first|      last| productName|bFirst|     bLast|  sFirst|   sLast|
> |------|----------|------------|------|----------|--------|--------|
> |"Chip"|"Thompson"| "nose ring"|"Biff"|"Thompson"|    NULL|    NULL|
> |"Chip"|"Thompson"|"other ring"|"Biff"|"Thompson"|"Eustis"|"Walker"|
> |"Biff"|"Thompson"|      "pool"|"Biff"|"Thompson"|    NULL|    NULL|
> |"Chip"|"Thompson"|"skateboard"|"Biff"|"Thompson"|    NULL|    NULL|
> +------+----------+------------+------+----------+--------+--------+
> 
> Another interesting feature is that both can trap under-constrained
> queries and throw an exception before executing such potentially
> expensive (and likely useless) queries. They can also do an
> overconstraint check and warn the certain tables are linked to others
> in more than one way, even if it is via some number of intermediate
> tables.
> 
> Clearly the largest contribution this technology has to offer,
> however, is the cool formatting of the SQL queries.
> 
> Next on the plate are some compelling example queries joining data
> from the OrderTracking database with some other database. I considered
> a product review database, but such services are more likely to be
> available over HTTP and SQL. (cwm and algae can join against data
> obtained from HTTP, but that was not so much the point of this
> exercise.) I would be interested in suggestion, particularly ones that
> would be interesting to the web services crowd.
> 
> Let me know what you think of this. Thanks.
> 
> [1] http://www.w3.org/2000/10/swap/doc/cwm.html
> [2] http://www.w3.org/1999/02/26-modules/
> [3] http://dev.w3.org/cvsweb/2000/10/swap/test/dbork/OrderTracking1.n3
> [4] 
> http://dev.w3.org/cvsweb/perl/modules/W3C/Rdf/test/OrderTracki
ng1-alg.sh 
[5] http://dev.w3.org/cvsweb/perl/modules/W3C/Rdf/test/SqlDBtest7-alg.sh
[6] http://dev.w3.org/cvsweb/perl/modules/W3C/Rdf/test/OrderTracking5-alg.sh

-- 
-eric

office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Thursday, 23 January 2003 05:00:34 UTC