- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Wed, 1 Jun 2011 18:00:16 +0100
- To: Eric Prud'hommeaux <eric@w3.org>
- Cc: David McNeil <dmcneil@revelytix.com>, W3C RDB2RDF <public-rdb2rdf-wg@w3.org>
Eric, An R2RML implementation needs to connect to a database to run queries. It cannot connect to a database unless it already knows what kind of RDBMS it is. All the examples you give below can be solved by simply using that knowledge to choose the appropriate query parser or build the appropriate query. Identifying the dialect in the mapping document is not necessary for any of these examples. Best, Richard On 1 Jun 2011, at 17:11, Eric Prud'hommeaux wrote: > * Richard Cyganiak <richard@cyganiak.de> [2011-05-31 21:57+0100] >> On 31 May 2011, at 20:46, David McNeil wrote: >>> Although my fear is that most real-world mapping will have some vendor specific SQL in them >> >> I absolutely agree. >> >>> and therefore most real-world mappings will have "no particular behavior" defined. >> >> Well, I see what you mean, but I don't see a good alternative. >> >> I wouldn't want to specify the behaviour of an R2RML engine where all SQL-carrying properties (rr:SQLQuery, rr:table, rr:tableOwner, rr:column, rr:template and so on) can contain opaque blobs. If anyone asked us to nail down the spec, we'd quickly have to give up. >> >> Implementer: “Ok, I'm trying to be conformant. So, what is the semantics of rr:SQLQuery if it's flagged as vendor-specific?” >> >> Me: “You treat it as opaque and pass it to the RDBMS you're connected to. We assume that you get back a tabular result set.” > > I have to disagree at this early stage. I can see a bunch of ways that > knowing which language appeared in the ""s will improve performance > and user experience: > > R2RML: > _:someTriplesMap > rr:languageId <http://oracle.example/OracleSQL10> ; > rr:SQLQuery """SELECT…('http://example.com/emp/job/'|| "job") AS "jobTypeURI"…""" . > > SPARQL Query: > SELECT ?job WHERE { ?emp <EMP#jobtype> ?jt } LIMIT 5 > > A naive implementation will locally materialize the entire table and > apply the limit to the results. The more intimate implementation will > push the limit down: > > Oracle SQL Query: > SELECT ('http://example.com/emp/job/'|| "job") AS "jt" > WHERE row_num <= 5 > > The same program may be designed to work with multiple databases, > necessarily generating a different queries to handle vender differences: > > R2RML: > _:someTriplesMap > rr:languageId <http://oracle.example/MySQL6> ; > rr:SQLQuery """SELECT…('http://example.com/emp/job/'|| "job") AS "jobTypeURI"…""" . > > MySQL SQL Query: > SELECT CONCAT('http://example.com/emp/job/', "job") AS "jt" > LIMIT 5 > > > The above example avoids shipping a full column of the table. We can > also avoid table scans in the case of reversable functions over > fields. > > SPARQL Query: > SELECT ?emp WHERE { ?emp <EMP#jobtype> <http://example.com/emp/job/CLERK> } > > If we correlate the functions in the SQL with their inverses, we can > trivially make use of indexes. In this case, the inverse maps the > substring from 27 to the end of the string, i.e. "CLERK", to job: > > SQL Query: > SELECT CONCAT('http://example.com/emp/', "empno") AS "jt" > WHERE job = "CLERK" > > Having a language identifier also allows the R2RML tool to parse and > locally execute the rr:inverseExpression. > > >> Implementer: “What about rr:column?” >> >> Me: “It still refers to a column name in the result, like before.” >> >> Implementer: “What characters can rr:column contain?” >> >> Me: “That depends on the SQL dialect. It must be a column that occurs in the query result.” >> >> Implementer: “Is it case-sensitive?” >> >> Me: “Again, that may depend on the SQL dialect.” >> >> Implementer: “So, if the query result has a column with different case, is that an error, or do I use that column's value to generate a triple?” >> >> Me: “It depends on the dialect.” >> >> Implementer: “But I have to implement that one way or the other.” >> >> Me: “You'll have to check the documentation of the RDBMS engine you're connected to, and implement it accordingly. That's the price you pay for supporting other SQL dialects.” >> >> Implementer: “So you mean your so-called specification does not specify it?” >> >> Me: “... sigh ...” >> >> >>> That seems a bit startling for a spec, but maybe I could get used to it :) >> >> The thing is, implementers will know what to do anyways, it's just impossible for us to write it down normatively in the spec. We could add some handwaving text that explains the intuition of handling vendor-specific SQL in the appropriate way, but I don't see how that's any better than just saying, “We define R2RML for SQL 2008 Core, full stop. You use another dialect, it's up to you to figure out the details.” >> >> Can you think of some words that work better for you than the proposed ones? >> >> Best, >> Richard > > -- > -ericP >
Received on Wednesday, 1 June 2011 17:00:46 UTC