Re: rdfs rdb vocabulary (again) from Ivan Mikhailov on 2008-05-27 (public-xg-rdb2rdf@w3.org from May 2008)

From: Ivan Mikhailov <imikhailov@openlinksw.com>
Date: Tue, 27 May 2008 12:43:34 +0700
To: Richard Cyganiak <richard@cyganiak.de>
Cc: public-xg-rdb2rdf <public-xg-rdb2rdf@w3.org>
Message-Id: <1211867014.13676.286.camel@master.iv.dev.null>

Hello Richard,


> 1) the validity and usefulness of the schema itself (or any such
> schema);
> I would insist that such a schema should be modelled after the SQL  
> standard rather than the relational algebra.

I totally agree with you. There are too many things that are interesting
for mapping but are outside the scope of relational algebra. You've
mentioned some column options; I'd extend the list with
enumerations, subtables, user-defined types, free-text indexing, access
control, and procedure views. Nevertheless, all that data is useless
without estimated or real statistics for the cost model. If the SPARQL
processor is built into the RDBMS in question, then there's no need for
a schema that just duplicates data from system tables. If the SPARQL
processor is in a separate process and the RDBMS access is limited, then
SQL code generation requires cost data.


> 2) possible use in automated SQL-SPARQL rewriting (2-way);
> 
> SPARQL-to-SQL over such a schema is straightforward, although I don't  
> believe this is a useful way to access relational data.

A schema without a detailed description of the mappings is useless.
We know two things from real use cases.
1. Straightforward mappings are no more than initial drafts of useful
mappings. Even if the initial intention is just "do something", the very
next wish is to map to FOAF, SIOC etc., because SPARQL is for data
aggregation.
2. It's impossible to compile a non-trivial SPARQL query into efficient
SQL without paying attention to every natural restriction of the
mapping.
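
Point 2 can be illustrated with a minimal Python sketch. Everything in
it is an assumption made for illustration and does not come from any
real mapping: the table "person", its integer key "id", and the IRI
prefix http://example.org/person/. The restriction being exploited is
that the IRI template is invertible, so a constant subject IRI can be
turned back into a key lookup instead of a string comparison over every
row.

```python
import re

# Hypothetical mapping, invented for this sketch: each row of table
# "person" becomes a subject IRI of the form <IRI_PREFIX><id>, where
# id is an integer primary key written without leading zeros.
IRI_PREFIX = "http://example.org/person/"
TABLE = "person"
KEY_COLUMN = "id"


def naive_sql(subject_iri, column):
    """Ignore the mapping's restrictions: rebuild each IRI and compare strings.

    The predicate wraps the key column in an expression, so an index on
    the key is unlikely to be usable.
    """
    sql = (f"SELECT {column} FROM {TABLE} "
           f"WHERE ? || CAST({KEY_COLUMN} AS VARCHAR) = ?")
    return sql, (IRI_PREFIX, subject_iri)


def restricted_sql(subject_iri, column):
    """Use the restriction that the IRI template is invertible on the key."""
    # Reject leading zeros: .../042 is a different IRI from .../42 and
    # would never be produced by the mapping.
    pattern = re.escape(IRI_PREFIX) + r"(0|[1-9][0-9]*)"
    match = re.fullmatch(pattern, subject_iri)
    if match is None:
        # The IRI cannot come from this mapping, so no SQL is needed.
        return None
    sql = f"SELECT {column} FROM {TABLE} WHERE {KEY_COLUMN} = ?"
    return sql, (int(match.group(1)),)
```

With these assumptions, restricted_sql("http://example.org/person/42",
"name") yields a plain lookup on the key, and an IRI with any other
prefix is answered without touching the database at all.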

> 3) possible use as an intermediate format between RDB and domain-specific ontologies.

Mappings can be stored as an RDF graph, but they should not be
authored that way. I'm absolutely sure that there should be a
human-friendly language for them. We do not create tables or stored
procedures by inserting rows into SYS_TABLES and SYS_PROCEDURES; we
manipulate them via a language. The language lets us check the
consistency of the user's input and provide meaningful error diagnostics.

The storage for the mapping doesn't really matter; it may even be
entirely processor-specific.
The high-level language is much more important.
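
As a rough sketch of what "check the consistency of the user's input and
provide meaningful error diagnostics" could look like, here is a small
Python checker. The one-line syntax "map <table>.<column> to
<prefix>:<local>", the schema, and the prefix list are all hypothetical
and invented for this example; this is not the syntax of Virtuoso or of
any other existing mapping language.

```python
# Hypothetical database schema and known vocabulary prefixes.
SCHEMA = {"person": {"id", "name", "email"}}
PREFIXES = {"foaf", "sioc"}

SYNTAX = "map <table>.<column> to <prefix>:<local>"


def check_mapping(text):
    """Return a list of diagnostics; an empty list means the input is consistent."""
    errors = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        words = line.split()
        if not words:
            continue  # skip blank lines
        if len(words) != 4 or words[0] != "map" or words[2] != "to":
            errors.append(f"line {lineno}: expected '{SYNTAX}'")
            continue
        table, _, column = words[1].partition(".")
        prefix, _, local = words[3].partition(":")
        # Check the relational side against the schema.
        if table not in SCHEMA:
            errors.append(f"line {lineno}: unknown table '{table}'")
        elif column not in SCHEMA[table]:
            errors.append(
                f"line {lineno}: table '{table}' has no column '{column}'")
        # Check the RDF side against the known prefixes.
        if prefix not in PREFIXES or not local:
            errors.append(
                f"line {lineno}: unknown or incomplete property '{words[3]}'")
    return errors
```

A row inserted directly into a system table would give none of this
feedback. Here the input "map person.age to foaf:age" is rejected with a
message that names the line, the table, and the missing column.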

> Well, you could use it as an intermediate format, and use a rules  
> language to map into a domain-specific ontology.
> 
> This requires that:
> 
> a) there is a rules language sufficiently expressive to deal with real- 
> world mapping scenarios;

Like one in Virtuoso ;)

>  and that
> b) evaluation of these rules can be efficiently pushed down into the  
> database engine as part of the query execution.

Indeed

> Another question is whether
> 
> c) it's preferable to develop mappings in a rules language, or in a  
> custom database-oriented language.

I insist on a rules language that is high-level enough and rich enough
to provide both interoperability and all the data required for
optimization (both cost model data and all known natural restrictions).

Best Regards,

Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com

Received on Tuesday, 27 May 2008 05:48:03 UTC