Intro, Ivan Mikhailov

Hello,

I'm Ivan Mikhailov, the developer of the RDF and SPARQL functionality of
OpenLink Virtuoso.

For the last 15 years I have been writing various parsers, interpreters and
compilers. That's probably my fate. I have participated in CAD, AI and DBMS
projects that have nothing in common with each other, but every one of them
required at least one translator of some sort.

I wrote two optimising compilers for REFAL-2 (with muLISP- and C-based
runtime engines), a compiler from annotated C++ to an annotated subset of C
that was the front-end part of the C++ verification project SPECTRUM, a
translator and a runtime for a dialect of JavaScript with RDBMS-specific
extensions, and a few smaller things.

When I joined OpenLink, I implemented an XML parser/validator, most of the
XPath, XQuery and XSLT processors with relational-to-XML data mapping, part
of the free-text search, the SPARQL-to-SQL front-end compiler and the
relational-to-RDF data mapping.

I'll try to keep SPARQL:
1) convenient for RDBMS-based implementations;
2) attractive for small start-ups;
3) protected from incompatibilities between versions.

1.
I'm sure that the most important advantage of SPARQL is that it can be used
as a front end for an efficient relational query language. RDF is good
because it reflects the nature of human memory: it describes named things
that have some types and some named properties, and the whole body of
knowledge tends to form a big non-uniform graph. On the other hand,
relational algebra offers fundamentally faster algorithms for the uniform
parts of this knowledge. No matter how we optimize the graph representation,
a stable four-tape sort is faster than a topological sort, binary search is
faster than skip lists, and so on. So big systems must use relational data
representations, and what is visible as one uniform RDF storage should
really consist of a set of relational storages plus a graph that holds all
the 'irregular' objects and properties.
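
To make the 'front end' idea concrete, here is a small sketch (the PERSONS
table, its columns and the mapping are invented for illustration; real
compilers and mappings will differ). A client writes plain SPARQL:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?mbox
    WHERE { ?person foaf:name ?name ;
                    foaf:mbox ?mbox . }

If the mapping declares that rows of the hypothetical PERSONS table stand
for foaf:Person resources, a SPARQL-to-SQL compiler could translate the
query into ordinary relational algebra and let the RDBMS optimizer do its
work:

    SELECT p.NAME, p.MBOX
      FROM PERSONS p
     WHERE p.NAME IS NOT NULL AND p.MBOX IS NOT NULL;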

2.
SQL eliminated the need to patch numerous data access procedures after
every minor change in the low-level structure of data storage. Similarly,
SPARQL should eliminate the need to patch the texts of numerous SQL queries
after every minor change in the schema of the storage. This cuts the related
costs and makes SPARQL attractive for business.
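
Continuing the hypothetical example above: if the mail addresses were later
moved out of PERSONS into a separate (again invented) E_ADDRESSES table,
only the mapping would be updated; the SPARQL query text stays untouched,
and the compiler simply starts producing a join:

    SELECT p.NAME, a.MBOX
      FROM PERSONS p
      JOIN E_ADDRESSES a ON a.PERSON_ID = p.ID
     WHERE p.NAME IS NOT NULL;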

RDF encourages developers to remember graph algorithms and classic AI data
structures. SPARQL may become a bridge between Datalog technologies for
relational data and graph-based inference rules, and RDF+SPARQL will provide
strong industrial support for AI research. Reusing the widespread DBMS
infrastructure for AI projects will let developers focus on AI rather than
on data access routines; this makes SPARQL attractive for scientists.
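
To make the bridge concrete, a small sketch (the ex: properties and the
rule are invented for illustration): a Datalog-style rule over relational
facts, and a SPARQL CONSTRUCT that expresses the same inference over a
graph.

    # Datalog-style rule:  uncleOf(X, Y) :- parentOf(P, Y), brotherOf(X, P).
    # The same inference as a SPARQL CONSTRUCT, assuming hypothetical
    # ex:parentOf, ex:brotherOf and ex:uncleOf properties:
    PREFIX ex: <http://example.org/>
    CONSTRUCT { ?x ex:uncleOf ?y }
    WHERE { ?p ex:parentOf ?y .
            ?x ex:brotherOf ?p . }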

My objective is to try to keep SPARQL attractive for _both_ scientific and
business applications. The reason is that almost any start-up requires one
man with an idea and one man with money. The technology we develop must
attract both in order to produce a growing number of interesting
applications.

3.
SPARQL should be prepared for a long lifetime. SPARQL services should be
self-documenting, and the SPARQL query language should provide a way of
declaring application- and implementation-specific pragmas, macro expansions
and environment variables, so that SPARQL clients and services stay
interoperable to the maximum possible extent. SPARQL should be convenient
for inlining into other languages, such as SQL and JavaScript, but it should
not be tightly coupled to any implementation language, e.g. Java, because
languages rise and fall in popularity while SPARQL should stay usable.
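
As an illustration of the inlining point, one conceivable host-language
convention (shown here only as a sketch, not as a proposal or a description
of any particular product) is to let a leading keyword introduce SPARQL text
wherever the host SQL allows a SELECT:

    SPARQL
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name
    WHERE { ?person foaf:name ?name }
    LIMIT 10 ;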

Best Regards,
IvAn Mikhailov
