
Masters Thesis about XQuery vs SQL

From: Houman Khorasani <khorasani@web.de>
Date: Sat, 14 Aug 2004 17:24:04 -0500
To: <www-ql@w3.org>
Message-Id: <E1Bw6wd-0002ZX-00@smtp08.web.de>

Dear everyone,

My Master's thesis, "Performance Analysis of XQuery vs. SQL", is finished.
I thank everyone on the W3C mailing list who supported me.

Abstract:

Early in the history of XML, there was debate about whether XML is
sufficiently different from other data formats to require a query language
of its own, since SQL was already a very well-established standard for
retrieving information from relational databases.

But there were some differences that justified a new query language for XML
data.  Relational data is 'flat': it is organized as a two-dimensional
array of rows and columns.  XML data is 'nested', and its depth of nesting
can be irregular.  On the one hand, relational databases can represent
nested data structures by using tables with foreign keys, but it is still
difficult to search these structures for objects at an unknown depth of
nesting.

On the other hand, in XML it is very natural to search for objects whose
position in a document is unknown.  These and other differences convinced
the W3C working group to design a new XML query language with a more
efficient semantic definition rather than extend a relational language.
It would therefore be useful to focus research on building an analysis tool
that compares nested (XML) databases and relational databases, showing the
performance and scalability of both types of queries with respect to data
volume and complexity.
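
To illustrate the contrast drawn above, here is a minimal sketch in Python
(not taken from the thesis or from PerfanX).  The parts hierarchy, the table
layout and all function names are hypothetical, and SQLite stands in for a
relational database that supports recursive queries.  The relational version
has to follow foreign keys level by level, while the XML version simply walks
the descendants of a node.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same irregularly nested hierarchy, twice.
XML_DOC = """
<part name="car">
  <part name="engine">
    <part name="piston">
      <part name="ring"/>
    </part>
  </part>
  <part name="wheel"/>
</part>
"""

ROWS = [  # (id, name, parent_id): nesting expressed through a foreign key
    (1, "car", None),
    (2, "engine", 1),
    (3, "piston", 2),
    (4, "ring", 3),
    (5, "wheel", 1),
]


def build_db():
    """Create an in-memory table whose parent_id column refers to part.id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT, "
        "parent_id INTEGER REFERENCES part(id))"
    )
    conn.executemany("INSERT INTO part VALUES (?, ?, ?)", ROWS)
    return conn


def descendants_sql(conn, name):
    """All parts below `name`, at any depth, via a recursive query."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(id, name) AS (
            SELECT id, name FROM part
            WHERE parent_id = (SELECT id FROM part WHERE name = ?)
            UNION ALL
            SELECT p.id, p.name FROM part p JOIN sub s ON p.parent_id = s.id
        )
        SELECT name FROM sub ORDER BY name
        """,
        (name,),
    ).fetchall()
    return [row[0] for row in rows]


def descendants_xml(doc, name):
    """All parts below `name`, at any depth, by walking the XML tree."""
    root = ET.fromstring(doc)
    start = root.find(f".//part[@name='{name}']")  # match at unknown depth
    if start is None:
        return []
    # iter() also yields the start element itself, so skip it.
    return sorted(p.get("name") for p in start.iter("part") if p is not start)


if __name__ == "__main__":
    print(descendants_sql(build_db(), "engine"))
    print(descendants_xml(XML_DOC, "engine"))
```

Without a recursive query, the relational version would need one self-join
per level of nesting, which requires knowing the depth in advance.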

The "Group By" clause in SQL is not easily reproduced with the features of
XQuery.  The thesis shows ways to achieve similar functionality and
demonstrates that the current XQuery implementations are either
non-compliant with the XQuery standard or much slower than SQL when
performing "Group By"-equivalent requests.
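
As a rough illustration of the "Group By" point (again a sketch, not code
from the thesis), the Python below computes per-customer totals twice: once
with SQL's GROUP BY through SQLite, and once by mimicking a common XQuery
idiom that iterates over distinct-values() and re-filters the data for each
value.  The order data, element names and function names are hypothetical,
and the XQuery expression in the comment is only one possible formulation.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data: (customer, amount) pairs.
ORDERS = [("alice", 10), ("bob", 5), ("alice", 7), ("bob", 1)]


def totals_sql(orders):
    """Per-customer totals using SQL's built-in GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    conn.close()
    return dict(rows)


def totals_xquery_style(orders):
    """Per-customer totals, mimicking a distinct-values() grouping idiom.

    Roughly corresponds to an XQuery expression such as:

        for $c in distinct-values(//order/@customer)
        return <total customer="{$c}">
                 { sum(//order[@customer = $c]/@amount) }
               </total>
    """
    root = ET.Element("orders")
    for customer, amount in orders:
        ET.SubElement(root, "order", customer=customer, amount=str(amount))

    # Step 1: collect the distinct grouping keys.
    customers = sorted({o.get("customer") for o in root.iter("order")})
    # Step 2: for every key, scan all orders again and sum the matches.
    return {
        c: sum(
            int(o.get("amount"))
            for o in root.iter("order")
            if o.get("customer") == c
        )
        for c in customers
    }


if __name__ == "__main__":
    print(totals_sql(ORDERS))
    print(totals_xquery_style(ORDERS))
```

In this naive form the second version rescans the data once per distinct
key, whereas GROUP BY can be evaluated in a single pass; whether a given
XQuery engine optimizes such a pattern depends on the implementation.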

The direct results of this thesis are an open-source tool, PerfanX, and the
performance statistics gathered from a variety of queries run under that
tool.  The tool uncovered several counter-intuitive performance
implications.  The study also exposed several bugs in XQuery
implementations, which the author has reported to the development groups.

You can download the thesis from my web site:
http://www.khorasani.net

Or directly from here:
http://s95064020.onlinehome.us/masters/Master%20Thesis.pdf


Best regards,
Houman M. Khorasani
Received on Saturday, 14 August 2004 22:24:43 UTC
