- From: Guido Moerkotte <moerkotte@informatik.uni-mannheim.de>
- Date: Tue, 14 Oct 2003 12:49:53 +0200
- To: public-qt-comments@w3.org
- Cc: mrys@microsoft.com, moer@pi3.informatik.uni-mannheim.de
- Message-Id: <200310141249.58637.moerkotte@informatik.uni-mannheim.de>
Hello,

at VLDB in Berlin, Michael Rys convinced me to send another email to the XQuery comment list. Since I don't expect my email to have any influence, I won't bother writing down every point of XQuery that I think should be corrected. Instead I concentrate on the points I think are really bad. Here they are:

1) Runtime exceptions: A query language should not have runtime exceptions. This may not always be achievable, but at least all type errors should be discovered at compilation time. This is not true for XQuery.

2) A query language should be deterministic. This is not true for XQuery.

Essentially, these are the points why I write this email. Here is my motivating scenario:

In XQuery, "p and q" and "q and p" may give different results due to runtime exceptions. Why is this bad? Assume the following scenario: A big company develops a web site based on millions of XML documents. In all these documents, XQueries are embedded to give dynamic, up-to-date information. Millions of the embedded queries use conjunctions, and after thorough testing all queries work and the web site is put into operation. After several years, the XML document base grows and queries slow down. The sysadmin decides to gather new statistics and let the queries be reoptimized. The query optimizer decides that changing the order of conjunctions will result in better plans for about half a million queries. Unfortunately, due to the indeterminism, all queries crash. The web site is down for about three months, costing the company millions of dollars(!).

What could be your answer? -- The programmers should have used cascading "if-then-else" expressions. NO! For two reasons:

   1) It could well be that they are not aware of the complications involved in using "and" and "or" together with runtime exceptions. (Remember that testing went all right.)

   2) Using cascading "if-then-else" prevents any query optimization. The query is not declarative any more.

Now, what should be done to correct XQuery?
Many things:

1) Introduce NULL values and three-valued logic. (Remember that OQL had the same design flaw as XQuery, although not that bad, and that they introduced NULL values in later versions after having tried to correct things by introducing "andthen" and "orelse" (similar to "if-then-else").)

2) Don't let empty sequences partially play the role of NULL values. Remember:

   is_null(empty-sequence) is not true
   is_empty(NULL) is not true

   These things are too different to be identified.

3) Do not identify single items with singleton sequences that contain that single item. Even in the most flexible type systems of real and used programming/query languages they are distinguished.

Other points I don't like are:

1) Too much implicit casting.

2) No explicit grouping. (Grouping has to be expressed by nested queries. These are difficult to unnest; unnesting is not always possible and is an error-prone process due to its complexity.) This is also a mistake that was made by the OQL designers. (Not exactly the same, since they have an explicit grouping, but a nested query had to be used to work on the "partition" attribute. They subsequently corrected things halfway by introducing some syntactic sugar for common cases. But you wouldn't call that a perfect solution, nor would you call it elegant.)

best
guido

ps: Although XQuery has improved, some of the points of my first email are still valid.
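
To make the "p and q" versus "q and p" argument concrete, here is a minimal sketch in Python (not XQuery). Everything in it is an assumption made for illustration: the predicates in_stock and is_cheap, the sample item, and in particular the helper to_null, which maps a failed evaluation to NULL. The email proposes NULL values and three-valued logic but does not say how errors should relate to NULL, so read this as one possible interpretation, not as the proposal itself.

```python
from typing import Callable, Optional

Item = dict
Pred = Callable[[Item], bool]


def in_stock(item: Item) -> bool:
    # Hypothetical predicate that never fails on the sample data.
    return item["stock"] > 0


def is_cheap(item: Item) -> bool:
    # Hypothetical predicate that raises on bad data:
    # float("n/a") raises ValueError, standing in for a dynamic error.
    return float(item["price"]) < 10.0


def and_in_order(item: Item, first: Pred, second: Pred) -> bool:
    # Two-valued conjunction evaluated in the given order; the second
    # operand is skipped when the first one is already False.
    return first(item) and second(item)


def outcome(item: Item, first: Pred, second: Pred) -> str:
    # Report either the boolean result or the fact that an error occurred.
    try:
        return str(and_in_order(item, first, second))
    except ValueError:
        return "error"


def to_null(pred: Pred, item: Item) -> Optional[bool]:
    # ASSUMPTION: a failed evaluation becomes NULL (None) instead of raising.
    try:
        return pred(item)
    except (ValueError, KeyError, TypeError):
        return None


def and3(p: Optional[bool], q: Optional[bool]) -> Optional[bool]:
    # Three-valued (Kleene, SQL-style) AND: False wins, then NULL, else True.
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True


item = {"price": "n/a", "stock": 0}

# Two-valued logic with exceptions: the operand order changes the outcome.
order_a = outcome(item, in_stock, is_cheap)  # is_cheap is never evaluated
order_b = outcome(item, is_cheap, in_stock)  # is_cheap is evaluated first

# Three-valued logic: both operands are evaluated, order does not matter.
null_a = and3(to_null(in_stock, item), to_null(is_cheap, item))
null_b = and3(to_null(is_cheap, item), to_null(in_stock, item))
```

With the sample item, the first pair disagrees (one order yields a result, the other an error), which is the situation an optimizer creates when it reorders conjuncts. The three-valued and3 is symmetric in its arguments, so the reordering is harmless there.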
Received on Tuesday, 14 October 2003 06:53:51 UTC