- From: Michael Rys <mrys@microsoft.com>
- Date: Wed, 7 Feb 2001 09:13:47 -0800
- To: Michael Brundage <michaelb@microsoft.com>
- Cc: "'www-xml-query-comments@w3.org'" <www-xml-query-comments@w3.org>
Dear Michael, please receive our comments on your feedback. Feel free to
provide us with further feedback, especially if you think that a resolution
and changes incorporated into the latest draft are not sufficient and where
we kindly ask you for some more information (e.g., issue #19).
I added some remarks to Mary's remarks (Mary please note) in <MR> tags.
Best regards
Michael
-----Original Message-----
From: Mary F. Fernandez [mailto:mff@research.att.com]
Sent: Friday, January 19, 2001 8:47 AM
To: Michael Rys
Cc: xmlquery
Subject: Michael Brundage's comments on Algebra
Hi Michael,
Please find below my responses to Michael Brundage's comments.
Issues are groups by subject and resolutions are tagged <Resolution>.
My remarks are tagged with <MF>. Michael Rys' in <MR>. Please
review, annotate, and respond to Michael Brundage and cc:
w3c-xml-query-comments@w3c.org.
thx
mff
I. Issues related to lexical representation of literals:
========================================================
<Resolution>
Added [Issue-0081: Lexical representation of Schema simple types]
</Resolution>
Issue #41: no boolean literals ( Section 3.6: Expressions, paragraph
2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, core) There are no literal expressions for the
boolean constants true and false.
Issue #43: string lexical representation not defined ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) The lexical representation of strings is not
defined. Please do not use the XPath definition (which is
pathetic). Use a definition that allows both single- and
double-quoted strings, and also allows escape sequences. I
implemented string := '"' (''' | escape | ~['"\])* '"' | ''' ('"' |
escape | ~['"\])* ''' escape := '\' any-character ; (the processed
escapes are: \t, \n, \r, \\, \" and \' )
Issue #44: number lexical representation not defined ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) The lexical representation of numbers is not
defined. I implemented number := ('+' | '-')? digit+ ('.' digit+)?
digit := '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
Issue #45: other lexical issues ( Section 3.6: Expressions,
paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Whitespace is not defined. I implemented whitespace
:= ' ' | '\t' | '\r' | '\n' Except for whitespace in string literals
(which is of course preserved), whitespace is completely ignored.
Issue #55: scalar literals not defined ( Section 3.6: Expressions,
paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Lexical representations for other scalar types
(IDREFS, NMTOKENS, date, time, etc.) are not defined. How are scalar
values of these types to be represented? My personal preference is
cast 'a1 a2' as IdrefsScalar
Issue #31: Constants not defined ( Section 3.6: Expressions,
paragraph 3
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, syntax) Constants are not defined. What is the syntax of
numeric constants? String constants? Boolean constants? Others?
Also, this is somewhat odd terminology, since for example the empty
expression () is a constant.
II. Issues addressed by addition of operational semantics
=========================================================
<Resolution>
Unless otherwise noted, these issues are resolved by the
addition of Section 5 Dynamic/Operational Semantics.
</Resolution>
Issue #11: comma operator ( Section 2.1: Data and types, paragraph
10
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, core) The ampersand operator is defined, but the
comma operator is not. 2.2 Projection
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
Issue #17: for is not a forest ( Section 2.4: Iteration, paragraph 7
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) This section says that the value of a for expression
is a forest. However, section 2.1 defines a forest to be a sequence
of (zero or more) attributes and elements. Clearly the result of the
expression for b in bib0/book do b/@year/data() fails to meet this
definition. Please clarify the meaning of for.
<MR>
In my opinion, section 2.1 should defer to the datamodel
definition of a forest which can be a sequence of value nodes
</MR>
<MF>
Clarified in text and referred to data model & operational
semantics.
</MF>
Issue #26: empty() ( Section 2.6: Quantification, paragraph 5
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, core) What is the result of empty( (), () ) ? In
general, when do operators and functions flatten or unnest their
arguments?
<MF>
Clarified in data model and in operational semantics.
</MF>
Issue #25: set operations are not defined ( Section 2.5: Selection,
paragraph 9
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) What is the result of the expression
for b in bib0 do
where b/book/@year/data()<=2000 do 1?
Define the meaning for
operators like less-than-equals on sets, forests, etc. 2.6
Quantification
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
Defined in data model and in operational semantics.
</MF>
Issue #52: automatic type coercions are not defined ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Under what circumstances is this expression valid or
invalid?
fun some_function ( a : Type1) : Type2 =
match a case x : Type3 do x else error
<MF>
Operational semantics of match defines precisely.
</MF>
Issue #53: order of evaluation is not defined ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Order of evaluation of expressions is not defined
(this is related to operator precedence, which is also not
defined). Do and and or shortcut? Is (a, b, c) equivalent to ((a,
b), c) or (a, (b, c))?
<MF>
Defined in operational semantics.
</MF>
III. Operational semantics of match
===================================
<Resolution>
Unless otherwise noted, these issues are resolved by the
addition in Section 5 Dynamic/Operational Semantics of semantics of
match.
Added [Issue-0083: Expressive power and complexity of match
expression]
Additional comments follow.
</Resolution>
Issue #24: match semantics ( Section 2.14: Structural recursion,
paragraph 11
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, type system) How does match handle redundant cases?
The description of match seems to imply that it is equivalent to a
large if then else if then else if ... else block instead of a
switch (because it takes the first matching branch). This will be
terrible for performance. Also, the case statements appear to accept
any type, including types not known at compile time (such as type
variables or expressions involving type variables). Consequently,
match becomes a run-time concept (in which case, I have to wonder
what value it brings to your type system.) Again, this will be
absolutely terrible for performance. 2.15 Functions for all
well-formed documents
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
The values of all type expressions are known at compile time <MR>I assume
that you mean query-analysis time.</MR>,
even the values of type variables, which are defined statically and
are immutable. The static semantics of match determines the most
precise type for the expression. The operational semantics defines
precisely the run-time semantics of match, which compares the
run-time type of the value to match and evaluates the case rules in
order: the value of the first case rule that matches is returned.
This is the simplest, *operational* semantics; it is not necessarily
the most efficient implementation of match. It is possible to
implement match very efficiently. For example, pattern matching in
ML is implemented efficiently. We can short circuit the run-time
evaluation of case rules whose static type are guaranteed not to
match the match value.
In addition, the user-level syntax in XQuery may choose to provide a
restricted form of match to the user, i.e., one that does not permit
arbitrary type expressions in the case rules.
Consider the example :
type exType = String | b [ Integer ]
let v = b [ 10 ] : exType do
match v
case v2 : String do v2
case v3 : b [ Integer ] do v3
else a [ Float ]
: String | b [ Integer ]
==> b [ 10 ]
Note that the *type* of the match expression is String | b [ Integer
],
because at *query-analysis time*, the type of v is String | b [
Integer ]
even though its value has type b [ String ].
Even though the else clause returns a
value with type a [ Float ], the static type rules eliminate this
possibility because it can never match String | b [ Integer ]. At
query evaluation time, the dynamic type of v is b [ Integer ],
therefore at run time, it matches the second case rule and returns
the value b [ 10 ].
</MF>
Issue #49: match/case is very expensive ( Section 2.15: Functions
for all well-formed documents, paragraph 12
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) The typeguard is very expensive. The fourth example
in this section, html_of_xml, says that the first branch of the
match checks whether the (run-time) value of x is a subtype of
AnyScalar, and if not then the second branch checks to see if x is a
subtype of AnyAttribute, and so on. In the worst case, this kind of
type-checking requires traversing a structure as large as the
instance of x. Consider for example:
fun not_fun ( x : AnyComplex ) : Integer =
match x
case t1 : AnyTree do 1
case t2 : AnyTree{0, 2} do 2
case t3 : * [ * [ * [ * [ * [ * [ AnyAttribute ] ] ] ] ] ] do 3
else error 3
Algebra Components
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
3.4 Types : abstract syntax
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
Regarding "In the worst case, this kind of
type-checking requires traversing a structure as large as the
instance of x."
No, checking type subsumption and intersection requires traversing a
structure as large as the *type* of x, not of the instance of x
itself. All values carry a representation of their type. Call the
runtime type of x, T. In the above example, there will be three type
tests: (1) T /\ AnyTree != 0 (and if that fails then...) (2) T /\
AnyTree{0,2} != 0 (and if that fails then...) (3) T /\ * [ * [ * [
* [ * [ * [ AnyAttribute ] ] ] ] ] ].
The operational semantics of match should make this clear.
<MR>I think Michael's worst case is based on the fact that you may not
know the type of x until you have traversed the instance (e.g., in streaming
applications). Michael, is this correct?</MR>
</MF>
IV. Issues addressed by definition of operators by joint
XSLT/Schema/Query task force on operators
=========================================================
<Resolution>
The operational semantics defines the relationship of the Algebra to
the Data model. Other open issues are cleared identified in
Introduction:
[Issue-0018: Align algebra types with schema]: The Algebra's type
system is not yet aligned with XML Schema.
[Issue-0056: Operators on Simple Types]: A joint XSLT/Schema/Query
task force is chartered to define the operators on Schema simple types.
The Algebra will adopt the operators defined by that group.
[Issue-0081: Lexical representation of Schema simple types]: The
lexical representation of Schema simple types must be defined for the
Algebra and XQuery.
</Resolution>
General comments: The currently proposed XML Query Algebra fails to meet
even its own requirements; specifically, it does not use the XML Query
Data
Model or XSD type system. Essential language issues such as operator
semantics, operator precedence and variable scope are entirely ignored.
V. Issues related to alignment of XML Query Algebra's type system
with XML Schema
===================================================================
<Resolution>
Alignment with Schema is open issue:
[Issue-0018: Align algebra types with schema]: The Algebra's type
system is not yet aligned with XML Schema.
</Resolution>
Issue #1: type system ( Section 2.1: Data and types, paragraph 6
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, type system) The XML Query Algebra does not use the XSD
type system, but instead uses its own homegrown type system (derived
from the XDuce research project). This violates the requirements of
the XML Query Language. Based on my experience with prototype
implementations of the algebra, I claim that this type system is not
necessary for an efficient implementation of the
algebra. Furthermore, I will now provide several demonstrations that
this type system is both ill-defined and incompatible with the type
system of XSD. In section 2.4, an example is given that constructs
a new book element with different content than the book type already
present in the type system. Because the XML Query type system
differentiates types by name, a query cannot simultaneously use both
book types. Moreover, there are issues in serializing out the
results of such a query with its schema, because of the name
collision. (You could try to work around this problem by
auto-generating unique type names in the schema, but then you will
not be able to roundtrip. You could work around this problem by
creating a namespace for XML Query annotations in schemas, and
adding annotations to the appinfo element in the XSD (or possibly
attributes on the types themselves) that describes how to restore
the original type names. But yuck, this is a terrible solution.)
Also, not all types have a single name. Consider the simple example:
for b in bib0/book do
b/author, b/title
This algebraic expression is certainly well-formed, returning an
expression with type author [ String ], title [ String ] . However,
because this is a sequence type, it has no single (top-level
element) name. Suppose we assign this expression to the variable
v. Then what is the result of
match v
case v1 : * do "case 1"
case v2 : *, * do "case 2"
case v3 : *, *, * do "case 3"
case v4 : author [ String ], title [ String ]
do "case 4"
else error
Does the first case match? Or the second? Or does only the fourth
case match? (to be completed)
If you're going to use the XSD type system, use it exclusively. If
the XSD type system isn't up to the challenge, get the schema
working group to fix it while you still can. And finally, if all
else fails, add or change only those things absolutely necessary to
implement the XML Query Language. Don't create yet another type
system that implementors must support.
<MR>
Our proposal is to use the XSD type system as implied by XSD and
cooperate on the formalisation based on MSL.We see the type-related
comments in this feedback also as potential input to the type
formalisation work.
</MR>
Issue #23: so how do types work, really? ( Section 2.4: Iteration,
paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, type system) The return type of the first iteration
is fully expanded into book[ author [ String ] { 1, *}, title [
String ] ]{0, *} but in section 2.5 the return type is collapsed
into Book{0,*}. How does the algebra system know which return type
to use? It seems to me that in every heuristic imaginable, Book{0,*}
is in-between the fully-expanded type and UrTree. Your response is
probably "it's Book because we declared it that way," in which case
my follow-up question goes something like this: Consider a
semi-redundant type definition such as type
Normal = (a [String ] | String | UrElement | UrTree) {0, *}
and an instance with this type
let n : Normal = anything, it doesn't matter. Then what is the
compile-time type of the expression
match n
case s : UrScalar do s
case n : UrElement do n/*
else n?
If this example is not quite
legal or is too trivial, tell me and I will come up with a better
example that still demonstrates the same point. 2.5 Selection
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
VI. Miscellaneous editorial issues
===================================
0 Status
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
Issue #3: Redmond, not Redmont ( Section 0: Status, paragraph 7
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
low priority, editorial) Redmond is misspelled Issue #4: accents (
<MF>Fixed.</MF>
Section 0: Status, paragraph 11
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
low priority, editorial) These names are misspelled; they should
contain accented characters (Fernandez, Simeon).
<MF>Fixed.</MF>
Issue #13: compile-time vs. run-time ( Section 2.2: Projection,
paragraph 8
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) You cannot use the words compile-time and run-time
(which have important significance for implementations) without
first defining them. Furthermore, this is a different description
than given previously for duplicates (where the distinction was made
between the type of the expression, and the type of its value --
which I guess you're trying to say are compile-time and run-time,
respectively?). Also, the introduction of the notions compile-time
and run-time has the appearance of constraining compliant
implementations to compile first and then run, instead of executing
XML queries in an interpreted way.
<MF>
Changed "compile-time" to "query-analysis time" and "run-time" to
"query-evaluation time" with clarification in text. Also added
[Issue-0084: Execution model].
</MF>
Issue #2: meaning of duplicate ( Section 2.2: Projection, paragraph
3
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, editorial) I strongly object to the use of the word
"duplicate" in this sentence. Duplicate has a precise meaning in
XPath, which is different from its use here. An example of duplicate
elements is the expression bib0/book/author, bib0/book/author which
selects all the author elements twice. "Suciu", "Suciu" is an
example of duplicate values, but not necessarily duplicate nodes.
Changing the wording from "duplicate elements" to "duplicate values"
is marginally better; my preference is to omit this phrase
altogether (it adds nothing to the explanation of projection anyway,
especially since the example already amply demonstrates the point,
and therefore this phrase can only confuse the reader).
<MF>Fixed</MF>
Issue #12: projection is simple? ( Section 2.2: Projection,
paragraph 1
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, editorial) Since when is projection simpler than,
say, addition? Also, this is an unusual use of the word simple,
since you later decompose projection into other (more atomic, and
therefore more simple) operations.
<MF>"The simplest operation" => "A basic operation..." </MF>
Issue #5: prior work ( Section 1: Introduction, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
low priority, editorial) The prior work list should reference
XSL(T). Did tree grammars not influence the committee at all? Where
is the influence of NRA on this algebra? I cannot detect it in this
spec.
<MF>XSLT citation added. See "Laws" for reference to NRAs and
comprehensions.</MF>
Issue #6: hand waving ( Section 1: Introduction, paragraph 3
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, editorial) The claim that the algebra can capture
many XML query languages is not substantiated. Which query
languages, specifically, does it support? Only many (not all)? If
so, which ones are not supported and why?
<MF>Fixed.</MF>
Issue #7: hand waving again ( Section 1: Introduction, paragraph 4
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, editorial) The claim that types can detect certain
kinds of errors is not substantiated. Which errors, specifically,
does this type system detect? 2 The Algebra by Example
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>Fixed in intro.</MF>
Issue #19: fst, snd are superfluous ( Section 2.9: Querying order,
paragraph 3
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, core) The fst, snd elements in each pair are
superfluous (arguably, so is pair). That is, instead of
index(book0/author) returning pair [ fst [ 1 ], snd [ author [
"Abiteboul" ] ] ], ... as described in the spec (section 2.9,
Querying Order), I suggested that it should return 1, author [
"Abiteboul" ], ...
I believe you responded that the reason it doesn't do this is
because of typing. In fact, the spec seems to get the compile-time
type of this expression wrong. It says that the type of
index(book0/author) is
pair [ fst [ Integer ] , snd [ author [ String ] ] ] {1, *}
but I believe it should be
pair [ fst [ Integer ] , snd [ ??? ] {0, *}
(where ??? is replaced with something like UrTree). The type of the
argument to index() cannot be reflected in the (compile-time) type of
its
output. In that case, index() might as well return an expression
with type
(Integer, ???) { 0, *}
And of course, if the spec is referring to the runtime-type and
therefore is
correct, then index() should still return
( Integer, author [ String ] ) {1, *}
Either way, the pair/fst/snd elements are superfluous.
Also, the spec does not say whether this function is deterministic,
and if
so, whether the integers returned must be a 1-based index of the
elements in
the set.
2.12 Expanded names
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
The function index is builtin and therefore the Algebra can provide a
more precise type rule that gives the most specific type for :
index(book0/author)
which is:
pair [ fst [ Integer ] , snd [ author [ String ] ] ] {1, *}
Typically, when querying the local order of elements in a forest, the
goal is to return, for example, the first element or some range of
elements.
Using the pair/fst/snd wrapper elements, it is trivial to return
the value associated with a particular index or range.
For example, to choose the first three authors:
for v in index(book0/author) do
where v/fst/data() >= 1 && v/fst/data() <= 3 do v/snd/*
Given your formulation, it is much more difficult to access the
original value. If you can explain how you would do so if
index returns a value with type (Integer, author[String]){0,*}, that
would be helpful.
</MF>
Issue #57: variables are read-only ( Section 2.6: Quantification,
paragraph 9
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Local variables are read-only. Once a value has been
assigned to a variable introduced with let, there is no grammatical
expression that reassigns it. If the meaning of let is altered to
say that it "introduces or reassigns a value," then how does one
reassign a global variable (versus just declaring a local variable
that hides the global one while the local is in scope)? <MR>This
comment shows that we have to make clear that these are not
variables in the procedural programming language sense</MR>
<MF>
The Algebra is a functional language and does not provide an
assignment operator. Semantics of the let expressions is clarified
in text and in operational semantics.
</MF>
Issue #22: non-XPath compatible wildcard semantics ( Section 2.12:
Expanded names, paragraph 11
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Making e/a shorthand for e/{*}a is a big
mistake. You're using the same syntax as XPath but with a markedly
different semantics. The expression e/a should have the same meaning
it has in XPath, which is e/{Null}a. (XPath Spec, section 2.3,
paragraph 3: "... if the QName does not have a prefix, then the
namespace URI is null".)
<MF>
Good point. Fixed.
</MF>
Issue #56: path order different from XPath ( Section 4.1: Relating
projection to iteration, paragraph 1
<http://www.w3.org/XML/Group/2000/11/xmlquery-algebra20001204.html>;
low priority, core) In XPath, the set value of a path is always
ordered in document order or reverse-document order. In the XML
Query Language, set order is arbitrary. Consequently, the meaning of
the expression e/a can differ from its meaning in XPath. This will
be especially troublesome when you add reverse axes; conversely,
correctly implementing the reverse axes on top of the XML Query
Language may be very difficult.
<MF>
Although the Algebra and XPath both have a '/' operator, it clearly
does not have the same semantics in each language nor was the
intention for both of them to have the same semantics. A joint
XSLT/Query task force is defining the semantics of XPath 2.0 in
terms of the Algebra. This effort will make the semantics of the
XPath '/' operator precise.
</MF>
Issue #32: wrong not equal operator ( Section 2.6: Quantification,
paragraph 7
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, code example/sample) The third and fourth examples
incorrectly use operator <> instead of operator !=.
<MF>Fixed.</MF>
Issue #33: common idioms obfuscated ( Section 2.6: Quantification,
paragraph 1
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, core) This algebra obfuscates common idioms. For
example, the XPath x/y[@z=1] is a very common query. In this
algebra, this query becomes for b in x/y do where b/@z/data()=1 do b
when it's really select x/y where data(@z) = 1.
<MF>
This comment is unclear. The purpose of the Algebra is *not*
to define a concise syntax for common idioms in XPath nor for common
idioms in any query language. Its purpose is to define precisely
operators that can be used to define the expressive semantics of
very concise XPath or XQuery expressions. The Algebra
is *not* the user-level language: XQuery serves that role.
</MF>
Issue #58: variable scoping is not defined ( Section 2.6:
Quantification, paragraph 9
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) The scope of local variables is not defined. Is the
expression let v = e do let v2 = v do let v = e2 do v2 legal, and if
so, what is its value?
<MF>Fixed. Also defined precisely in operational semantics.</MF>
Issue #9: forest definition ( Section 2.1: Data and types, paragraph
8
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, type system) Is a forest ordered? The term sequence is
used but has not been defined. In any case, these definitions belong
in the data model spec, not in the algebra.
<MF>Resolved in data model document</MF>
Issue #10: is it a forest? ( Section 2.1: Data and types, paragraph
9
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, type system) Does the book type contain a forest? A
forest of what? How do the all-grouped attributes fit into this?
<MF>Fixed</MF>
Issue #14: attribute type ( Section 2.3: Atomic data, paragraph 4
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, code example/sample) What is the type of
book0/@year? Presumably @year[ Integer], but then you should give
this example before book0/@year/data(). 2.4 Iteration
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>Fixed</MF>
Issue #18: for and where grammars introduce ambiguity ( Section 2.5:
Selection, paragraph 1
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, core) The grammar for nested for or where clauses
introduces ambiguity (this is the common if-then-else problem). For
example, consider the expression
for b in bib0/book do
for y in b/author do
where b/@year/data()<=2000 do
b
in which it is not
immediately clear whether the where clause applies to the inner or
the outer for loop. Although the greedy subrule does what most
people expect in this case, there are variations in which it does
not. (Again, this is the well-known if-then-else phenomenon.)
Issue #59: execution model is not defined ( Section 2.6:
Quantification, paragraph 9
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) The execution model is not defined. For example,
what is the legal outcome of an infinitely recursing expression?
Error? System lock-up? What happens when an undefined variable is
used? When an undefined function is called? When a function with
that name and number of arguments is defined, but the argument types
do not match? Etc., etc. 2.7 Join
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
Added [Issue-0084: Execution model].
</MF>
Issue #20: strange example ( Section 2.12: Expanded names, paragraph
3
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, code example/sample) This example qualifies all the
attributes with the same namespace as their parent. In practice,
this is extremely unusual; attributes are usually placed in a
namespace only when they are annotations from a different namespace
than the element's. Supply the XML to which this example
corresponds, and remember that attributes by default do not inherit
the namespace of their element.
<MF>Fixed.</MF>
Issue #21: wildcard types cannot be implemented ( Section 2.12:
Expanded names, paragraph 11
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) If x!y means any name in x except names in y, what
does x!y!z mean? In general, how do ! and | operate (precedence,
associativity)? Parentheses are required to force the desired
grouping of these two operators. Also, what does x!* mean? (There's
an infinite family of such examples.)
<MF>Added Issue-0085 : Semantics of Wildcard type</MF>
Issue #29: Use empty namespace to indicate Null ( Section 2.12:
Expanded names, paragraph 1
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, syntax) Use empty brackets {} instead of {Null} to
denote the empty namespace.
<MF>Fixed.</MF>
Issue #38: namespace URIs should be strings ( Section 2.12: Expanded
names, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) All of the expanded name examples using URIs look
like {http://uri}localname. The URI should be a string-valued
expression, both so that it can be computed (as suggested by the
grammar) and so that the expression can be readily parsed. 2.14
Structural recursion
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>Fixed</MF>
Issue #30: Empty choice symbol is strange ( Section 3.4: Types :
abstract syntax, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, syntax) The empty choice symbol appears to be O
(entity Ø). Is this correct?
<MR>This is nice to type but not nice for building a prototype
implementation.</MR>
<MF>Yes.</MF>
Issue #35: type operator precedence undefined ( Section 3.4: Types :
abstract syntax, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Type operator precedence is not defined (or
inferrable from the grammar). I implemented repetition precedes
choice precedes all precedes sequence (so sequence binds last) 3.6
Expressions
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
Added Issue-0082 : Type and expression operator precedence
</MF>
Issue #36: parentheses not in type grammar ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) The type grammar omits parentheses as a grouping
operator, but the HTML_body example (third example in section 2.15)
uses parentheses to group.
<MF>
Fixed in "Types : concrete syntax"
</MF>
Issue #37: expression operator precedence undefined ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Operator precedence is not defined (or inferrable
from the grammar) for expressions. I implemented projection precedes
type precedes binary-op precedes sequence (so sequence binds
last). Among the binary operators, I implemented (highest precedence
to lowest): additive (+ -), relational(< > <= >=), equality(= !=),
and, or.
<MF>
Added Issue-0082 : Type and expression operator precedence
</MF>
Issue #39: optional query keyword ( Section 3.6: Expressions,
paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, core) Is query optional? None of the examples use it,
and it seems like a query fragment should default to being a query.
Issue #40: operator semantics not defined ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) None of the operator semantics are defined. What is
2 + "2"? 2 > "1"? Etc., etc.
<MR>
This is mainly a reminder that this area needs urgent attention
</MR>
<MF>
See Issue-0056 : Operators on Simple Types
</MF>
Issue #42: missing arithmetic operators ( Section 3.6: Expressions,
paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) There are no operators for multiplication, division,
or modulo.
<MF>
See Issue-0056 : Operators on Simple Types
</MF>
Issue #46: name collisions ( Section 3.6: Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Name collisions are not explained. For example: Can
a variable and a function have the same name?
<MF>
Added Issue-0046 : Syntactic rules
</MF>
Issue #47: reserved keywords ( Section 3.6: Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Are all keywords reserved? If so, then how does one
use them in path expressions, like error/@name? If not, then what
are the rules to disambiguate (for example, how to distinguish the
error keyword from an error identifier)?
<MF>
Added Issue-0046 : Syntactic rules
</MF>
Issue #48: sort is wrong ( Section 3.6: Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Sort is described as a built-in function in the
grammar, but as an expression sort Var in Expr by Expr in section
2.10 Sorting.
<MF>Fixed</MF>
Issue #50: error does not fit into the type system ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Supposedly every expression has a well-defined
compile-time and run-time type. What is the type of error? Several
examples in the spec look like fun some_function ( a : Type1) :
Type2 = match a case x : Type3 do x else error suggesting that
whatever the type of error is, it is compatible with every definable
type.
<MF>
The type of error is empty (Ø). For its relationship to other
types, see subsection : Equivalences and Subtyping. An implementation
is
free to decide how to evaluate the error expression : for example,
it could raise an exception in the given programming environment.
</MF>
Issue #54: function names should be computable ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, core) Constructor names are computeable, but
function names are not.
<MF>
The Algebra is a first order functional language, i.e., functions
are not first-class values. A function's value is determined at
query-analysis time, therefore it is not possible to select or to
compute dynamically a function to apply.
<MF>
Issue #61: wrong grammar for expanded names ( Section 3.6:
Expressions, paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, core) The grammar definition {Exp}Exp appears to be
incorrect. Or is { a, b} () a legal expanded expression? 4
Equivalences
<http://www.w3.org/XML/Group/2000/11/xmlquery-algebra20001204.html>
4.1 Relating projection to iteration
<http://www.w3.org/XML/Group/2000/11/xmlquery-algebra20001204.html>
<MF>
The type rule for {Exp1}Exp2 requires that Exp1 be a URI expression
and Exp2 be an NCName.
</MF>
VII. Issues related to absence of join operator
===============================================
<Resolution>
Added [Issue-0087 : More examples of "joins"]
Added citation on join evaluation.
In particular, see:
@article{graefe93,
title="Query Evaluation Techniques for Large Databases",
author="Goetz Graefe",
journal="{ACM} Computing Surveys",
volume="25(2)",
pages="73--170",
year=1993,
month="June"}
</Resolution>
Issue #27: no join operator ( Section 2.7: Join, paragraph 1
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, core) There is no join operation in this
algebra. Joins are effected through nested for loops and possibly
applying selection criteria to express the join. Implementations
must analyze a query to determine if/when joins occur and which
selection critieria (if any) are join conditions. However, this
algorithm is not addressed or even acknowledged by the
specification.
Issue #28: crossproduct instead of a join ( Section 2.7: Join,
paragraph 5
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) This section describes a diagonal of a
cross-product, but the result structure does not reflect the
join. Instead, it's just a filtered cross-product. When the join is
one-to-many or many-to-many, the results will contain one element
for every member of the (predicated) cross-product, instead of one
member for each element/attribute in the outer forest. The given
example returns, for each distinct (book, review) pair whose titles
match, one element. A result that reflects the join would return one
element for each book, containing data from each review for that
book (books joined with reviews). Also, I want an example of a
many-to-many join. What if multiple books matched multiple reviews
(a many-to-many join)? The outer for loop iterates over one book at
a time, so the result structure is the same as for a one-to-many
join between books and reviews.
Issue #60: many-many join not possible? ( Section 2.7: Join,
paragraph 1
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
critical, core) Please demonstrate a many:many join. The example
demonstrates a 1:many join. Demonstrate other types of joins, too;
e.g., contrast an "inner join" with an "outer join".
2.9 Querying order
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MR>
I would like to extend this issue by adding that formulations of
full outer-joins seem to be problematic
</MR>
<MR>We currently work on adding more examples for the different kind
of joins (see Peter Fankhauser's mail on this topic)</MR>
VII. Issues noted for future consideration
=========================================
Issue #15: types everywhere ( Section 2: The Algebra by Example,
paragraph 1
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, editorial) Readers will expect "The Algebra by
Example" to focus on the algebraic operations, their syntax and
semantics. Instead, the focus is on the type system throughout. For
example, the sections on projection and iteration devote more
paragraphs to explaining the types of their expressions than to the
description of the expressions themselves. Put the type system in
its own section, after The Algebra By Example. This change would
vastly improve the flow of this specification. 2.1 Data and types
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
This is a very helpful suggestion, but we cannot implement this change
in the next draft. It will be considered for a future draft.
<MF>
Issue #16: describe wildcard during projection ( Section 2.2:
Projection, paragraph 1
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
normal priority, editorial) After resolving issue 15, you should
also enhance the description of projection to include namespaces,
wildcards, etc. 2.3 Atomic data
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
Future examples
</MF>
Issue #51: no typeof() operator ( Section 3.6: Expressions,
paragraph 2
<http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
high priority, core) There is no way to query the (arbitrary)
run-time type of an expression. The algebra needs to be extended to
allow expressions like:
fun some_function ( a : Type1, b : Type2) : Type3 =
match a case x : typeof(b) do x
else error
Note that this is not equivalent to
fun some_function ( a : Type1, b : Type2) : Type3 =
match a case x : Type2 do x else error
<MF>
The data model accessor type() returns the run time type of a data
model value. Currently, there is no operator to compare two dynamic
type expressions.
</MF>
--
Mary Fernandez AT&T Labs - Research
Principal Technical Staff Member 180 Park Ave., Bldg 103, E243
mff@research.att.com Florham Park, NJ 07932-0971
http://www.research.att.com/~mff 973-360-8679 FAX: 973-360-8187
Received on Wednesday, 7 February 2001 20:52:44 UTC