- From: Michael Rys <mrys@microsoft.com>
- Date: Wed, 7 Feb 2001 09:13:47 -0800
- To: Michael Brundage <michaelb@microsoft.com>
- Cc: "'www-xml-query-comments@w3.org'" <www-xml-query-comments@w3.org>
Dear Michael, please receive our comments on your feedback. Feel free to provide us with further feedback, especially if you think that a resolution and changes incorporated into the latest draft are not sufficient and where we kindly ask you for some more information (e.g., issue #19). I added some remarks to Mary's remarks (Mary please note) in <MR> tags. Best regards Michael -----Original Message----- From: Mary F. Fernandez [mailto:mff@research.att.com] Sent: Friday, January 19, 2001 8:47 AM To: Michael Rys Cc: xmlquery Subject: Michael Brundage's comments on Algebra Hi Michael, Please find below my responses to Michael Brundage's comments. Issues are groups by subject and resolutions are tagged <Resolution>. My remarks are tagged with <MF>. Michael Rys' in <MR>. Please review, annotate, and respond to Michael Brundage and cc: w3c-xml-query-comments@w3c.org. thx mff I. Issues related to lexical representation of literals: ======================================================== <Resolution> Added [Issue-0081: Lexical representation of Schema simple types] </Resolution> Issue #41: no boolean literals ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, core) There are no literal expressions for the boolean constants true and false. Issue #43: string lexical representation not defined ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) The lexical representation of strings is not defined. Please do not use the XPath definition (which is pathetic). Use a definition that allows both single- and double-quoted strings, and also allows escape sequences. I implemented string := '"' (''' | escape | ~['"\])* '"' | ''' ('"' | escape | ~['"\])* ''' escape := '\' any-character ; (the processed escapes are: \t, \n, \r, \\, \" and \' ) Issue #44: number lexical representation not defined ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) The lexical representation of numbers is not defined. I implemented number := ('+' | '-')? digit+ ('.' digit+)? digit := '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' Issue #45: other lexical issues ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Whitespace is not defined. I implemented whitespace := ' ' | '\t' | '\r' | '\n' Except for whitespace in string literals (which is of course preserved), whitespace is completely ignored. Issue #55: scalar literals not defined ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Lexical representations for other scalar types (IDREFS, NMTOKENS, date, time, etc.) are not defined. How are scalar values of these types to be represented? My personal preference is cast 'a1 a2' as IdrefsScalar Issue #31: Constants not defined ( Section 3.6: Expressions, paragraph 3 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, syntax) Constants are not defined. What is the syntax of numeric constants? String constants? Boolean constants? Others? Also, this is somewhat odd terminology, since for example the empty expression () is a constant. II. Issues addressed by addition of operational semantics ========================================================= <Resolution> Unless otherwise noted, these issues are resolved by the addition of Section 5 Dynamic/Operational Semantics. </Resolution> Issue #11: comma operator ( Section 2.1: Data and types, paragraph 10 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, core) The ampersand operator is defined, but the comma operator is not. 2.2 Projection <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> Issue #17: for is not a forest ( Section 2.4: Iteration, paragraph 7 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) This section says that the value of a for expression is a forest. However, section 2.1 defines a forest to be a sequence of (zero or more) attributes and elements. Clearly the result of the expression for b in bib0/book do b/@year/data() fails to meet this definition. Please clarify the meaning of for. <MR> In my opinion, section 2.1 should defer to the datamodel definition of a forest which can be a sequence of value nodes </MR> <MF> Clarified in text and referred to data model & operational semantics. </MF> Issue #26: empty() ( Section 2.6: Quantification, paragraph 5 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, core) What is the result of empty( (), () ) ? In general, when do operators and functions flatten or unnest their arguments? <MF> Clarified in data model and in operational semantics. </MF> Issue #25: set operations are not defined ( Section 2.5: Selection, paragraph 9 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) What is the result of the expression for b in bib0 do where b/book/@year/data()<=2000 do 1? Define the meaning for operators like less-than-equals on sets, forests, etc. 2.6 Quantification <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF> Defined in data model and in operational semantics. </MF> Issue #52: automatic type coercions are not defined ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Under what circumstances is this expression valid or invalid? fun some_function ( a : Type1) : Type2 = match a case x : Type3 do x else error <MF> Operational semantics of match defines precisely. </MF> Issue #53: order of evaluation is not defined ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Order of evaluation of expressions is not defined (this is related to operator precedence, which is also not defined). Do and and or shortcut? Is (a, b, c) equivalent to ((a, b), c) or (a, (b, c))? <MF> Defined in operational semantics. </MF> III. Operational semantics of match =================================== <Resolution> Unless otherwise noted, these issues are resolved by the addition in Section 5 Dynamic/Operational Semantics of semantics of match. Added [Issue-0083: Expressive power and complexity of match expression] Additional comments follow. </Resolution> Issue #24: match semantics ( Section 2.14: Structural recursion, paragraph 11 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, type system) How does match handle redundant cases? The description of match seems to imply that it is equivalent to a large if then else if then else if ... else block instead of a switch (because it takes the first matching branch). This will be terrible for performance. Also, the case statements appear to accept any type, including types not known at compile time (such as type variables or expressions involving type variables). Consequently, match becomes a run-time concept (in which case, I have to wonder what value it brings to your type system.) Again, this will be absolutely terrible for performance. 2.15 Functions for all well-formed documents <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF> The values of all type expressions are known at compile time <MR>I assume that you mean query-analysis time.</MR>, even the values of type variables, which are defined statically and are immutable. The static semantics of match determines the most precise type for the expression. The operational semantics defines precisely the run-time semantics of match, which compares the run-time type of the value to match and evaluates the case rules in order: the value of the first case rule that matches is returned. This is the simplest, *operational* semantics; it is not necessarily the most efficient implementation of match. It is possible to implement match very efficiently. For example, pattern matching in ML is implemented efficiently. We can short circuit the run-time evaluation of case rules whose static type are guaranteed not to match the match value. In addition, the user-level syntax in XQuery may choose to provide a restricted form of match to the user, i.e., one that does not permit arbitrary type expressions in the case rules. Consider the example : type exType = String | b [ Integer ] let v = b [ 10 ] : exType do match v case v2 : String do v2 case v3 : b [ Integer ] do v3 else a [ Float ] : String | b [ Integer ] ==> b [ 10 ] Note that the *type* of the match expression is String | b [ Integer ], because at *query-analysis time*, the type of v is String | b [ Integer ] even though its value has type b [ String ]. Even though the else clause returns a value with type a [ Float ], the static type rules eliminate this possibility because it can never match String | b [ Integer ]. At query evaluation time, the dynamic type of v is b [ Integer ], therefore at run time, it matches the second case rule and returns the value b [ 10 ]. </MF> Issue #49: match/case is very expensive ( Section 2.15: Functions for all well-formed documents, paragraph 12 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) The typeguard is very expensive. The fourth example in this section, html_of_xml, says that the first branch of the match checks whether the (run-time) value of x is a subtype of AnyScalar, and if not then the second branch checks to see if x is a subtype of AnyAttribute, and so on. In the worst case, this kind of type-checking requires traversing a structure as large as the instance of x. Consider for example: fun not_fun ( x : AnyComplex ) : Integer = match x case t1 : AnyTree do 1 case t2 : AnyTree{0, 2} do 2 case t3 : * [ * [ * [ * [ * [ * [ AnyAttribute ] ] ] ] ] ] do 3 else error 3 Algebra Components <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> 3.4 Types : abstract syntax <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF> Regarding "In the worst case, this kind of type-checking requires traversing a structure as large as the instance of x." No, checking type subsumption and intersection requires traversing a structure as large as the *type* of x, not of the instance of x itself. All values carry a representation of their type. Call the runtime type of x, T. In the above example, there will be three type tests: (1) T /\ AnyTree != 0 (and if that fails then...) (2) T /\ AnyTree{0,2} != 0 (and if that fails then...) (3) T /\ * [ * [ * [ * [ * [ * [ AnyAttribute ] ] ] ] ] ]. The operational semantics of match should make this clear. <MR>I think Michael's worst case is based on the fact that you may not know the type of x until you have traversed the instance (e.g., in streaming applications). Michael, is this correct?</MR> </MF> IV. Issues addressed by definition of operators by joint XSLT/Schema/Query task force on operators ========================================================= <Resolution> The operational semantics defines the relationship of the Algebra to the Data model. Other open issues are cleared identified in Introduction: [Issue-0018: Align algebra types with schema]: The Algebra's type system is not yet aligned with XML Schema. [Issue-0056: Operators on Simple Types]: A joint XSLT/Schema/Query task force is chartered to define the operators on Schema simple types. The Algebra will adopt the operators defined by that group. [Issue-0081: Lexical representation of Schema simple types]: The lexical representation of Schema simple types must be defined for the Algebra and XQuery. </Resolution> General comments: The currently proposed XML Query Algebra fails to meet even its own requirements; specifically, it does not use the XML Query Data Model or XSD type system. Essential language issues such as operator semantics, operator precedence and variable scope are entirely ignored. V. Issues related to alignment of XML Query Algebra's type system with XML Schema =================================================================== <Resolution> Alignment with Schema is open issue: [Issue-0018: Align algebra types with schema]: The Algebra's type system is not yet aligned with XML Schema. </Resolution> Issue #1: type system ( Section 2.1: Data and types, paragraph 6 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, type system) The XML Query Algebra does not use the XSD type system, but instead uses its own homegrown type system (derived from the XDuce research project). This violates the requirements of the XML Query Language. Based on my experience with prototype implementations of the algebra, I claim that this type system is not necessary for an efficient implementation of the algebra. Furthermore, I will now provide several demonstrations that this type system is both ill-defined and incompatible with the type system of XSD. In section 2.4, an example is given that constructs a new book element with different content than the book type already present in the type system. Because the XML Query type system differentiates types by name, a query cannot simultaneously use both book types. Moreover, there are issues in serializing out the results of such a query with its schema, because of the name collision. (You could try to work around this problem by auto-generating unique type names in the schema, but then you will not be able to roundtrip. You could work around this problem by creating a namespace for XML Query annotations in schemas, and adding annotations to the appinfo element in the XSD (or possibly attributes on the types themselves) that describes how to restore the original type names. But yuck, this is a terrible solution.) Also, not all types have a single name. Consider the simple example: for b in bib0/book do b/author, b/title This algebraic expression is certainly well-formed, returning an expression with type author [ String ], title [ String ] . However, because this is a sequence type, it has no single (top-level element) name. Suppose we assign this expression to the variable v. Then what is the result of match v case v1 : * do "case 1" case v2 : *, * do "case 2" case v3 : *, *, * do "case 3" case v4 : author [ String ], title [ String ] do "case 4" else error Does the first case match? Or the second? Or does only the fourth case match? (to be completed) If you're going to use the XSD type system, use it exclusively. If the XSD type system isn't up to the challenge, get the schema working group to fix it while you still can. And finally, if all else fails, add or change only those things absolutely necessary to implement the XML Query Language. Don't create yet another type system that implementors must support. <MR> Our proposal is to use the XSD type system as implied by XSD and cooperate on the formalisation based on MSL.We see the type-related comments in this feedback also as potential input to the type formalisation work. </MR> Issue #23: so how do types work, really? ( Section 2.4: Iteration, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, type system) The return type of the first iteration is fully expanded into book[ author [ String ] { 1, *}, title [ String ] ]{0, *} but in section 2.5 the return type is collapsed into Book{0,*}. How does the algebra system know which return type to use? It seems to me that in every heuristic imaginable, Book{0,*} is in-between the fully-expanded type and UrTree. Your response is probably "it's Book because we declared it that way," in which case my follow-up question goes something like this: Consider a semi-redundant type definition such as type Normal = (a [String ] | String | UrElement | UrTree) {0, *} and an instance with this type let n : Normal = anything, it doesn't matter. Then what is the compile-time type of the expression match n case s : UrScalar do s case n : UrElement do n/* else n? If this example is not quite legal or is too trivial, tell me and I will come up with a better example that still demonstrates the same point. 2.5 Selection <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> VI. Miscellaneous editorial issues =================================== 0 Status <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> Issue #3: Redmond, not Redmont ( Section 0: Status, paragraph 7 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; low priority, editorial) Redmond is misspelled Issue #4: accents ( <MF>Fixed.</MF> Section 0: Status, paragraph 11 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; low priority, editorial) These names are misspelled; they should contain accented characters (Fernandez, Simeon). <MF>Fixed.</MF> Issue #13: compile-time vs. run-time ( Section 2.2: Projection, paragraph 8 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) You cannot use the words compile-time and run-time (which have important significance for implementations) without first defining them. Furthermore, this is a different description than given previously for duplicates (where the distinction was made between the type of the expression, and the type of its value -- which I guess you're trying to say are compile-time and run-time, respectively?). Also, the introduction of the notions compile-time and run-time has the appearance of constraining compliant implementations to compile first and then run, instead of executing XML queries in an interpreted way. <MF> Changed "compile-time" to "query-analysis time" and "run-time" to "query-evaluation time" with clarification in text. Also added [Issue-0084: Execution model]. </MF> Issue #2: meaning of duplicate ( Section 2.2: Projection, paragraph 3 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, editorial) I strongly object to the use of the word "duplicate" in this sentence. Duplicate has a precise meaning in XPath, which is different from its use here. An example of duplicate elements is the expression bib0/book/author, bib0/book/author which selects all the author elements twice. "Suciu", "Suciu" is an example of duplicate values, but not necessarily duplicate nodes. Changing the wording from "duplicate elements" to "duplicate values" is marginally better; my preference is to omit this phrase altogether (it adds nothing to the explanation of projection anyway, especially since the example already amply demonstrates the point, and therefore this phrase can only confuse the reader). <MF>Fixed</MF> Issue #12: projection is simple? ( Section 2.2: Projection, paragraph 1 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, editorial) Since when is projection simpler than, say, addition? Also, this is an unusual use of the word simple, since you later decompose projection into other (more atomic, and therefore more simple) operations. <MF>"The simplest operation" => "A basic operation..." </MF> Issue #5: prior work ( Section 1: Introduction, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; low priority, editorial) The prior work list should reference XSL(T). Did tree grammars not influence the committee at all? Where is the influence of NRA on this algebra? I cannot detect it in this spec. <MF>XSLT citation added. See "Laws" for reference to NRAs and comprehensions.</MF> Issue #6: hand waving ( Section 1: Introduction, paragraph 3 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, editorial) The claim that the algebra can capture many XML query languages is not substantiated. Which query languages, specifically, does it support? Only many (not all)? If so, which ones are not supported and why? <MF>Fixed.</MF> Issue #7: hand waving again ( Section 1: Introduction, paragraph 4 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, editorial) The claim that types can detect certain kinds of errors is not substantiated. Which errors, specifically, does this type system detect? 2 The Algebra by Example <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF>Fixed in intro.</MF> Issue #19: fst, snd are superfluous ( Section 2.9: Querying order, paragraph 3 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, core) The fst, snd elements in each pair are superfluous (arguably, so is pair). That is, instead of index(book0/author) returning pair [ fst [ 1 ], snd [ author [ "Abiteboul" ] ] ], ... as described in the spec (section 2.9, Querying Order), I suggested that it should return 1, author [ "Abiteboul" ], ... I believe you responded that the reason it doesn't do this is because of typing. In fact, the spec seems to get the compile-time type of this expression wrong. It says that the type of index(book0/author) is pair [ fst [ Integer ] , snd [ author [ String ] ] ] {1, *} but I believe it should be pair [ fst [ Integer ] , snd [ ??? ] {0, *} (where ??? is replaced with something like UrTree). The type of the argument to index() cannot be reflected in the (compile-time) type of its output. In that case, index() might as well return an expression with type (Integer, ???) { 0, *} And of course, if the spec is referring to the runtime-type and therefore is correct, then index() should still return ( Integer, author [ String ] ) {1, *} Either way, the pair/fst/snd elements are superfluous. Also, the spec does not say whether this function is deterministic, and if so, whether the integers returned must be a 1-based index of the elements in the set. 2.12 Expanded names <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF> The function index is builtin and therefore the Algebra can provide a more precise type rule that gives the most specific type for : index(book0/author) which is: pair [ fst [ Integer ] , snd [ author [ String ] ] ] {1, *} Typically, when querying the local order of elements in a forest, the goal is to return, for example, the first element or some range of elements. Using the pair/fst/snd wrapper elements, it is trivial to return the value associated with a particular index or range. For example, to choose the first three authors: for v in index(book0/author) do where v/fst/data() >= 1 && v/fst/data() <= 3 do v/snd/* Given your formulation, it is much more difficult to access the original value. If you can explain how you would do so if index returns a value with type (Integer, author[String]){0,*}, that would be helpful. </MF> Issue #57: variables are read-only ( Section 2.6: Quantification, paragraph 9 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Local variables are read-only. Once a value has been assigned to a variable introduced with let, there is no grammatical expression that reassigns it. If the meaning of let is altered to say that it "introduces or reassigns a value," then how does one reassign a global variable (versus just declaring a local variable that hides the global one while the local is in scope)? <MR>This comment shows that we have to make clear that these are not variables in the procedural programming language sense</MR> <MF> The Algebra is a functional language and does not provide an assignment operator. Semantics of the let expressions is clarified in text and in operational semantics. </MF> Issue #22: non-XPath compatible wildcard semantics ( Section 2.12: Expanded names, paragraph 11 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Making e/a shorthand for e/{*}a is a big mistake. You're using the same syntax as XPath but with a markedly different semantics. The expression e/a should have the same meaning it has in XPath, which is e/{Null}a. (XPath Spec, section 2.3, paragraph 3: "... if the QName does not have a prefix, then the namespace URI is null".) <MF> Good point. Fixed. </MF> Issue #56: path order different from XPath ( Section 4.1: Relating projection to iteration, paragraph 1 <http://www.w3.org/XML/Group/2000/11/xmlquery-algebra20001204.html>; low priority, core) In XPath, the set value of a path is always ordered in document order or reverse-document order. In the XML Query Language, set order is arbitrary. Consequently, the meaning of the expression e/a can differ from its meaning in XPath. This will be especially troublesome when you add reverse axes; conversely, correctly implementing the reverse axes on top of the XML Query Language may be very difficult. <MF> Although the Algebra and XPath both have a '/' operator, it clearly does not have the same semantics in each language nor was the intention for both of them to have the same semantics. A joint XSLT/Query task force is defining the semantics of XPath 2.0 in terms of the Algebra. This effort will make the semantics of the XPath '/' operator precise. </MF> Issue #32: wrong not equal operator ( Section 2.6: Quantification, paragraph 7 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, code example/sample) The third and fourth examples incorrectly use operator <> instead of operator !=. <MF>Fixed.</MF> Issue #33: common idioms obfuscated ( Section 2.6: Quantification, paragraph 1 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, core) This algebra obfuscates common idioms. For example, the XPath x/y[@z=1] is a very common query. In this algebra, this query becomes for b in x/y do where b/@z/data()=1 do b when it's really select x/y where data(@z) = 1. <MF> This comment is unclear. The purpose of the Algebra is *not* to define a concise syntax for common idioms in XPath nor for common idioms in any query language. Its purpose is to define precisely operators that can be used to define the expressive semantics of very concise XPath or XQuery expressions. The Algebra is *not* the user-level language: XQuery serves that role. </MF> Issue #58: variable scoping is not defined ( Section 2.6: Quantification, paragraph 9 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) The scope of local variables is not defined. Is the expression let v = e do let v2 = v do let v = e2 do v2 legal, and if so, what is its value? <MF>Fixed. Also defined precisely in operational semantics.</MF> Issue #9: forest definition ( Section 2.1: Data and types, paragraph 8 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, type system) Is a forest ordered? The term sequence is used but has not been defined. In any case, these definitions belong in the data model spec, not in the algebra. <MF>Resolved in data model document</MF> Issue #10: is it a forest? ( Section 2.1: Data and types, paragraph 9 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, type system) Does the book type contain a forest? A forest of what? How do the all-grouped attributes fit into this? <MF>Fixed</MF> Issue #14: attribute type ( Section 2.3: Atomic data, paragraph 4 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, code example/sample) What is the type of book0/@year? Presumably @year[ Integer], but then you should give this example before book0/@year/data(). 2.4 Iteration <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF>Fixed</MF> Issue #18: for and where grammars introduce ambiguity ( Section 2.5: Selection, paragraph 1 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, core) The grammar for nested for or where clauses introduces ambiguity (this is the common if-then-else problem). For example, consider the expression for b in bib0/book do for y in b/author do where b/@year/data()<=2000 do b in which it is not immediately clear whether the where clause applies to the inner or the outer for loop. Although the greedy subrule does what most people expect in this case, there are variations in which it does not. (Again, this is the well-known if-then-else phenomenon.) Issue #59: execution model is not defined ( Section 2.6: Quantification, paragraph 9 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) The execution model is not defined. For example, what is the legal outcome of an infinitely recursing expression? Error? System lock-up? What happens when an undefined variable is used? When an undefined function is called? When a function with that name and number of arguments is defined, but the argument types do not match? Etc., etc. 2.7 Join <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF> Added [Issue-0084: Execution model]. </MF> Issue #20: strange example ( Section 2.12: Expanded names, paragraph 3 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, code example/sample) This example qualifies all the attributes with the same namespace as their parent. In practice, this is extremely unusual; attributes are usually placed in a namespace only when they are annotations from a different namespace than the element's. Supply the XML to which this example corresponds, and remember that attributes by default do not inherit the namespace of their element. <MF>Fixed.</MF> Issue #21: wildcard types cannot be implemented ( Section 2.12: Expanded names, paragraph 11 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) If x!y means any name in x except names in y, what does x!y!z mean? In general, how do ! and | operate (precedence, associativity)? Parentheses are required to force the desired grouping of these two operators. Also, what does x!* mean? (There's an infinite family of such examples.) <MF>Added Issue-0085 : Semantics of Wildcard type</MF> Issue #29: Use empty namespace to indicate Null ( Section 2.12: Expanded names, paragraph 1 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, syntax) Use empty brackets {} instead of {Null} to denote the empty namespace. <MF>Fixed.</MF> Issue #38: namespace URIs should be strings ( Section 2.12: Expanded names, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) All of the expanded name examples using URIs look like {http://uri}localname. The URI should be a string-valued expression, both so that it can be computed (as suggested by the grammar) and so that the expression can be readily parsed. 2.14 Structural recursion <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF>Fixed</MF> Issue #30: Empty choice symbol is strange ( Section 3.4: Types : abstract syntax, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, syntax) The empty choice symbol appears to be O (entity Ø). Is this correct? <MR>This is nice to type but not nice for building a prototype implementation.</MR> <MF>Yes.</MF> Issue #35: type operator precedence undefined ( Section 3.4: Types : abstract syntax, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Type operator precedence is not defined (or inferrable from the grammar). I implemented repetition precedes choice precedes all precedes sequence (so sequence binds last) 3.6 Expressions <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF> Added Issue-0082 : Type and expression operator precedence </MF> Issue #36: parentheses not in type grammar ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) The type grammar omits parentheses as a grouping operator, but the HTML_body example (third example in section 2.15) uses parentheses to group. <MF> Fixed in "Types : concrete syntax" </MF> Issue #37: expression operator precedence undefined ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Operator precedence is not defined (or inferrable from the grammar) for expressions. I implemented projection precedes type precedes binary-op precedes sequence (so sequence binds last). Among the binary operators, I implemented (highest precedence to lowest): additive (+ -), relational(< > <= >=), equality(= !=), and, or. <MF> Added Issue-0082 : Type and expression operator precedence </MF> Issue #39: optional query keyword ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, core) Is query optional? None of the examples use it, and it seems like a query fragment should default to being a query. Issue #40: operator semantics not defined ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) None of the operator semantics are defined. What is 2 + "2"? 2 > "1"? Etc., etc. <MR> This is mainly a reminder that this area needs urgent attention </MR> <MF> See Issue-0056 : Operators on Simple Types </MF> Issue #42: missing arithmetic operators ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) There are no operators for multiplication, division, or modulo. <MF> See Issue-0056 : Operators on Simple Types </MF> Issue #46: name collisions ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Name collisions are not explained. For example: Can a variable and a function have the same name? <MF> Added Issue-0046 : Syntactic rules </MF> Issue #47: reserved keywords ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Are all keywords reserved? If so, then how does one use them in path expressions, like error/@name? If not, then what are the rules to disambiguate (for example, how to distinguish the error keyword from an error identifier)? <MF> Added Issue-0046 : Syntactic rules </MF> Issue #48: sort is wrong ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Sort is described as a built-in function in the grammar, but as an expression sort Var in Expr by Expr in section 2.10 Sorting. <MF>Fixed</MF> Issue #50: error does not fit into the type system ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Supposedly every expression has a well-defined compile-time and run-time type. What is the type of error? Several examples in the spec look like fun some_function ( a : Type1) : Type2 = match a case x : Type3 do x else error suggesting that whatever the type of error is, it is compatible with every definable type. <MF> The type of error is empty (Ø). For its relationship to other types, see subsection : Equivalences and Subtyping. An implementation is free to decide how to evaluate the error expression : for example, it could raise an exception in the given programming environment. </MF> Issue #54: function names should be computable ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, core) Constructor names are computeable, but function names are not. <MF> The Algebra is a first order functional language, i.e., functions are not first-class values. A function's value is determined at query-analysis time, therefore it is not possible to select or to compute dynamically a function to apply. <MF> Issue #61: wrong grammar for expanded names ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, core) The grammar definition {Exp}Exp appears to be incorrect. Or is { a, b} () a legal expanded expression? 4 Equivalences <http://www.w3.org/XML/Group/2000/11/xmlquery-algebra20001204.html> 4.1 Relating projection to iteration <http://www.w3.org/XML/Group/2000/11/xmlquery-algebra20001204.html> <MF> The type rule for {Exp1}Exp2 requires that Exp1 be a URI expression and Exp2 be an NCName. </MF> VII. Issues related to absence of join operator =============================================== <Resolution> Added [Issue-0087 : More examples of "joins"] Added citation on join evaluation. In particular, see: @article{graefe93, title="Query Evaluation Techniques for Large Databases", author="Goetz Graefe", journal="{ACM} Computing Surveys", volume="25(2)", pages="73--170", year=1993, month="June"} </Resolution> Issue #27: no join operator ( Section 2.7: Join, paragraph 1 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, core) There is no join operation in this algebra. Joins are effected through nested for loops and possibly applying selection criteria to express the join. Implementations must analyze a query to determine if/when joins occur and which selection critieria (if any) are join conditions. However, this algorithm is not addressed or even acknowledged by the specification. Issue #28: crossproduct instead of a join ( Section 2.7: Join, paragraph 5 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) This section describes a diagonal of a cross-product, but the result structure does not reflect the join. Instead, it's just a filtered cross-product. When the join is one-to-many or many-to-many, the results will contain one element for every member of the (predicated) cross-product, instead of one member for each element/attribute in the outer forest. The given example returns, for each distinct (book, review) pair whose titles match, one element. A result that reflects the join would return one element for each book, containing data from each review for that book (books joined with reviews). Also, I want an example of a many-to-many join. What if multiple books matched multiple reviews (a many-to-many join)? The outer for loop iterates over one book at a time, so the result structure is the same as for a one-to-many join between books and reviews. Issue #60: many-many join not possible? ( Section 2.7: Join, paragraph 1 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; critical, core) Please demonstrate a many:many join. The example demonstrates a 1:many join. Demonstrate other types of joins, too; e.g., contrast an "inner join" with an "outer join". 2.9 Querying order <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MR> I would like to extend this issue by adding that formulations of full outer-joins seem to be problematic </MR> <MR>We currently work on adding more examples for the different kind of joins (see Peter Fankhauser's mail on this topic)</MR> VII. Issues noted for future consideration ========================================= Issue #15: types everywhere ( Section 2: The Algebra by Example, paragraph 1 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, editorial) Readers will expect "The Algebra by Example" to focus on the algebraic operations, their syntax and semantics. Instead, the focus is on the type system throughout. For example, the sections on projection and iteration devote more paragraphs to explaining the types of their expressions than to the description of the expressions themselves. Put the type system in its own section, after The Algebra By Example. This change would vastly improve the flow of this specification. 2.1 Data and types <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF> This is a very helpful suggestion, but we cannot implement this change in the next draft. It will be considered for a future draft. <MF> Issue #16: describe wildcard during projection ( Section 2.2: Projection, paragraph 1 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; normal priority, editorial) After resolving issue 15, you should also enhance the description of projection to include namespaces, wildcards, etc. 2.3 Atomic data <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html> <MF> Future examples </MF> Issue #51: no typeof() operator ( Section 3.6: Expressions, paragraph 2 <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>; high priority, core) There is no way to query the (arbitrary) run-time type of an expression. The algebra needs to be extended to allow expressions like: fun some_function ( a : Type1, b : Type2) : Type3 = match a case x : typeof(b) do x else error Note that this is not equivalent to fun some_function ( a : Type1, b : Type2) : Type3 = match a case x : Type2 do x else error <MF> The data model accessor type() returns the run time type of a data model value. Currently, there is no operator to compare two dynamic type expressions. </MF> -- Mary Fernandez AT&T Labs - Research Principal Technical Staff Member 180 Park Ave., Bldg 103, E243 mff@research.att.com Florham Park, NJ 07932-0971 http://www.research.att.com/~mff 973-360-8679 FAX: 973-360-8187
Received on Wednesday, 7 February 2001 20:52:44 UTC