FW: Michael Brundage's comments on Algebra

Dear Michael, please receive our comments on your feedback. Feel free to
provide us with further feedback, especially if you think that a resolution
and changes incorporated into the latest draft are not sufficient and where
we kindly ask you for some more information (e.g., issue #19).

I added some remarks to Mary's remarks (Mary please note) in <MR> tags.

Best regards
Michael

-----Original Message-----
From: Mary F. Fernandez [mailto:mff@research.att.com]
Sent: Friday, January 19, 2001 8:47 AM
To: Michael Rys
Cc: xmlquery
Subject: Michael Brundage's comments on Algebra


Hi Michael,

Please find below my responses to Michael Brundage's comments.
Issues are groups by subject and resolutions are tagged <Resolution>.
My remarks are tagged with <MF>.  Michael Rys' in <MR>.  Please
review, annotate, and respond to Michael Brundage and cc: 
w3c-xml-query-comments@w3c.org.

thx
mff

I. Issues related to lexical representation of literals:
========================================================

<Resolution>
  Added [Issue-0081:  Lexical representation of Schema simple types]
</Resolution>

  Issue #41: no boolean literals ( Section 3.6: Expressions, paragraph
  2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, core) There are no literal expressions for the
  boolean constants true and false.

  Issue #43: string lexical representation not defined ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) The lexical representation of strings is not
  defined. Please do not use the XPath definition (which is
  pathetic). Use a definition that allows both single- and
  double-quoted strings, and also allows escape sequences. I
  implemented string := '"' (''' | escape | ~['"\])* '"' | ''' ('"' |
  escape | ~['"\])* ''' escape := '\' any-character ; (the processed
  escapes are: \t, \n, \r, \\, \" and \' )

  Issue #44: number lexical representation not defined ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) The lexical representation of numbers is not
  defined. I implemented number := ('+' | '-')? digit+ ('.' digit+)?
  digit := '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'

  Issue #45: other lexical issues ( Section 3.6: Expressions,
  paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Whitespace is not defined. I implemented whitespace
  := ' ' | '\t' | '\r' | '\n' Except for whitespace in string literals
  (which is of course preserved), whitespace is completely ignored.

  Issue #55: scalar literals not defined ( Section 3.6: Expressions,
  paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Lexical representations for other scalar types
  (IDREFS, NMTOKENS, date, time, etc.) are not defined. How are scalar
  values of these types to be represented? My personal preference is
  cast 'a1 a2' as IdrefsScalar

  Issue #31: Constants not defined ( Section 3.6: Expressions,
  paragraph 3
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, syntax) Constants are not defined. What is the syntax of
  numeric constants?  String constants? Boolean constants? Others?
  Also, this is somewhat odd terminology, since for example the empty
  expression () is a constant.

II. Issues addressed by addition of operational semantics
=========================================================

<Resolution>
  Unless otherwise noted, these issues are resolved by the
  addition of Section 5 Dynamic/Operational Semantics.
</Resolution>

  Issue #11: comma operator ( Section 2.1: Data and types, paragraph
  10
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, core) The ampersand operator is defined, but the
  comma operator is not.  2.2 Projection
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>

  Issue #17: for is not a forest ( Section 2.4: Iteration, paragraph 7
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) This section says that the value of a for expression
  is a forest.  However, section 2.1 defines a forest to be a sequence
  of (zero or more) attributes and elements. Clearly the result of the
  expression for b in bib0/book do b/@year/data() fails to meet this
  definition. Please clarify the meaning of for.  
<MR>
  In my opinion, section 2.1 should defer to the datamodel
  definition of a forest which can be a sequence of value nodes
</MR>
<MF>
  Clarified in text and referred to data model & operational
  semantics.
</MF>

  Issue #26: empty() ( Section 2.6: Quantification, paragraph 5
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, core) What is the result of empty( (), () ) ? In
  general, when do operators and functions flatten or unnest their
  arguments?
<MF>
  Clarified in data model and in operational semantics.
</MF>

  Issue #25: set operations are not defined ( Section 2.5: Selection,
  paragraph 9
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) What is the result of the expression 
    for b in bib0 do 
      where b/book/@year/data()<=2000 do 1? 
  Define the meaning for
  operators like less-than-equals on sets, forests, etc.  2.6
  Quantification
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
  Defined in data model and in operational semantics.
</MF>

  Issue #52: automatic type coercions are not defined ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Under what circumstances is this expression valid or
  invalid?  
    fun some_function ( a : Type1) : Type2 = 
      match a case x : Type3 do x else error
<MF>
  Operational semantics of match defines precisely.
</MF>

  Issue #53: order of evaluation is not defined ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Order of evaluation of expressions is not defined
  (this is related to operator precedence, which is also not
  defined). Do and and or shortcut?  Is (a, b, c) equivalent to ((a,
  b), c) or (a, (b, c))?
<MF>
  Defined in operational semantics.
</MF>

III. Operational semantics of match
===================================

<Resolution>
  Unless otherwise noted, these issues are resolved by the
  addition in Section 5 Dynamic/Operational Semantics of semantics of
  match. 
  Added [Issue-0083:  Expressive power and complexity of match
expression]
  Additional comments follow.
</Resolution>

  Issue #24: match semantics ( Section 2.14: Structural recursion,
  paragraph 11
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, type system) How does match handle redundant cases?
  The description of match seems to imply that it is equivalent to a
  large if then else if then else if ... else block instead of a
  switch (because it takes the first matching branch). This will be
  terrible for performance. Also, the case statements appear to accept
  any type, including types not known at compile time (such as type
  variables or expressions involving type variables). Consequently,
  match becomes a run-time concept (in which case, I have to wonder
  what value it brings to your type system.) Again, this will be
  absolutely terrible for performance.  2.15 Functions for all
  well-formed documents
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>

<MF> 
  The values of all type expressions are known at compile time <MR>I assume
that you mean query-analysis time.</MR>,
  even the values of type variables, which are defined statically and
  are immutable.  The static semantics of match determines the most
  precise type for the expression.  The operational semantics defines
  precisely the run-time semantics of match, which compares the
  run-time type of the value to match and evaluates the case rules in
  order: the value of the first case rule that matches is returned.
  This is the simplest, *operational* semantics; it is not necessarily
  the most efficient implementation of match.  It is possible to
  implement match very efficiently. For example, pattern matching in
  ML is implemented efficiently.  We can short circuit the run-time
  evaluation of case rules whose static type are guaranteed not to
  match the match value.

  In addition, the user-level syntax in XQuery may choose to provide a
  restricted form of match to the user, i.e., one that does not permit
  arbitrary type expressions in the case rules.

  Consider the example :
 
  type exType = String | b [ Integer ]

  let  v = b [ 10 ] : exType do 
       match v 
       case v2 : String        do v2
       case v3 : b [ Integer ] do v3
       else a [ Float ]

  :    String | b [ Integer ]
  ==>  b [ 10 ]

  Note that the *type* of the match expression is String | b [ Integer
], 
  because at *query-analysis time*, the type of v is String | b [
Integer ]  
  even though its value has type b [ String ].  
  Even though the else clause returns a
  value with type a [ Float ], the static type rules eliminate this
  possibility because it can never match String | b [ Integer ].  At
  query evaluation time, the dynamic type of v is b [ Integer ],
  therefore at run time, it matches the second case rule and returns
  the value b [ 10 ].
</MF>

  Issue #49: match/case is very expensive ( Section 2.15: Functions
  for all well-formed documents, paragraph 12
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) The typeguard is very expensive. The fourth example
  in this section, html_of_xml, says that the first branch of the
  match checks whether the (run-time) value of x is a subtype of
  AnyScalar, and if not then the second branch checks to see if x is a
  subtype of AnyAttribute, and so on. In the worst case, this kind of
  type-checking requires traversing a structure as large as the
  instance of x. Consider for example: 
  fun not_fun ( x : AnyComplex ) : Integer = 
    match x 
    case t1 : AnyTree do 1  
    case t2 : AnyTree{0, 2}  do 2 
    case t3 : * [ * [ * [ * [ * [ * [ AnyAttribute ] ] ] ] ] ] do 3
    else error 3 
  Algebra Components
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
  3.4 Types : abstract syntax
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>

<MF>
  Regarding "In the worst case, this kind of
  type-checking requires traversing a structure as large as the
  instance of x."

  No, checking type subsumption and intersection requires traversing a
  structure as large as the *type* of x, not of the instance of x
  itself.  All values carry a representation of their type.  Call the
  runtime type of x, T. In the above example, there will be three type
  tests: (1) T /\ AnyTree != 0 (and if that fails then...)  (2) T /\
  AnyTree{0,2} != 0 (and if that fails then...)  (3) T /\ * [ * [ * [
  * [ * [ * [ AnyAttribute ] ] ] ] ] ].

  The operational semantics of match should make this clear.  

  <MR>I think Michael's worst case is based on the fact that you may not
know the type of x until you have traversed the instance (e.g., in streaming
applications). Michael, is this correct?</MR>
</MF>

IV. Issues addressed by definition of operators by joint
XSLT/Schema/Query task force on operators
=========================================================
<Resolution>
  The operational semantics defines the relationship of the Algebra to
  the Data model.  Other open issues are cleared identified in
  Introduction:

  [Issue-0018:  Align algebra types with schema]: The Algebra's type
system is not yet aligned with XML Schema. 
  [Issue-0056:  Operators on Simple Types]: A joint XSLT/Schema/Query
task force is chartered to define the operators on Schema simple types.
The Algebra will adopt the operators defined by that group. 
  [Issue-0081:  Lexical representation of Schema simple types]: The
lexical representation of Schema simple types must be defined for the
Algebra and XQuery. 

</Resolution>

General comments: The currently proposed XML Query Algebra fails to meet
even its own requirements; specifically, it does not use the XML Query
Data
Model or XSD type system. Essential language issues such as operator
semantics, operator precedence and variable scope are entirely ignored.

V. Issues related to alignment of XML Query Algebra's type system
with XML Schema 
===================================================================
<Resolution>
  Alignment with Schema is open issue:
  [Issue-0018:  Align algebra types with schema]: The Algebra's type
system is not yet aligned with XML Schema. 
</Resolution>

  Issue #1: type system ( Section 2.1: Data and types, paragraph 6
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, type system) The XML Query Algebra does not use the XSD
  type system, but instead uses its own homegrown type system (derived
  from the XDuce research project). This violates the requirements of
  the XML Query Language. Based on my experience with prototype
  implementations of the algebra, I claim that this type system is not
  necessary for an efficient implementation of the
  algebra. Furthermore, I will now provide several demonstrations that
  this type system is both ill-defined and incompatible with the type
  system of XSD.  In section 2.4, an example is given that constructs
  a new book element with different content than the book type already
  present in the type system. Because the XML Query type system
  differentiates types by name, a query cannot simultaneously use both
  book types. Moreover, there are issues in serializing out the
  results of such a query with its schema, because of the name
  collision. (You could try to work around this problem by
  auto-generating unique type names in the schema, but then you will
  not be able to roundtrip. You could work around this problem by
  creating a namespace for XML Query annotations in schemas, and
  adding annotations to the appinfo element in the XSD (or possibly
  attributes on the types themselves) that describes how to restore
  the original type names. But yuck, this is a terrible solution.)
  Also, not all types have a single name. Consider the simple example:

                for b in bib0/book do
                        b/author, b/title

  This algebraic expression is certainly well-formed, returning an
  expression with type author [ String ], title [ String ] . However,
  because this is a sequence type, it has no single (top-level
  element) name. Suppose we assign this expression to the variable
  v. Then what is the result of

                match v
                        case v1 : * do "case 1"
                        case v2 : *, * do "case 2"
                        case v3 : *, *, * do "case 3"
                        case v4 : author [ String ], title [ String ]
                        do "case 4"
                        else error

  Does the first case match? Or the second? Or does only the fourth
  case match? (to be completed)

  If you're going to use the XSD type system, use it exclusively. If
  the XSD type system isn't up to the challenge, get the schema
  working group to fix it while you still can. And finally, if all
  else fails, add or change only those things absolutely necessary to
  implement the XML Query Language.  Don't create yet another type
  system that implementors must support.  
<MR> 
  Our proposal is to use the XSD type system as implied by XSD and
  cooperate on the formalisation based on MSL.We see the type-related
  comments in this feedback also as potential input to the type
  formalisation work.
</MR>

  Issue #23: so how do types work, really? ( Section 2.4: Iteration,
  paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, type system) The return type of the first iteration
  is fully expanded into book[ author [ String ] { 1, *}, title [
  String ] ]{0, *} but in section 2.5 the return type is collapsed
  into Book{0,*}. How does the algebra system know which return type
  to use? It seems to me that in every heuristic imaginable, Book{0,*}
  is in-between the fully-expanded type and UrTree.  Your response is
  probably "it's Book because we declared it that way," in which case
  my follow-up question goes something like this: Consider a
  semi-redundant type definition such as type 
  Normal = (a [String ] | String | UrElement | UrTree) {0, *} 
  and an instance with this type
  let n : Normal = anything, it doesn't matter. Then what is the
  compile-time type of the expression 
  match n 
  case s : UrScalar do s
  case n : UrElement do n/* 
  else n?  

  If this example is not quite
  legal or is too trivial, tell me and I will come up with a better
  example that still demonstrates the same point.  2.5 Selection
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>


VI. Miscellaneous editorial issues
===================================

  0 Status
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
  Issue #3: Redmond, not Redmont ( Section 0: Status, paragraph 7
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  low priority, editorial) Redmond is misspelled Issue #4: accents (
<MF>Fixed.</MF>

  Section 0: Status, paragraph 11
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  low priority, editorial) These names are misspelled; they should
  contain accented characters (Fernandez, Simeon).
<MF>Fixed.</MF>

  Issue #13: compile-time vs. run-time ( Section 2.2: Projection,
  paragraph 8
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) You cannot use the words compile-time and run-time
  (which have important significance for implementations) without
  first defining them.  Furthermore, this is a different description
  than given previously for duplicates (where the distinction was made
  between the type of the expression, and the type of its value --
  which I guess you're trying to say are compile-time and run-time,
  respectively?). Also, the introduction of the notions compile-time
  and run-time has the appearance of constraining compliant
  implementations to compile first and then run, instead of executing
  XML queries in an interpreted way.
<MF>
  Changed "compile-time" to "query-analysis time" and "run-time" to
  "query-evaluation time" with clarification in text.  Also added
  [Issue-0084: Execution model].  
</MF>

  Issue #2: meaning of duplicate ( Section 2.2: Projection, paragraph
  3
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, editorial) I strongly object to the use of the word
  "duplicate" in this sentence. Duplicate has a precise meaning in
  XPath, which is different from its use here. An example of duplicate
  elements is the expression bib0/book/author, bib0/book/author which
  selects all the author elements twice. "Suciu", "Suciu" is an
  example of duplicate values, but not necessarily duplicate nodes.
  Changing the wording from "duplicate elements" to "duplicate values"
  is marginally better; my preference is to omit this phrase
  altogether (it adds nothing to the explanation of projection anyway,
  especially since the example already amply demonstrates the point,
  and therefore this phrase can only confuse the reader).
<MF>Fixed</MF>

  Issue #12: projection is simple? ( Section 2.2: Projection,
  paragraph 1
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, editorial) Since when is projection simpler than,
  say, addition? Also, this is an unusual use of the word simple,
  since you later decompose projection into other (more atomic, and
  therefore more simple) operations.
<MF>"The simplest operation" => "A basic operation..." </MF>

  Issue #5: prior work ( Section 1: Introduction, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  low priority, editorial) The prior work list should reference
  XSL(T). Did tree grammars not influence the committee at all? Where
  is the influence of NRA on this algebra? I cannot detect it in this
  spec.
<MF>XSLT citation added. See "Laws" for reference to NRAs and
comprehensions.</MF>

  Issue #6: hand waving ( Section 1: Introduction, paragraph 3
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, editorial) The claim that the algebra can capture
  many XML query languages is not substantiated. Which query
  languages, specifically, does it support?  Only many (not all)? If
  so, which ones are not supported and why?
<MF>Fixed.</MF>

  Issue #7: hand waving again ( Section 1: Introduction, paragraph 4
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, editorial) The claim that types can detect certain
  kinds of errors is not substantiated. Which errors, specifically,
  does this type system detect?  2 The Algebra by Example
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>Fixed in intro.</MF>

  Issue #19: fst, snd are superfluous ( Section 2.9: Querying order,
  paragraph 3
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, core) The fst, snd elements in each pair are
  superfluous (arguably, so is pair).  That is, instead of
  index(book0/author) returning pair [ fst [ 1 ], snd [ author [
  "Abiteboul" ] ] ], ...  as described in the spec (section 2.9,
  Querying Order), I suggested that it should return 1, author [
  "Abiteboul" ], ...

  I believe you responded that the reason it doesn't do this is
  because of typing.  In fact, the spec seems to get the compile-time
  type of this expression wrong.  It says that the type of
  index(book0/author) is

  pair [ fst [ Integer ] , snd [ author [ String ] ] ] {1, *}
  
  but I believe it should be
  
  pair [ fst [ Integer ] , snd [ ??? ] {0, *}
  
  (where ??? is replaced with something like UrTree).  The type of the
  argument to index() cannot be reflected in the (compile-time) type of
its
  output.  In that case, index() might as well return an expression
  with type

  (Integer, ???) { 0, *}
  
  And of course, if the spec is referring to the runtime-type and
therefore is
  correct, then index() should still return
  
  ( Integer, author [ String ] ) {1, *}
  
  Either way, the pair/fst/snd elements are superfluous.
  
  Also, the spec does not say whether this function is deterministic,
and if
  so, whether the integers returned must be a 1-based index of the
elements in
  the set.
  2.12 Expanded names
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
  The function index is builtin and therefore the Algebra can provide a
  more precise type rule that gives the most specific type for :
  
    index(book0/author) 
  
  which is:
  
    pair [ fst [ Integer ] , snd [ author [ String ] ] ] {1, *}
  
  Typically, when querying the local order of elements in a forest, the
  goal is to return, for example, the first element or some range of
elements. 
  Using the pair/fst/snd wrapper elements, it is trivial to return 
  the value associated with a particular index or range.  
  For example, to choose the first three authors:
  
    for v in index(book0/author) do 
      where v/fst/data() >= 1 && v/fst/data() <= 3 do v/snd/*
  
  Given your formulation, it is much more difficult to access the
  original value.  If you can explain how you would do so if
  index returns a value with type (Integer, author[String]){0,*}, that
  would be helpful.
</MF>

  Issue #57: variables are read-only ( Section 2.6: Quantification,
  paragraph 9
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Local variables are read-only. Once a value has been
  assigned to a variable introduced with let, there is no grammatical
  expression that reassigns it. If the meaning of let is altered to
  say that it "introduces or reassigns a value," then how does one
  reassign a global variable (versus just declaring a local variable
  that hides the global one while the local is in scope)?  <MR>This
  comment shows that we have to make clear that these are not
  variables in the procedural programming language sense</MR>
<MF> 
  The Algebra is a functional language and does not provide an
  assignment operator. Semantics of the let expressions is clarified
  in text and in operational semantics.
</MF>

  Issue #22: non-XPath compatible wildcard semantics ( Section 2.12:
  Expanded names, paragraph 11
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Making e/a shorthand for e/{*}a is a big
  mistake. You're using the same syntax as XPath but with a markedly
  different semantics. The expression e/a should have the same meaning
  it has in XPath, which is e/{Null}a. (XPath Spec, section 2.3,
  paragraph 3: "... if the QName does not have a prefix, then the
  namespace URI is null".)
<MF> 
  Good point.  Fixed.
</MF>

  Issue #56: path order different from XPath ( Section 4.1: Relating
  projection to iteration, paragraph 1
  <http://www.w3.org/XML/Group/2000/11/xmlquery-algebra20001204.html>;
  low priority, core) In XPath, the set value of a path is always
  ordered in document order or reverse-document order. In the XML
  Query Language, set order is arbitrary. Consequently, the meaning of
  the expression e/a can differ from its meaning in XPath. This will
  be especially troublesome when you add reverse axes; conversely,
  correctly implementing the reverse axes on top of the XML Query
  Language may be very difficult.
<MF> 
  Although the Algebra and XPath both have a '/' operator, it clearly
  does not have the same semantics in each language nor was the
  intention for both of them to have the same semantics.  A joint
  XSLT/Query task force is defining the semantics of XPath 2.0 in
  terms of the Algebra.  This effort will make the semantics of the
  XPath '/' operator precise.
</MF>

  Issue #32: wrong not equal operator ( Section 2.6: Quantification,
  paragraph 7
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, code example/sample) The third and fourth examples
  incorrectly use operator <> instead of operator !=.
<MF>Fixed.</MF>

  Issue #33: common idioms obfuscated ( Section 2.6: Quantification,
  paragraph 1
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, core) This algebra obfuscates common idioms. For
  example, the XPath x/y[@z=1] is a very common query. In this
  algebra, this query becomes for b in x/y do where b/@z/data()=1 do b
  when it's really select x/y where data(@z) = 1.
<MF>
  This comment is unclear.  The purpose of the Algebra is *not*
  to define a concise syntax for common idioms in XPath nor for common
  idioms in any query language.  Its purpose is to define precisely
  operators that can be used to define the expressive semantics of
  very concise XPath or XQuery expressions.  The Algebra
  is *not* the user-level language: XQuery serves that role.
</MF>

  Issue #58: variable scoping is not defined ( Section 2.6:
  Quantification, paragraph 9
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) The scope of local variables is not defined. Is the
  expression let v = e do let v2 = v do let v = e2 do v2 legal, and if
  so, what is its value?
<MF>Fixed.  Also defined precisely in operational semantics.</MF>

  Issue #9: forest definition ( Section 2.1: Data and types, paragraph
  8
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, type system) Is a forest ordered? The term sequence is
  used but has not been defined. In any case, these definitions belong
  in the data model spec, not in the algebra.
<MF>Resolved in data model document</MF>

  Issue #10: is it a forest? ( Section 2.1: Data and types, paragraph
  9
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, type system) Does the book type contain a forest? A
  forest of what? How do the all-grouped attributes fit into this?
<MF>Fixed</MF>

  Issue #14: attribute type ( Section 2.3: Atomic data, paragraph 4
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, code example/sample) What is the type of
  book0/@year? Presumably @year[ Integer], but then you should give
  this example before book0/@year/data().  2.4 Iteration
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>Fixed</MF>

  Issue #18: for and where grammars introduce ambiguity ( Section 2.5:
  Selection, paragraph 1
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, core) The grammar for nested for or where clauses
  introduces ambiguity (this is the common if-then-else problem). For
  example, consider the expression 
  for b in bib0/book do 
    for y in b/author do 
      where b/@year/data()<=2000 do 
        b 
  in which it is not
  immediately clear whether the where clause applies to the inner or
  the outer for loop. Although the greedy subrule does what most
  people expect in this case, there are variations in which it does
  not. (Again, this is the well-known if-then-else phenomenon.)

  Issue #59: execution model is not defined ( Section 2.6:
  Quantification, paragraph 9
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) The execution model is not defined. For example,
  what is the legal outcome of an infinitely recursing expression?
  Error? System lock-up? What happens when an undefined variable is
  used? When an undefined function is called? When a function with
  that name and number of arguments is defined, but the argument types
  do not match? Etc., etc.  2.7 Join
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
  Added [Issue-0084:  Execution model].
</MF>

  Issue #20: strange example ( Section 2.12: Expanded names, paragraph
  3
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, code example/sample) This example qualifies all the
  attributes with the same namespace as their parent. In practice,
  this is extremely unusual; attributes are usually placed in a
  namespace only when they are annotations from a different namespace
  than the element's. Supply the XML to which this example
  corresponds, and remember that attributes by default do not inherit
  the namespace of their element.
<MF>Fixed.</MF>

  Issue #21: wildcard types cannot be implemented ( Section 2.12:
  Expanded names, paragraph 11
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) If x!y means any name in x except names in y, what
  does x!y!z mean?  In general, how do ! and | operate (precedence,
  associativity)? Parentheses are required to force the desired
  grouping of these two operators. Also, what does x!* mean? (There's
  an infinite family of such examples.)
<MF>Added Issue-0085 : Semantics of Wildcard type</MF>

  Issue #29: Use empty namespace to indicate Null ( Section 2.12:
  Expanded names, paragraph 1
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, syntax) Use empty brackets {} instead of {Null} to
  denote the empty namespace.
<MF>Fixed.</MF>

  Issue #38: namespace URIs should be strings ( Section 2.12: Expanded
  names, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) All of the expanded name examples using URIs look
  like {http://uri}localname. The URI should be a string-valued
  expression, both so that it can be computed (as suggested by the
  grammar) and so that the expression can be readily parsed.  2.14
  Structural recursion
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>Fixed</MF>

  Issue #30: Empty choice symbol is strange ( Section 3.4: Types :
  abstract syntax, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, syntax) The empty choice symbol appears to be O
  (entity &#xD8;). Is this correct?  
<MR>This is nice to type but not nice for building a prototype
implementation.</MR>
<MF>Yes.</MF>

  Issue #35: type operator precedence undefined ( Section 3.4: Types :
  abstract syntax, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Type operator precedence is not defined (or
  inferrable from the grammar). I implemented repetition precedes
  choice precedes all precedes sequence (so sequence binds last) 3.6
  Expressions
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
  Added Issue-0082 : Type and expression operator precedence
</MF>

  Issue #36: parentheses not in type grammar ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) The type grammar omits parentheses as a grouping
  operator, but the HTML_body example (third example in section 2.15)
  uses parentheses to group.

<MF>
  Fixed in "Types : concrete syntax"
</MF>

  Issue #37: expression operator precedence undefined ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Operator precedence is not defined (or inferrable
  from the grammar) for expressions. I implemented projection precedes
  type precedes binary-op precedes sequence (so sequence binds
  last). Among the binary operators, I implemented (highest precedence
  to lowest): additive (+ -), relational(< > <= >=), equality(= !=),
  and, or.
<MF>
  Added Issue-0082 : Type and expression operator precedence
</MF>

  Issue #39: optional query keyword ( Section 3.6: Expressions,
  paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, core) Is query optional? None of the examples use it,
  and it seems like a query fragment should default to being a query.

  Issue #40: operator semantics not defined ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) None of the operator semantics are defined. What is
  2 + "2"? 2 > "1"? Etc., etc.  
<MR>
  This is mainly a reminder that this area needs urgent attention
</MR>
<MF>
  See Issue-0056 : Operators on Simple Types
</MF>

  Issue #42: missing arithmetic operators ( Section 3.6: Expressions,
  paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) There are no operators for multiplication, division,
  or modulo.
<MF>
  See Issue-0056 : Operators on Simple Types
</MF>

  Issue #46: name collisions ( Section 3.6: Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Name collisions are not explained. For example: Can
  a variable and a function have the same name?
<MF>
  Added Issue-0046 : Syntactic rules 
</MF>

  Issue #47: reserved keywords ( Section 3.6: Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Are all keywords reserved? If so, then how does one
  use them in path expressions, like error/@name? If not, then what
  are the rules to disambiguate (for example, how to distinguish the
  error keyword from an error identifier)?
<MF>
  Added Issue-0046 : Syntactic rules 
</MF>

  Issue #48: sort is wrong ( Section 3.6: Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Sort is described as a built-in function in the
  grammar, but as an expression sort Var in Expr by Expr in section
  2.10 Sorting.
<MF>Fixed</MF>

  Issue #50: error does not fit into the type system ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Supposedly every expression has a well-defined
  compile-time and run-time type. What is the type of error? Several
  examples in the spec look like fun some_function ( a : Type1) :
  Type2 = match a case x : Type3 do x else error suggesting that
  whatever the type of error is, it is compatible with every definable
  type.
<MF>
  The type of error is empty (Ø).  For its relationship to other
  types, see subsection : Equivalences and Subtyping.  An implementation
is
  free to decide how to evaluate the error expression : for example,
  it could raise an exception in the given programming environment.
</MF>


  Issue #54: function names should be computable ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, core) Constructor names are computeable, but
  function names are not.
<MF>
  The Algebra is a first order functional language, i.e., functions
  are not first-class values.  A function's value is determined at
  query-analysis time, therefore it is not possible to select or to
  compute dynamically a function to apply.
<MF>

  Issue #61: wrong grammar for expanded names ( Section 3.6:
  Expressions, paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, core) The grammar definition {Exp}Exp appears to be
  incorrect. Or is { a, b} () a legal expanded expression?  4
  Equivalences
  <http://www.w3.org/XML/Group/2000/11/xmlquery-algebra20001204.html>
  4.1 Relating projection to iteration
  <http://www.w3.org/XML/Group/2000/11/xmlquery-algebra20001204.html>
<MF>
  The type rule for {Exp1}Exp2 requires that Exp1 be a URI expression
  and Exp2 be an NCName.  
</MF>


VII. Issues related to absence of join operator
===============================================
<Resolution>
  Added [Issue-0087 : More examples of "joins"]

  Added citation on join evaluation.
  In particular, see:

  @article{graefe93,		  
  title="Query Evaluation Techniques for Large Databases",
  author="Goetz Graefe",
  journal="{ACM} Computing Surveys",
  volume="25(2)",		  
  pages="73--170",
  year=1993,
  month="June"}		  
</Resolution>

  Issue #27: no join operator ( Section 2.7: Join, paragraph 1
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, core) There is no join operation in this
  algebra. Joins are effected through nested for loops and possibly
  applying selection criteria to express the join. Implementations
  must analyze a query to determine if/when joins occur and which
  selection critieria (if any) are join conditions. However, this
  algorithm is not addressed or even acknowledged by the
  specification.

  Issue #28: crossproduct instead of a join ( Section 2.7: Join,
  paragraph 5
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) This section describes a diagonal of a
  cross-product, but the result structure does not reflect the
  join. Instead, it's just a filtered cross-product. When the join is
  one-to-many or many-to-many, the results will contain one element
  for every member of the (predicated) cross-product, instead of one
  member for each element/attribute in the outer forest.  The given
  example returns, for each distinct (book, review) pair whose titles
  match, one element. A result that reflects the join would return one
  element for each book, containing data from each review for that
  book (books joined with reviews).  Also, I want an example of a
  many-to-many join. What if multiple books matched multiple reviews
  (a many-to-many join)? The outer for loop iterates over one book at
  a time, so the result structure is the same as for a one-to-many
  join between books and reviews.

  Issue #60: many-many join not possible? ( Section 2.7: Join,
  paragraph 1
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  critical, core) Please demonstrate a many:many join. The example
  demonstrates a 1:many join. Demonstrate other types of joins, too;
  e.g., contrast an "inner join" with an "outer join".  
  2.9 Querying order
    <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MR> 
  I would like to extend this issue by adding that formulations of
  full outer-joins seem to be problematic 
</MR>
<MR>We currently work on adding more examples for the different kind 
of joins (see Peter Fankhauser's mail on this topic)</MR>

VII. Issues noted for future consideration
=========================================

  Issue #15: types everywhere ( Section 2: The Algebra by Example,
  paragraph 1
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, editorial) Readers will expect "The Algebra by
  Example" to focus on the algebraic operations, their syntax and
  semantics. Instead, the focus is on the type system throughout. For
  example, the sections on projection and iteration devote more
  paragraphs to explaining the types of their expressions than to the
  description of the expressions themselves. Put the type system in
  its own section, after The Algebra By Example. This change would
  vastly improve the flow of this specification.  2.1 Data and types
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
  This is a very helpful suggestion, but we cannot implement this change
  in the next draft.  It will be considered for a future draft.
<MF>

  Issue #16: describe wildcard during projection ( Section 2.2:
  Projection, paragraph 1
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  normal priority, editorial) After resolving issue 15, you should
  also enhance the description of projection to include namespaces,
  wildcards, etc.  2.3 Atomic data
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>
<MF>
  Future examples
</MF>

  Issue #51: no typeof() operator ( Section 3.6: Expressions,
  paragraph 2
  <http://www.w3.org/XML/Group/2000/12/xmlquery-algebra20001204.html>;
  high priority, core) There is no way to query the (arbitrary)
  run-time type of an expression. The algebra needs to be extended to
  allow expressions like: 

  fun some_function ( a : Type1, b : Type2) : Type3 = 
    match a case x : typeof(b) do x 
    else error 

  Note that this is not equivalent to 

    fun some_function ( a : Type1, b : Type2) : Type3 = 
    match a case x : Type2 do x else error

<MF>
  The data model accessor type() returns the run time type of a data
  model value.  Currently, there is no operator to compare two dynamic
  type expressions.
</MF> 

-- 
Mary Fernandez                    AT&T Labs - Research          
Principal Technical Staff Member  180 Park Ave., Bldg 103, E243 
mff@research.att.com		  Florham Park, NJ 07932-0971   
http://www.research.att.com/~mff  973-360-8679 FAX: 973-360-8187

Received on Wednesday, 7 February 2001 20:52:44 UTC