Re: creation order vs. document order from Jan Hidders on 2003-09-24 (www-ql@w3.org from July to September 2003)

From: Jan Hidders <jan.hidders@ua.ac.be>
Date: Wed, 24 Sep 2003 16:25:01 +0200
To: www-ql@w3.org
Message-ID: <3F71A93D.2020202@ua.ac.be>

Peter Fankhauser wrote:

> Here's my try at what the formal semantics (and IPSI-XQ,
> as one of the implementations of XQuery that are closely aligned
> with the FS - at least for static reasoning) say.

Yes! Finally somebody who knows and understands the formal semantics. I
was already beginning to fear that my assignment to write a formal
specification of the XQuery semantics that the good people at ICDT or
PODS would accept as a proper formal definition was becoming a "mission
impossible". :-)

> Guided by the normalization rules in the FS,
> IPSI-XQ rewrites/normalizes
> 
> (<x><y id="1"/><y id="2"/></x>)/y

Oops, I just saw that I forgot the "," and that the /y is actually not 
relevant for the question.

> as follows:
> 
>   http://www.w3.org/TR/query-semantics:distinct-doc-order(
>     let $fs:sequence := (
>       element x {
>         element y { attribute id { "1" } },
>         element y { attribute id { "2" } }
>       }
>     )
>     return
>       let $fs:last := count($fs:sequence)
>       return
>         for $fs:dot at $fs:position in $fs:sequence
>         return child::y
>   )
> 
> The interesting part happens in the expression
> 
>   element y{ attribute id{"1"}},
>   element y{ attribute id{"2"}}
> 
> This is a "core sequence" expression;
> see: http://www.w3.org/TR/xquery-semantics/#sec_constructing_sequences
> 
> The dynamic semantics is given by:
> 
> "The dynamic semantics of the sequence expression follows. Each
> expression in the sequence is evaluated and the resulting values are
> concatenated into one sequence."
> 
> dynEnv |- Expr1 => Value1     dynEnv |- Expr2 => Value2  
> -------------------------------------------------------
> dynEnv |- Expr1, Expr2 => Value1, Value2
> 
> applied to our sequence expression we thus have:
> 
> dynEnv |- Expr1 => element y{ attribute id{"1"}}
> dynEnv |- Expr2 => element y{ attribute id{"2"}}
> -----------------------------------------------------------------
> dynEnv |- Expr1, Expr2 =>
>     element y{ attribute id{"1"}}, element y{ attribute id{"2"}}
>  
> The actual order (not the document order) appears to be well-defined
> (maybe too well;).

OK. Up to here I completely follow and understand what happens.
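
As a rough illustration of the sequence rule quoted above, here is a
minimal Python sketch. It is only a toy model and not taken from the FS
or from IPSI-XQ: values are modelled as flat Python lists, expressions
as zero-argument functions, and the names eval_sequence, y1 and y2 are
made up for the example.

```python
from typing import Callable, List

# Toy model (assumption): a "value" is a flat list of items and an
# "expression" is a zero-argument function that returns such a list.
Value = List[object]
Expr = Callable[[], Value]


def eval_sequence(expr1: Expr, expr2: Expr) -> Value:
    """Model of: dynEnv |- Expr1, Expr2 => Value1, Value2."""
    value1 = expr1()
    value2 = expr2()
    # Concatenation keeps the operand order and the result stays flat.
    return value1 + value2


# Hypothetical stand-ins for the two constructed y elements.
y1 = ("element", "y", {"id": "1"})
y2 = ("element", "y", {"id": "2"})

result = eval_sequence(lambda: [y1], lambda: [y2])
```

In this model the sequence order of the result is fully determined by
the order of the operands, which is the point made in the quoted text.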

> Next, we construct the element:
> 
> http://www.w3.org/TR/xquery-semantics/#doc-xquery-ElementConstructor
> 
> Element construction is a rather complex beast, because a lot of
> special cases to deal with mixed content, typed vs. untyped values
> etc. need to be taken into account, and a lot of implicit processing
> for namespaces, validation etc. needs to take place. In your example,
> IPSI-XQ simplifies a bit, because
> fs:item-sequence-to-node-sequence is an identity transformation in
> this case.

Really? If I read

    http://www.w3.org/TR/2003/WD-xquery-20030502/#id-computedElements

which defines the semantics of this function, it says:

>> For each node returned by the content expression, a new deep copy
>> of the node is constructed, including all its children, attributes,
>> and namespace nodes (if any).

If this is really the semantics then it is never the identity function.
Are you sure that if you let it be the identity function in some cases
you never run the risk of constructing fragments that share nodes?

> Anyway, the input order to the constructed x-element is
> not changed, and by becoming the content of an element, the 
> y-elements are now in document order within x.

Yes, but this document order is not necessarily the sequence order. If
we assume that the identity function was used for the content
construction, then this is the original document order of the y element
nodes, established when they were created, which we know to be
application dependent. But even if we do not assume the identity
function, the document order over the y element nodes is application
dependent and is established at the moment the deep copies are made.
So in both cases the resulting document order is independent of the
original sequence order and application dependent.
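
To make the argument concrete, here is a small Python sketch of the
reading described above. It is a toy model under stated assumptions,
not the FS itself: each node carries a hypothetical doc_pos field for
its place in document order, and the assign parameter stands in for
whatever choice an implementation makes when it copies a node. The
names Node, deep_copy_all and children_in_doc_order are invented for
the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Node:
    name: str
    node_id: str  # value of the id attribute
    doc_pos: int  # place in document order (toy stand-in)


def deep_copy_all(seq: List[Node], assign: Callable[[int], int]) -> List[Node]:
    """Copy each node of the content sequence. The document position of
    each copy comes from `assign`, which models the implementation's
    free choice; it is not constrained by the sequence order."""
    return [Node(n.name, n.node_id, assign(i)) for i, n in enumerate(seq)]


def children_in_doc_order(copies: List[Node]) -> List[str]:
    """Return the id values of the copies, sorted by document order."""
    return [n.node_id for n in sorted(copies, key=lambda n: n.doc_pos)]


# Content sequence in sequence order: id 1, then id 2.
content = [Node("y", "1", 0), Node("y", "2", 1)]

# One implementation numbers the copies in sequence order ...
forward = children_in_doc_order(deep_copy_all(content, lambda i: i))
# ... another, equally allowed in this model, numbers them in reverse.
backward = children_in_doc_order(deep_copy_all(content, lambda i: -i))
```

Both choices are consistent with a rule that only says "a new deep copy
is constructed", which is why the second ordering cannot be ruled out
under this reading.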

Ultimately that means that, at the moment, the current formal semantics
say that a possible result of the expression

    <x> <y id="1"/>, <y id="2"/> </x>

is

    <x> <y id="2"/> <y id="1"/> </x>

My suggestion would be to fix the semantics of the function 
fs:item-sequence-to-node-sequence by declaring that the document order 
over the copied nodes must be chosen such that it reflects the sequence 
order of the original nodes in the result of the content expression.
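
A minimal Python sketch of what such a constraint could look like,
again as a toy model and not as a proposed wording for the FS: the
doc_pos field and the global counter _fresh are assumptions made for
the example, and the Python function item_sequence_to_node_sequence is
only a stand-in for the FS function of the similar name.

```python
from dataclasses import dataclass
from itertools import count
from typing import Iterator, List


@dataclass(frozen=True)
class Node:
    name: str
    node_id: str  # value of the id attribute
    doc_pos: int  # place in document order (toy stand-in)


# Hypothetical supply of strictly increasing document positions.
_fresh: Iterator[int] = count()


def item_sequence_to_node_sequence(seq: List[Node]) -> List[Node]:
    """Deep-copy stand-in: copies receive strictly increasing document
    positions in the order in which the originals occur in `seq`."""
    return [Node(n.name, n.node_id, next(_fresh)) for n in seq]


# The originals happen to have id 2 before id 1 in document order,
# but the content sequence lists id 1 first.
content = [Node("y", "1", 7), Node("y", "2", 3)]

copies = item_sequence_to_node_sequence(content)
ids_in_doc_order = [n.node_id for n in sorted(copies, key=lambda n: n.doc_pos)]
```

With this constraint the document order of the copies follows the
sequence order of the content expression, whatever order the original
nodes had.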

-- 
    Jan Hidders

  .---------------------------------------------------------------------.
  | Post-doctoral researcher               e-mail: jan.hidders@ua.ac.be |
  | Dept. Math. & Computer Science         tel: (+32) 3 218 08 73       |
  | University of Antwerp                  fax: (+32) 3 218 07 77       |
  | Middelheimlaan 1, BE-2020 Antwerpen, BELGIUM     room: G 3.21       |
  `---------------------------------------------------------------------'
Received on Wednesday, 24 September 2003 10:24:34 UTC