
[XQuery 1.1] XML-mapping support and explicit parallel FLWOR

From: Semyon Chertkov <semyonc@gmail.com>
Date: Tue, 21 Sep 2010 12:36:04 +0400
Message-ID: <AANLkTimVUc8AYG40UNFTSSr+TwEKPvz81bK2akLw0gLg@mail.gmail.com>
To: public-qt-comments@w3.org

I'd like to suggest incorporating two features into the new standard.
Their usefulness has been confirmed empirically in my implementation,
QueryMachine.XQuery.

1) I suggest adding support for XML file mapping to the XQuery language.
This task occurs often in practice. Although XQuery syntax allows a
mapping to be implemented with standard tools, the code becomes
unreadable and difficult to maintain.
In practice, its implementation always requires the use of XSL.

To make mapping more convenient, I propose extending the direct element
constructor syntax as described in the following grammar:

DirElemConstructor ::= "<" QName  MappingExpr? DirAttributeList
    ("/>" |   (">" DirElemContent* "</" QName S? ">"))        [96']
MappingExpr        ::= "[" PathExpr "]"                        [96'']

An optional MappingExpr argument is added to the constructor syntax.
When this expression is present, the XQuery processor implicitly wraps
the constructor in a FLWOR expression that iterates over PathExpr. The
current value of the loop variable becomes the context item for the
expressions inside the constructor.

Here are a few examples:

<a [(1 to 3)]>{.}</a>
=>
 <a>1</a>
 <a>2</a>
 <a>3</a>

<a[()]>..</a> => empty
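
For comparison, the first example above is meant to behave like the
following explicit loop in standard XQuery 1.0. This is only a sketch of
the intended semantics: the variable name $i is illustrative, and it
stands in for the context item that the mapping expression would set.

```xquery
(: explicit equivalent of <a [(1 to 3)]>{.}</a> :)
(: $i is an illustrative name, not part of the proposal :)
for $i in (1 to 3)
return <a>{ $i }</a>
```

With an empty sequence in place of (1 to 3) the loop body is never
evaluated, which matches the second example.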

Let's take a look at a more complex case.
Assume we have the following XML data file:

<?xml version = "1.0" encoding = "utf-8"?>
<doc>
   <a>text1</a>
   <b>text2</b>
   <table>
       <row>
          <column1>value1</column1>
          <column2>value2</column2>
      </row>
      <row>
         <column1>value3</column1>
         <column2>value4</column2>
      </row>
   </table>
</doc>

which needs to be transformed into another document:

<root>
   <field1>text1</field1>
   <detail atr1="text2">
      <line>
           <A>value1</A>
           <B>value2</B>
     </line>
     <line>
          <A>value3</A>
          <B>value4</B>
     </line>
   </detail>
</root>

Here is the XQuery query we have to write to perform this task:

let $doc := doc("data1.xml")/doc
return
  <root>
  {
     for $v in $doc/a
        return <field1> { $v/text() } </field1>,
     <detail>
     {
         for $a in ($doc/b)[1]
            return attribute atr1 { $a/text() },
         for $r in $doc/table/row return
            <line>
            {
                 for $v in $r/column1
                    return <A>{ $v/text() } </A>,
                 for $v in $r/column2
                    return <B>{ $v/text() } </B>

            }
            </line>
      }
      </detail>
  }
  </root>

The large number of for clauses comes from the fact that mapping programs
usually do not create an element in the destination file when the source
file contains no corresponding source element.

The suggested simplified syntax allows the above query to be written more
concisely:

 <root [doc("data1.xml")/doc]>
   <field1[a]>{text()}</field1>
   <detail>
      {
         for $a in b[1]
            return attribute atr1 { $a }
      }
      <line[table/row]>
           <A[column1]>{text()}</A>
           <B[column2]>{text()}</B>
     </line>
   </detail>
</root>

This mapping notation makes queries more transparent than nested
for loops and does not clutter the code with temporary variables,
because values are passed through the context and each mapping
expression is attached directly to the constructor that needs it.

2) I also suggest adding a parallel for operator to the XQuery language, so
that FLWOR expressions can be explicitly parallelized on multicore
processors. Using complex analysis algorithms is not always reasonable for
small, compact embedded libraries; explicit parallelization at the query
level, however, sometimes yields very good results.

Here's an XMark Q11 test sample:

let $auction := doc("auctions.xml") return
for $p in $auction/site/people/person
  let $l :=
     for $i in $auction/site/open_auctions/open_auction/initial
        where $p/profile/@income > 5000 * exactly-one($i/text())
    return $i
return <items name="{$p/name/text()}">{count($l)}</items>

The same query using the parallel statement:

let $auction := doc("auctions.xml") return
parallel for $p in $auction/site/people/person
  let $l :=
     for $i in $auction/site/open_auctions/open_auction/initial
        where $p/profile/@income > 5000 * exactly-one($i/text())
    return $i
return <items name="{$p/name/text()}">{count($l)}</items>

I'm currently experimenting with this XQuery extension in our
implementation.
It speeds up query execution by a factor of 2 to 3.

Thanks,
Semyon A. Chertkov
http://www.wmhelp.com
Received on Tuesday, 21 September 2010 08:38:41 GMT
