[closed] Re: MSFT presentation, feedback

From: Norman Walsh <ndw@nwalsh.com>
Date: Wed, 12 Dec 2007 15:26:46 -0500
To: public-xml-processing-model-wg@w3.org
Message-ID: <m2r6hrzmk9.fsf@nwalsh.com>

Interesting and useful, but not in V1.

/ ht@inf.ed.ac.uk (Henry S. Thompson) was heard to say:
| I gave a presentation on the Last Call WD to some of the XML people at
| MSFT on Friday, and got a pleasantly positive reception.
|
| One specific question they were interested in, looking towards very
| large scale data processing with parallel hardware available, was
| whether we supported Google 'map-reduce' style decomposition.  I
| mentioned the inherent parallelisability of the overall architecture,
| but realised we did not have anything which would directly support
| such decomposition.  Maybe we should consider it. . .
|
| We already have 'map' -- it's just for-each with a select pattern on
| its input.
|
| Here's an example of how it could be used along with a new 'reduce'
| construct:
|
| Stipulate we have a pipeline which can construct an index for a book
| chapter.  Here's how we index the whole book:
|
|  <for-each select='//chapter'>
|   [compute index]
|  </for-each>
|
|  <reduce name='r'>
|   <input port="seed">
|    <inline>
|     <bookIndex/>
|    </inline>
|   </input>
|
|   <merge-two-indices>
|    <input port='book'>
|     <pipe port='seed' step='r'/>
|    </input>
|   </merge-two-indices>
|
|  </reduce>
|
| where merge-two-indices has two inputs, primary a chapter index and
| secondary a book index, and one output, a new book index merging in
| the chapter index.
|
| reduce takes a primary sequence input, a secondary single input
| (the seed), and a subpipeline.  It runs the subpipeline repeatedly,
| supplying each member of the sequence in turn as the default input.
| The 'seed' input is the seed itself the first time, and the output of
| the previous round on each subsequent iteration.  The output is the
| output of the subpipeline from the last iteration.
|
| Such a construct would give us a way of addressing our current lack of
| open-ended/runtime input/output cardinality.
|
| ht
| --
|  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
|                      Half-time member of W3C Team
|     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
|             Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
|                    URL: http://www.ltg.ed.ac.uk/~ht/
| [mail really from me _always_ has this .sig -- mail without it is forged spam]
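
For anyone reading this in the archive: the reduce semantics quoted above
amount to an ordinary left fold over the for-each output. Below is a
minimal Python sketch of that reading, not XProc. It assumes a chapter is
a (number, text) pair and an index is a dict from term to chapter
numbers; index_chapter and merge_two_indices are hypothetical stand-ins
for the pipelines named in the message.

```python
from functools import reduce


def index_chapter(chapter):
    """Map step (hypothetical): index one chapter as {term: [chapter number]}."""
    number, text = chapter
    return {word: [number] for word in set(text.lower().split())}


def merge_two_indices(book_index, chapter_index):
    """Reduce step (hypothetical): merge a chapter index into the book index."""
    merged = {term: list(refs) for term, refs in book_index.items()}
    for term, refs in chapter_index.items():
        merged.setdefault(term, []).extend(refs)
    return merged


# Assumed sample data: three tiny "chapters".
chapters = [(1, "pipeline step"), (2, "step port"), (3, "port pipeline")]

# for-each with a select: one index per chapter.
chapter_indices = [index_chapter(c) for c in chapters]

# reduce: the empty dict plays the role of the empty bookIndex seed; each
# round receives the previous round's output as its 'seed' input.
book_index = reduce(merge_two_indices, chapter_indices, {})
```

With an empty sequence the fold simply returns the seed, which is one
plausible answer to what the construct should output when the for-each
matches nothing.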

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | Do not condemn the judgement of another
http://nwalsh.com/            | because it differs from your own. You
                              | may both be wrong.-- Dandemis

Received on Wednesday, 12 December 2007 20:27:02 GMT
