Fixed Windows use case from Innovimax SARL on 2007-05-07 (public-xml-processing-model-wg@w3.org from May 2007)

From: Innovimax SARL <innovimax@gmail.com>
Date: Mon, 7 May 2007 20:06:34 +0200
To: "XProc WG" <public-xml-processing-model-wg@w3.org>
Message-ID: <546c6c1c0705071106x1a3213a4ye6c2b752e61f0a1b@mail.gmail.com>
Dear all,

Some small thoughts

Today, p:for-each and p:viewport let us process a big document
piece by piece.

Say I have
<doc>
  <chapter>
  </chapter>
  <chapter>
  </chapter>
  :
</doc>

with each chapter big enough to be processed on its own, I can do
<p:viewport match="chapter">
   <my:funky-process>
   </my:funky-process>
</p:viewport>


But imagine another real use case where I have a similar structure
<div>
  <block>
  </block>
  <block>
  </block>
  :
  <block>
  </block>
</div>
but with blocks that are small but numerous.

If I want to process them in groups, there is no simple way to do that today.
I have the choice to

<p:viewport match="block">
   <my:funky-process>
   </my:funky-process>
</p:viewport>

and lose the connection between the blocks. Or to

<p:viewport match="div">
   <my:funky-process>
   </my:funky-process>
</p:viewport>

and process a huge file.

What I want is to be able to group some of the blocks and process them

For that, I would like <p:wrap> to support grouping.

In that case I would do

<p:wrap>
  <p:option name="name" value="my:wrapper"/>
  :
</p:wrap>
<p:viewport match="my:wrapper">
   <my:funky-process-grouped>
   </my:funky-process-grouped>
</p:viewport>
<p:unwrap>
  <p:option name="match" value="my:wrapper"/>
</p:unwrap>


Here are some ideas for new options on <p:wrap>:
* group-adjacent-matches (true|false; default=false)
* include-ignored-nodes (XPath Pattern; default='/..'; or maybe
self::text()[string-length(normalize-space(.))=0])
* break-before-match (XPath Pattern; default='*')
* break-after-match (XPath Pattern; default='*')

The process is the following:
Process the document in document order.
[1] If the current node matches the pattern given by the "match" option,
then:
* if group-adjacent-matches=false, wrap the current node in an element with
the QName given by the "name" option and go to step 1
* if group-adjacent-matches=true, wrap the current node in an element with
the QName given by the "name" option, together with its following siblings,
until
** a sibling matches neither the pattern in "match" nor the pattern in
"include-ignored-nodes"
** or a sibling matches the pattern in "break-before-match" (in
which case this sibling is excluded from the group)
** or a sibling matches the pattern in "break-after-match" (in
which case this sibling is included in the group)
etc.
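
As a rough illustration only (not part of any XProc draft), the steps above can be sketched in Python with the standard xml.etree.ElementTree module. The function name wrap_matches and its parameters are hypothetical; break_before and break_after are assumed to receive the candidate node and the current group size, and whitespace-only text (the include-ignored-nodes case) is not modelled separately because ElementTree keeps it attached to the preceding element.

```python
import xml.etree.ElementTree as ET


def wrap_matches(parent, wrapper_tag, is_match, group_adjacent=False,
                 break_before=None, break_after=None):
    """Wrap matching children of `parent` in `wrapper_tag` elements.

    All names here are hypothetical stand-ins for the proposed options:
    is_match ~ "match", group_adjacent ~ "group-adjacent-matches",
    break_before / break_after ~ "break-before-match" / "break-after-match".
    """
    new_children = []
    group = None  # wrapper currently being filled, or None

    for child in list(parent):
        if not is_match(child):
            # A non-matching sibling ends the current group and is kept as-is.
            group = None
            new_children.append(child)
            continue
        # break-before: close the group, the child starts the next one.
        if group is not None and break_before and break_before(child, len(group)):
            group = None
        if group is None:
            group = ET.Element(wrapper_tag)
            new_children.append(group)
        group.append(child)
        # Without grouping, or on break-after, the child closes its own group.
        if not group_adjacent or (break_after and break_after(child, len(group))):
            group = None

    parent[:] = new_children
    return parent


doc = ET.fromstring("<div><block/><block/><note/><block/></div>")
wrap_matches(doc, "wrapper", lambda e: e.tag == "block", group_adjacent=True)
print([(c.tag, len(c)) for c in doc])
# [('wrapper', 2), ('note', 0), ('wrapper', 1)]
```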



I propose to make the following available when evaluating the last three
options:
1) the context, which is the current match
2) $p:previous-match, the last matched node
3) $p:match-count, the number of matched elements that have been processed
4) $p:current-subgroup-in-match-count where
5) $p:ignored-nodes-in-match-count

In the previous example I could do

<p:wrap>
  <p:option name="name" value="my:wrapper"/>
  <p:option name="match" value="/div/block"/>
  <p:option name="group-adjacent-matches" value="true"/>
  <p:option name="include-ignored-nodes"
select="self::text()[string-length(normalize-space(.))=0]"/>
  <p:option name="break-before-match"
select="block[$p:current-group-count > 10]"/>
  <p:option name="break-after-match" select="/.."/>
</p:wrap>
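
To make the intended result concrete, here is a minimal Python sketch of the fixed-window grouping, under the assumption that a group should hold at most `size` blocks (the exact boundary implied by the count test above is left open). The function wrap_in_windows and the plain tag name "wrapper" are hypothetical; only the standard xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET


def wrap_in_windows(parent, match_tag, wrapper_tag, size):
    """Wrap runs of adjacent `match_tag` children in wrappers of at most `size`."""
    new_children = []
    group = None  # wrapper currently being filled, or None
    for child in list(parent):
        if child.tag != match_tag:
            # Anything else interrupts the run and is kept unwrapped.
            group = None
            new_children.append(child)
            continue
        # Start a new window when there is none or the current one is full.
        if group is None or len(group) >= size:
            group = ET.Element(wrapper_tag)
            new_children.append(group)
        group.append(child)
    parent[:] = new_children
    return parent


div = ET.fromstring("<div>" + "<block/>" * 25 + "</div>")
wrap_in_windows(div, "block", "wrapper", size=10)
print([len(w) for w in div])
# [10, 10, 5]
```

Each wrapper could then be handed to a viewport-style step and removed again afterwards, as in the wrap / viewport / unwrap sequence shown earlier.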

Another example is a big flat document
<body>
  <h1>...</h1>
  <p>.......</p>
  <h2>...</h2>
  <p>.......</p>
  <h1>...</h1>
  <p>.......</p>
  <h2>...</h2>
  <p>.......</p>
  :
</body>

<p:wrap>
  <p:option name="name" value="section1"/>
  <p:option name="match" value="/body/*"/>
  <p:option name="group-adjacent-matches" value="true"/>
  <p:option name="include-ignored-nodes"
select="self::text()[string-length(normalize-space(.))=0]"/>
  <p:option name="break-before-match" select="h1"/>
  <p:option name="break-after-match" select="/.."/>
</p:wrap>
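
The expected effect of this second example can be sketched in Python as well. This is only an approximation under stated assumptions: the hypothetical helper wrap_sections starts a new wrapper at every heading_tag child (the break-before-match behaviour) and puts any leading children before the first heading into a wrapper of their own.

```python
import xml.etree.ElementTree as ET


def wrap_sections(parent, heading_tag, wrapper_tag):
    """Group a flat child list into wrappers, each starting at a heading."""
    new_children = []
    section = None  # wrapper currently being filled, or None
    for child in list(parent):
        # A heading (or the very first child) opens a new section.
        if child.tag == heading_tag or section is None:
            section = ET.Element(wrapper_tag)
            new_children.append(section)
        section.append(child)
    parent[:] = new_children
    return parent


body = ET.fromstring(
    "<body><h1>A</h1><p>a</p><h2>A.1</h2><p>b</p><h1>B</h1><p>c</p></body>"
)
wrap_sections(body, "h1", "section1")
print([[c.tag for c in s] for s in body])
# [['h1', 'p', 'h2', 'p'], ['h1', 'p']]
```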

How do you feel about this proposal?

Mohamed

-- 
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 8 72 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 €
Received on Monday, 7 May 2007 18:06:41 UTC