Unifying Iteration and Viewports

In the proposal we had iteration using the "bang operator" (!).  For
example, if we have a sequence of documents in $seq and want to apply
a single "filter step":

   $seq ! filter()

which is nice because the left-hand side is nominally always a
sequence, and the semantics are clear: the step signature (or flow
expression) must have a single input and a single output.  As such,
the result is a sequence of the same length.
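
As a rough model of those semantics (not the proposal's syntax), the
bang operator behaves like a map over the sequence. The sketch below
is in Python; the names `bang` and `upper_filter` are hypothetical,
and documents are stood in for by plain strings.

```python
from typing import Callable, List, TypeVar

D = TypeVar("D")


def bang(seq: List[D], step: Callable[[D], D]) -> List[D]:
    """Model of `$seq ! step()`: run a one-in/one-out step on each item."""
    return [step(doc) for doc in seq]


def upper_filter(doc: str) -> str:
    # Hypothetical stand-in for filter(); a real step would take a document.
    return doc.upper()


seq = ["a", "b", "c"]
result = bang(seq, upper_filter)
print(result)
```

Because the step has exactly one input and one output, the result has
one item per input item, which is the "same length" property.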

This is opposed to some syntax for a "for loop" where the inputs and
outputs need to be declared:

  for $x in $seq output $y  $x → step() ≫ $y

and block expressions become more difficult to bind:

  for $x in $seq output $y  $x → [$in,$out] { ... complicated stuff ... } ≫ $y

I don't see the value in the above and we can't really apply "Henry's
rule" where the body of the for loop should be able to be a single
step invocation.


Doing "replace" or "viewport" with a single operator raises several issues:

1. We need to always treat the left-hand side as a sequence.  We can
ignore the issue of whether there is implicit iteration for now
(probably shouldn't do that).

2. When the left-hand side is a sequence produced by some
select/projection operation, the "bookkeeping" of the original
locations in the document is lost.

3. The operator needs an expression (e.g., the pattern to select).

4. The expression might be dynamically computed.

In the above, (4) is a new requirement that we didn't address in the
previous XProc 2.0 draft.  It feels like something we should consider.

IMHO, we could do this well if select/projections were in the language
as a native construct.

Straw man syntax:

(A) An iteration of the sections that produces a sequence:

   $doc//section ! filter()

(B) An iteration of the sections that produces an empty sequence:

   $doc//section ! sink()

(C) replacement of sections

  $doc ~ //section ! filter() ~

  The replacement always starts with something that generates a
sequence but we use the tilde to indicate where the bookkeeping starts
and ends.

  This would be identity (or an error):

  $doc ~ //section ~

  This would be an error:

  $doc ~ filter() ~

(D) replacement with a step chain (a nice shorthand)

  $doc ~ //section ! A() → B() → C() ~

(E) replacement with a block expression:

  $doc ~ //section ! [$in,$out] { $in → A() → B() → C() ≫ $out } ~

(F) replacement that effectively deletes:

  $doc ~ //section ! [$in,$out] { $in → A() → B() → C()  } ~

   We could allow this or make it an error as we do in XProc 1.0.
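
To make the bookkeeping in (C) through (F) concrete, here is a minimal
Python sketch of what the tilde pair might do, using ElementTree. It
is only a model under stated assumptions: `viewport` and `mark` are
hypothetical names, matches are assumed not to be nested, a match on
the document root is skipped, and ElementTree wants `.//section`
rather than `//section`. A step that returns nothing is treated as a
deletion, which is the "allow it" reading of (F).

```python
import copy
import xml.etree.ElementTree as ET
from typing import Callable, Optional

Step = Callable[[ET.Element], Optional[ET.Element]]


def viewport(doc: ET.Element, path: str, step: Step) -> ET.Element:
    """Model of `$doc ~ path ! step() ~`: replace each match in place."""
    result = copy.deepcopy(doc)  # leave the input document untouched
    # Bookkeeping: record every element's parent before any step runs.
    parent_of = {child: parent for parent in result.iter() for child in parent}
    for match in result.findall(path):
        parent = parent_of.get(match)
        if parent is None:
            continue  # assumption: the root itself is never replaced
        index = list(parent).index(match)
        replacement = step(match)
        parent.remove(match)
        if replacement is not None:
            replacement.tail = match.tail  # keep trailing text in place
            parent.insert(index, replacement)
    return result


def mark(section: ET.Element) -> ET.Element:
    # Hypothetical stand-in for filter(): tags the section it was given.
    out = ET.Element("section", {"filtered": "true"})
    out.text = section.text
    return out


doc = ET.fromstring(
    "<doc><title>T</title><section>a</section><section>b</section></doc>"
)
replaced = viewport(doc, ".//section", mark)            # like (C)
deleted = viewport(doc, ".//section", lambda s: None)   # like (F)
unchanged = viewport(doc, ".//section", lambda s: s)    # identity case
print(ET.tostring(replaced, encoding="unicode"))
```

The point of the sketch is that the parent and index have to be
captured before the step runs, which is exactly the information a
plain projection throws away.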


Comments:

1. The projections in (A) and (B) can be done with a step:

   $doc → select("//section") ! filter()

2. Using a step for projections allows the expression to be computed:

   $params → select("/option[@name='target']/@expr") ≫ $expr
   $doc → select($expr) ! filter()

3. We could compute expressions in replacements using a step for projections:

   $doc ~ select($expr)  ! filter() ~

   but now that select step has special semantics as the start of the
replacement.

4. Using step syntax for replacement is confusing (as we heard on the
call today).

5. Having projections in the language as a native syntax is predicated
on the language having a native expression language (see other
discussion on that).
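
For comment 2, the computed-expression case can be modeled roughly as
follows in Python. This is a sketch, not proposed semantics: `select`
and `text_filter` are hypothetical helpers, the parameter document is
made up for illustration, and since ElementTree supports only a subset
of XPath the attribute is read with `get()` instead of `/@expr`.

```python
import xml.etree.ElementTree as ET
from typing import List


def select(source: ET.Element, path: str) -> List[ET.Element]:
    """Model of a select(path) step: return the matches as a sequence."""
    return source.findall(path)


def text_filter(section: ET.Element) -> str:
    # Hypothetical stand-in for filter(); returns the upper-cased text.
    return (section.text or "").upper()


# Made-up inputs for illustration only.
params = ET.fromstring(
    "<params><option name='target' expr='.//section'/></params>"
)
doc = ET.fromstring(
    "<doc><section>a</section><note>n</note><section>b</section></doc>"
)

# $params -> select("/option[@name='target']/@expr") >> $expr
expr = select(params, "option[@name='target']")[0].get("expr")

# $doc -> select($expr) ! filter()
result = [text_filter(s) for s in select(doc, expr)]
print(expr, result)
```

Nothing here needs special semantics, because the result is just a
sequence; the difficulty in comment 3 only appears once the same
select has to carry locations back into the document.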

-- 
--Alex Miłowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics

Received on Wednesday, 16 March 2016 19:10:19 UTC