- From: Alex Miłowski <alex@milowski.com>
- Date: Wed, 16 Mar 2016 13:21:36 -0700
- To: XProc WG <public-xml-processing-model-wg@w3.org>

Historically, viewports were something we decided we needed in order to
deal with large documents for streaming purposes. They have heritage in
various work done previously at Edinburgh, Markup Technologies, and my
own smallx pipeline engine. All three of those had some ability to
stream documents and viewport. I implemented a streaming subset of
XPath as well.

I think the use case for viewports can mostly be associated with
various XML-related tasks. Many of those tasks can be accomplished by
XSLT or XQuery but require loading the whole document. It is worth
noting that viewports, as currently implemented by most (if not all)
XProc processors, do not stream.

A viewport is essentially a special case of iteration. We can't easily
do the bookkeeping in a pipeline but I suspect there are clever ways to
do it without special language support. Meanwhile, a user might find it
easier (and faster) to just use a big hammer like an XSLT identity
transformation with a few additional rules. Yes, it will use more
memory but ... see above about no streaming XProc processors.

Also, we can't easily apply "Henry's rule" without more syntax.

While I think the idea of viewporting is a good one, it may not find
general use for a larger variety of data formats. Many of the
data-driven tasks where I have used viewports could just as easily be
done by iteration now.

I support the idea that we should table viewports - possibly
indefinitely - and focus on general iteration and other more important
bits.

On Wed, Mar 16, 2016 at 12:41 PM, James Fuller <jim@webcomposite.com> wrote:
> Thanks Alex great stuff.
>
> taking your strawman
>
> (A) 'An iteration of the sections that produces a sequence:'
>
> I think this is succinct and familiar
>
>    $doc//section ! filter()
>
> (B) An iteration of the sections that produces an empty sequence:
>
> why not empty parens to represent sink()
>
>    $doc//section ! ()
>
> we probably need to come up with a more realistic example.
>
> (C) 'replacement of sections'
>
>    $doc ~ //section ! filter() ~
>
> Personally would like to avoid spending weeks/months/years discussing
> C) ... I observe that viewports has taken up a huge amount of mental
> capital for the WG over the years and while it is very useful I think
> trying to solve it to achieve parity in the language is a rathole best
> avoided.
>
> I also would think ironing out data literals and templating could
> offer other routes for solving this. In any event, please ... let us
> timebox our efforts - I would propose 2 weeks (2 WG meetings) from
> today.
>
> After today's discussions, can we get agreement on:
>
> * as Alex proposed A)
> * as Alex proposed B)
> * use select()
>
> (for the record I agree with the above)
>
> I will try to write up example taking these into account as discussion
> point for next week WG meeting.
>
> J
>
> On 16 March 2016 at 20:09, Alex Miłowski <alex@milowski.com> wrote:
>> In the proposal we had iteration using the "bang operator" (!). For
>> example, if we have a sequence of documents in $seq and want to apply
>> a single "filter step":
>>
>>    $seq ! filter()
>>
>> which is nice because the left hand side is nominally always a
>> sequence and the semantics are clear in that the step signature (or
>> flow expression) must have a single input and single output. As such,
>> the result is a sequence of the same length.
>>
>> This is opposed to some syntax for a "for loop" where the inputs and
>> outputs need to be declared
>>
>>    for $x in $seq output $y
>>       $x → step() ≫ $y
>>
>> and block expressions become more difficult to bind:
>>
>>    for $x in $seq output $y
>>       $x → [$in,$out] { ... complicated stuff ... } ≫ $y
>>
>> I don't see the value in the above and we can't really apply "Henry's
>> rule" where the body of the for loop should be able to be a single
>> step invocation.
>>
>>
>> The "replace" or "viewport" has several issues with a single operator.
>>
>> 1. We need to always treat the left hand side as a sequence. We can
>> ignore the issue of whether there is implicit iteration for now
>> (probably shouldn't do that).
>>
>> 2. When the left hand side is a sequence produced by some
>> select/projection operation, the "bookkeeping" of the original
>> locations in the document is lost.
>>
>> 3. The operator needs an expression (e.g., the pattern to select)
>>
>> 4. The expression might be dynamically computed.
>>
>> In the above, (4) is a new requirement that we didn't address in the
>> previous XProc 2.0 draft. It feels like something we should consider.
>>
>> IMHO, we could do this well if select/projections were in the language
>> as a native construct:
>>
>> Straw man syntax:
>>
>> (A) An iteration of the sections that produces a sequence:
>>
>>    $doc//section ! filter()
>>
>> (B) An iteration of the sections that produces an empty sequence:
>>
>>    $doc//section ! sink()
>>
>> (C) replacement of sections
>>
>>    $doc ~ //section ! filter() ~
>>
>> The replacement always starts with something that generates a
>> sequence but we use the tilde to indicate where the bookkeeping starts
>> and ends.
>>
>> This would be identity (or an error):
>>
>>    $doc ~ //section ~
>>
>> This would be an error:
>>
>>    $doc ~ filter() ~
>>
>> (D) replacement with a step chain (a nice shorthand)
>>
>>    $doc ~ //section ! A() → B() → C() ~
>>
>> (E) replacement with a block expression:
>>
>>    $doc ~ //section ! [$in,$out] { $in → A() → B() → C() ≫ $out } ~
>>
>> (F) replacement that effectively deletes:
>>
>>    $doc ~ //section ! [$in,$out] { $in → A() → B() → C() } ~
>>
>> We could allow this or make it an error as we do in XProc 1.0
>>
>>
>> Comments:
>>
>> 1. The projections in (A) and (B) can be done with a step:
>>
>>    $doc → select("//section") ! filter()
>>
>> 2. Using a step for projections allows the expression to be computed:
>>
>>    $params → select("/option[@name='target']/@expr") ≫ $expr
>>    $doc → select($expr) ! filter()
>>
>> 3. We could compute expressions in replacements using a step for projections:
>>
>>    $doc ~ select($expr) ! filter() ~
>>
>> but now that select step has special semantics as the start of the
>> replacement.
>>
>> 4. Using step syntax for replacement is confusing (as we heard on the
>> call today).
>>
>> 5. Having projections in the language as a native syntax is predicated
>> on having an expression language natively in the language (see other
>> discussion on that).
>>
>> --
>> --Alex Miłowski
>> "The excellence of grammar as a guide is proportional to the paucity of the
>> inflexions, i.e. to the degree of analysis effected by the language
>> considered."
>>
>> Bertrand Russell in a footnote of Principles of Mathematics
>>

--
--Alex Miłowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
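
The difference between iteration (A) and replacement (C) in the thread above comes down to who keeps track of where each selected node lived. A minimal sketch in Python may make that concrete. It is not XProc and not any proposed syntax; the names `iterate`, `viewport`, and `shout` are hypothetical, and it assumes the selected nodes are not nested inside one another and that the document element itself is never selected.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# A "step" here is just a function with one input and one output.
Step = Callable[[ET.Element], ET.Element]


def iterate(doc: ET.Element, path: str, step: Step) -> List[ET.Element]:
    """Roughly (A): apply step to each selected node and return the
    results as a sequence. The original positions are forgotten."""
    return [step(node) for node in doc.findall(path)]


def viewport(doc: ET.Element, path: str, step: Step) -> ET.Element:
    """Roughly (C): apply step to each selected node and put the result
    back where the node came from. Assumes non-nested matches."""
    # ElementTree has no parent pointers, so record them up front.
    # This map is the "bookkeeping" that plain iteration does not have.
    parent_of = {child: parent for parent in doc.iter() for child in parent}
    for node in doc.findall(path):
        parent = parent_of.get(node)
        if parent is None:
            raise ValueError("cannot replace the document element")
        index = list(parent).index(node)
        replacement = step(node)
        replacement.tail = node.tail  # keep the text that followed the node
        parent[index] = replacement
    return doc


def shout(section: ET.Element) -> ET.Element:
    """Hypothetical filter step: a new element with upper-cased text."""
    out = ET.Element(section.tag, section.attrib)
    out.text = (section.text or "").upper()
    return out


doc = ET.fromstring(
    "<doc><title>t</title><section>one</section>"
    "<p>x</p><section>two</section></doc>"
)

seq = iterate(doc, ".//section", shout)      # doc itself is left untouched
print([s.text for s in seq])

viewport(doc, ".//section", shout)           # doc is rewritten in place
print(ET.tostring(doc, encoding="unicode"))
```

In this sketch the iteration returns a sequence the same length as the selection, while the viewport needs the extra parent map to put each result back, and it holds the whole document in memory, so it does not stream.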
Received on Wednesday, 16 March 2016 20:22:06 UTC