RE: The first five minutes ... a thought experiment (long)

I support this (I also come from a Cocoon background). I think an href on p:xslt would be very useful syntactic sugar.

However, the original stylesheet input port must not disappear, so that you keep the ability to generate stylesheets in your pipeline.
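
As a rough sketch of what the stylesheet port makes possible in XProc 1.0: one p:xslt step generates a stylesheet, and a second step reads it through a p:pipe binding. The file names config.xml and make-stylesheet.xsl and the step names are hypothetical, chosen only for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
    name="main" version="1.0">

    <p:input port="source"/>
    <p:output port="result"/>

    <!-- Generate a stylesheet from a (hypothetical) configuration document -->
    <p:xslt name="generate-stylesheet">
        <p:input port="source">
            <p:document href="config.xml"/>
        </p:input>
        <p:input port="stylesheet">
            <p:document href="make-stylesheet.xsl"/>
        </p:input>
        <p:input port="parameters">
            <p:empty/>
        </p:input>
    </p:xslt>

    <!-- Apply the generated stylesheet to the pipeline's own source -->
    <p:xslt>
        <p:input port="source">
            <p:pipe step="main" port="source"/>
        </p:input>
        <p:input port="stylesheet">
            <p:pipe step="generate-stylesheet" port="result"/>
        </p:input>
        <p:input port="parameters">
            <p:empty/>
        </p:input>
    </p:xslt>

</p:declare-step>
```

An href attribute alone could not express the second step, because the stylesheet there is a document flowing through the pipeline rather than a file on disk.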

Erik Siegel

-----Original message-----
From: Geert J. [mailto:geert.josten@dayon.nl]
Sent: Wednesday, 19 February 2014 8:49
To: James Fuller; XProc Dev
Subject: RE: The first five minutes ... a thought experiment (long)

A lot has been said, and unfortunately I still need to read up on most of it. But just a short reply on the example. My first stab at XProc (if I hadn't taken the unorthodox approach that I did, building my ebook proc) would have been:

<p:pipeline>
 <p:load href="myinput.xml"/>
 <p:xslt href="mytransform.xsl"/>
 <p:store href="myoutput.xml"/>
</p:pipeline>

This resembles the Cocoon sitemap approach a lot. And anyone used to Cocoon sitemaps knows how easy it is to tie Cocoon pipes to each other, whereas in XProc it involves lots of verbose syntax to point to specific step ports, which need to be in scope as well.

XProc doesn't do that badly, though. The only thing lacking here is the href on p:xslt. If that were present, you could easily chain lots of XSLTs as well, by simply repeating the p:xslt:

<p:pipeline>
 <p:load href="myinput.xml"/>
 <p:xslt href="mytransform.xsl"/>
 <p:xslt href="mytransform2.xsl"/>
 <p:xslt href="mytransform3.xsl"/>
 <p:store href="myoutput.xml"/>
</p:pipeline>
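
For comparison, a sketch of how the same chain has to be spelled out in XProc 1.0 as it stands, assuming the same file names and no stylesheet parameters. Each p:xslt reads its source from the preceding step by default, but the stylesheet and parameters ports must be bound explicitly:

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">

    <p:load href="myinput.xml"/>

    <p:xslt>
        <p:input port="stylesheet">
            <p:document href="mytransform.xsl"/>
        </p:input>
        <!-- No parameters are used, so the port is bound to nothing -->
        <p:input port="parameters">
            <p:empty/>
        </p:input>
    </p:xslt>

    <p:xslt>
        <p:input port="stylesheet">
            <p:document href="mytransform2.xsl"/>
        </p:input>
        <p:input port="parameters">
            <p:empty/>
        </p:input>
    </p:xslt>

    <p:xslt>
        <p:input port="stylesheet">
            <p:document href="mytransform3.xsl"/>
        </p:input>
        <p:input port="parameters">
            <p:empty/>
        </p:input>
    </p:xslt>

    <p:store href="myoutput.xml"/>

</p:declare-step>
```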

Maybe we should not only focus on the bad parts of XProc, but also on the good parts.

Cheers,
Geert

> -----Original message-----
> From: James Fuller [mailto:jim@webcomposite.com]
> Sent: Monday, 17 February 2014 14:41
> To: XProc Dev
> Subject: The first five minutes ... a thought experiment (long)
>
> Hello All,
>
> With the dust settling on XML Prague, I've tried to make a few 
> observations based on feedback collected over the weekend. For some of 
> the more involved thoughts, I will send through separate 
> communications over the coming days/weeks/months.
>
> But I thought I would 'shoot from the hip' on one topic, i.e. the crucial
> first five minutes of usage by someone investigating XProc for the
> very first time:
>
> I) People know and love pipelines and have a set of preconceptions 'in 
> wetware', before they come to XProc, about how pipelines should work.
>
> II) XProc balances off many engineering choices to handle the vagaries 
> of managing pipelines big and small; it's not trivial dealing with 
> pipelines that go beyond simple 'piping output from input' between 
> steps.
>
> Many, many people repeated to me that XProc does poorly in the first
> five minutes; in fact, it takes several sessions before basic concepts
> crystallize. Many people give up at this stage, but those who make it
> through turn into hard-core XProc users, as they have run up and over
> the learning curve.
>
> This 'unfriendly' first five minutes makes adoption beyond the XML
> hard core less likely. That being said, if we get the 'first five
> minutes' scenario right, then the broader group of all those Unix
> pipeline 'lovers' should be able to comprehend things quickly, and
> they will be happy to learn more if the return is worth it.
>
> I don't think we need to embark on some kind of wholesale reductionism 
> of basic XProc primitives, beyond what we have outlined already in 
> vnext spec. For example, Romain Deltour's recent email on 
> rationalizing inputs with options, while perceptive and well reasoned, 
> is a larger set of change we should probably avoid in v2 for reasons 
> of time/space and I think we can achieve the same effect, with less 
> 'cuts of the scalpel'.
>
> That being said, there are a lot of good ideas from Romain's email 
> that the WG will no doubt look deeply into (thx Romain for the brain 
> food!).
>
> As an experiment, let's run through an evolutionary series of XProc
> pipelines, loosely based on real-world examples from users met over
> the XML Prague weekend.
>
> ----------------------------------------------------------------------
> Single (or Multiple) XSLT transformation pipeline
> ----------------------------------------------------------------------
>
> Let's say we want to try a simple XSLT transform in XProc, where I
> provide some source, define an XSLT transform, and want to save the
> results to disk.
>
> I diligently brush up on all things XProc and fire up oXygenXML (or 
> download calabash) and come up with the following as my first stab at 
> a pipeline;
>
> <?xml version="1.0" encoding="UTF-8"?>
> <p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
>   xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
>
>   <p:xslt>
>     <p:input port="stylesheet">
>       <p:document href="rce2sp.xsl"/>
>     </p:input>
>   </p:xslt>
>
>   <p:store href="data.xml"/>
>
> </p:pipeline>
>
> I already had to take on board a few XProcisms, like the basic principles
> of port bindings and how documents flow through pipelines. I am unsure
> how to set the data input. I see p:document and learn that p:pipeline
> is a bit of syntactic sugar, so I quickly rewrite to:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
>   xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
>
>     <p:input port="source" sequence="false">
>         <p:document href="data.xml"/>
>     </p:input>
>     <p:output port="result"/>
>
>   <p:xslt>
>     <p:input port="stylesheet">
>       <p:document href="rce2sp.xsl"/>
>     </p:input>
>   </p:xslt>
>
>   <p:store href="data.xml"/>
>
> </p:declare-step>
>
> When I run this script, the XProc processor complains about the XSLT 
> step needing parameters. So I read up again, ask the interwebs, review 
> the mailing lists and come up with;
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
>     version="1.0">
>
>     <p:input port="source" sequence="false">
>         <p:document href="data.xml"/>
>     </p:input>
>
>     <p:output port="result"/>
>
>     <p:xslt>
>         <p:input port="stylesheet">
>             <p:document href="test.xsl"/>
>         </p:input>
>         <p:input port="parameters">
>             <p:empty/>
>         </p:input>
>     </p:xslt>
>
>     <p:store href="data.xml"/>
>
> </p:declare-step>
>
> I have no desire to use parameters, so I learn about the trick of 
> setting them to p:empty, which is strange. I still get an error about 
> unbound ports, hmmmm .... back to the docs ... read some more, learn 
> some more ....
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
>     version="1.0">
>
>     <p:input port="source" sequence="false">
>         <p:document href="data.xml"/>
>     </p:input>
>
>     <p:output port="result" sequence="true">
>         <p:empty/>
>     </p:output>
>
>     <p:xslt>
>         <p:input port="stylesheet">
>             <p:document href="test.xsl"/>
>         </p:input>
>         <p:input port="parameters">
>             <p:empty/>
>         </p:input>
>     </p:xslt>
>
>     <p:store href="output.xml"/>
>
> </p:declare-step>
>
> I run this and have successful output, but at this stage I don't
> understand a number of concepts. Some seem anachronistic, like: what's
> this about setting sequences on ports, or why do I have to set
> something to 'empty' for parameters? But some concepts run counter to
> my intuition about pipelines, where I expect some kind of output by
> default. By this stage, it's worrying that I have to somehow care about
> managing the end result port or be so explicit with my pipeline
> definition.
>
> Alternately, someone could have arrived at a different XProc script at 
> the start, for example;
>
> <p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
>     version="1.0">
>     <p:xslt>
>         <p:input port="stylesheet">
>             <p:document href="test.xsl"/>
>         </p:input>
>     </p:xslt>
> </p:pipeline>
>
> or this
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
>     version="1.0">
>     <p:input port="source"/>
>     <p:output port="result"/>
>     <p:input port="parameters" kind="parameter"/>
>     <p:xslt>
>         <p:input port="stylesheet">
>             <p:document href="test.xsl"/>
>         </p:input>
>     </p:xslt>
> </p:declare-step>
>
> but for these to run without error, one would have to know how to set
> command-line switches (or the oXygenXML setup) so that the parameters
> are set.
>
> The point of going through this evolution of xproc scripts, is to 
> remind us all that for newbies this process of learning typically 
> results in frustration, because;
>
> I) XProc's basic operation sometimes works differently than my
> preconceptions
>
> II) I have to learn many concepts before I get something running
>
> III) and/or I have to learn a few things about execution environment 
> (commandline options, oXygenXML setup)
>
> All of us, being lifelong autodidacts, are not afraid of learning, but
> there should be symmetry in the learning process ... all we are trying
> to do is run an XSLT transform and save its output.
>
> As it stands with XProc v1, we are asking people to do a lot more than
> what they can do today with some other easier-to-comprehend tool/utility.
>
> Stepping back, I think XProc v1 gets the hairy things right (hence the 
> previous caution of hacking away at it) because the WG worked through 
> many serious issues with much thoughtful debate underpinning design 
> decisions.
>
> So, what might be a better first-five-minute experience for the newbie
> user?
>
> I) Thought experiment #1
>
> <p:pipeline>
>    <p:xslt stylesheet-href="test.xsl"/>
> </p:pipeline>
>
> >xproc -p mypipeline.xpl data.xml
>
> * we could consider some kind of alt port mechanism where a p:document 
> href could be represented by a specially named option (uggg...)
> *  a shell script, called xproc, where we put the data flowing through 
> the pipeline 'front and centre'
> * default scenario should not require setting something to empty (like
> params)
>
>
> II) Thought experiment #2
>
> <p:pipeline>
>    <p:xslt stylesheet-href="test.xsl" result-href="step1out.xml"/>
>    <p:xslt stylesheet-href="test1.xsl"/>
>    <p:xslt stylesheet-href="test2.xsl" result-href="step2out.xml"/>
>    <p:xslt stylesheet-href="test3.xsl"/>
> </p:pipeline>
>
> >xproc -p mypipeline.xpl data.xml data2.xml
>
> * we could do some kind of syntax sugar by allowing p:document href to 
> be set with an option
> * we let data continue flowing through the pipeline as a default posture
> (multiple result output bindings), which would lessen the confusion caused
> by using p:store
> * let users easily 'dip' into the data stream and save intermediate 
> steps to make the process transparent and easy to debug
>
> III) Caveats
>
> CAVEAT #1 - I am not strongly advocating specifically doing I) or II), 
> this is 'shooting from the hip' type thinking and not fully baked.
>
> CAVEAT #2 - The WG is well aware of some of the problems (like
> parameters) and some parts of v2 requirements hopefully will address 
> those shortcomings
>
> CAVEAT #3 - To repeat, I think XProc v1 just needs the 'final mile' to 
> be carefully constructed and communicated, not wholesale changes.
>
> IV) Summary
>
> I am trying to convey how important it is to cater for the 'first five 
> minute' scenario. If we get this wrong in v2, then there is no 'first 
> day', 'first month' or 'first year' scenario.
>
> Any additional examples that illustrate the newbie's plight would be 
> most useful, as well as any additional comment.
>
> Jim Fuller

Received on Wednesday, 19 February 2014 08:39:59 UTC