Re: The first five minutes ... and the next 50,000 minutes

Unfortunately, I have to disagree strongly that this is 
just a "first five minutes" problem.

In my case, XProc lost a quarter of my development team at 500 
minutes, another quarter at 5,000 minutes, and the remaining die-hards at 
50,000 minutes!

To be clear, these are different issues:

     A) the first type of loss (at 5 minutes or 500 minutes) is due to 
the "significant barrier to comprehension"

     B) the second type of loss (at 50,000 minutes) is due to an 
"inelegance" of the language. In other words, a lack of tight 
consistent use of semantics and features that should provide simplicity 
and power.

When a language suffers from "type A", people are slow to adopt it. But 
when a language suffers from "type B", the few who endure the learning 
curve abandon it as a poor investment of time.

My own development team definitely found we were better off crafting our 
own solution than suffering through a longer-term investment in XProc. I 
wrote about my team's experience just before last year's XML Prague 
(http://lists.w3.org/Archives/Public/xproc-dev/2013Feb/0005.html). One 
strong indication for me of a "type B" failure in a language is what I 
call the "9 month rule". In other words, if I write a reasonable chunk of 
code (1,000+ lines) and use the best programming conventions the 
language allows me, can I easily understand my own code after I put it 
aside for 9 months? Time and again, XProc has failed the "9 month rule" 
for me.

Writing a programming language is hard; writing a good programming 
language is extremely hard. To my knowledge, a good programming 
language is always the product of iteration (whether in its own native 
form or in its ancestors).

So in my humble opinion, XProc will limp along as the pet project of the 
XML elite until there is a "wholesale reductionism" or "significant 
evolution" of the syntax.

Regards,

Christopher


On 2/17/2014 8:40 AM, James Fuller wrote:
> Hello All,
>
> With the dust settling on XML Prague, I've tried to make a few
> observations based on feedback collected over the weekend. For some of
> the more involved thoughts, I will send through separate
> communications over the coming days/weeks/months.
>
> But I thought I would 'shoot from the hip' on one topic, i.e. the
> crucial first five minutes of usage by someone investigating XProc for
> the very first time:
>
> I) People know and love pipelines and have a set of preconceptions 'in
> wetware', before they come to XProc, about how pipelines should work.
>
> II) XProc balances off many engineering choices to handle the vagaries
> of managing pipelines big and small; it's not trivial dealing with
> pipelines that go beyond simple 'piping output from input' between
> steps.
>
> Many, many people repeated to me that XProc does poorly in the first
> five minutes, in fact, it takes several sessions before basic concepts
> crystallize. Many people give up at this stage but those that make it
> through, turn into hard core XProc users, as they have run up and over
> the learning curve.
>
> This 'unfriendly' first five minutes makes adoption beyond the XML
> hard core less likely. That being said, if
> we get the 'first five minutes' scenario right, then the broader group
> of all those unix pipeline 'lovers' should be able to comprehend
> things quickly and they will be happy to learn more if the return is
> worth it.
>
> I don't think we need to embark on some kind of wholesale reductionism
> of basic XProc primitives, beyond what we have outlined already in
> vnext spec. For example, Romain Deltour's recent email on
> rationalizing inputs with options, while perceptive and well reasoned,
> is a larger set of change we should probably avoid in v2 for reasons
> of time/space and I think we can achieve the same effect, with less
> 'cuts of the scalpel'.
>
> That being said, there are a lot of good ideas from Romain's email
> that the WG will no doubt look deeply into (thx Romain for the brain
> food!).
>
> As an experiment, let's run through an evolutionary series of XProc
> pipelines, loosely based on real-world examples from users I met over
> the XML Prague weekend.
>
> ----------------------------------------------------------------------
> Single (or Multiple) XSLT transformation pipeline
> ----------------------------------------------------------------------
>
> Let's say we want to try out a simple XSLT transform in XProc,
> where I provide some source, define an XSLT transform, and want to
> save the results to disk.
>
> I diligently brush up on all things XProc and fire up oXygenXML (or
> download calabash) and come up with the following as my first stab at
> a pipeline;
>
> <?xml version="1.0" encoding="UTF-8"?>
> <p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
>    xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
>
>    <p:xslt>
>      <p:input port="stylesheet">
>        <p:document href="rce2sp.xsl"/>
>      </p:input>
>    </p:xslt>
>
>    <p:store href="data.xml"/>
>
> </p:pipeline>
>
> I already had to take on board a few XProcisms like basic principles
> of port bindings and how documents flow through pipelines. I am unsure
> of how to set data input, I see p:document and learn about p:pipeline
> being a bit of syntactic sugar, so I quickly rewrite to
>
> <?xml version="1.0" encoding="UTF-8"?>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
>    xmlns:c="http://www.w3.org/ns/xproc-step" version="1.0">
>
>      <p:input port="source" sequence="false">
>          <p:document href="data.xml"/>
>      </p:input>
>      <p:output port="result"/>
>
>    <p:xslt>
>      <p:input port="stylesheet">
>        <p:document href="rce2sp.xsl"/>
>      </p:input>
>    </p:xslt>
>
>    <p:store href="data.xml"/>
>
> </p:declare-step>
>
> When I run this script, the XProc processor complains about the XSLT
> step needing parameters. So I read up again, ask the interwebs, review
> the mailing lists and come up with;
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
>      version="1.0">
>
>      <p:input port="source" sequence="false">
>          <p:document href="data.xml"/>
>      </p:input>
>
>      <p:output port="result"/>
>
>      <p:xslt>
>          <p:input port="stylesheet">
>              <p:document href="test.xsl"/>
>          </p:input>
>          <p:input port="parameters">
>              <p:empty/>
>          </p:input>
>      </p:xslt>
>
>      <p:store href="data.xml"/>
>
> </p:declare-step>
>
> I have no desire to use parameters, so I learn about the trick of
> setting them to p:empty, which is strange. I still get an error about
> unbound ports, hmmmm .... back to the docs ... read some more, learn
> some more ....
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
>      version="1.0">
>
>      <p:input port="source" sequence="false">
>          <p:document href="data.xml"/>
>      </p:input>
>
>      <p:output port="result" sequence="true">
>          <p:empty/>
>      </p:output>
>
>      <p:xslt>
>          <p:input port="stylesheet">
>              <p:document href="test.xsl"/>
>          </p:input>
>          <p:input port="parameters">
>              <p:empty/>
>          </p:input>
>      </p:xslt>
>
>      <p:store href="output.xml"/>
>
> </p:declare-step>
>
> I run this and have successful output, but at this stage, I don't
> understand a number of concepts ... some are anachronistic, like:
> what's this about setting sequences on ports, or why do I have to set
> something to 'empty' for parameters? But some concepts run counter to
> my intuition about pipelines, where I expect some kind of output by
> default. By this stage, it's worrying that I have to somehow care about
> managing the end result port or be so explicit with my pipeline
> definition.
>
> Alternately, someone could have arrived at a different XProc script at
> the start, for example;
>
> <p:pipeline xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
>      version="1.0">
>      <p:xslt>
>          <p:input port="stylesheet">
>              <p:document href="test.xsl"/>
>          </p:input>
>      </p:xslt>
> </p:pipeline>
>
> or this
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
>      version="1.0">
>      <p:input port="source"/>
>      <p:output port="result"/>
>      <p:input port="parameters" kind="parameter"/>
>      <p:xslt>
>          <p:input port="stylesheet">
>              <p:document href="test.xsl"/>
>          </p:input>
>      </p:xslt>
> </p:declare-step>
>
> but for these to run without error, one would have to know how to set
> commandline switches (or oXygenXML setup) so that parameters are set,
> to get this running correctly.
>
> The point of going through this evolution of xproc scripts, is to
> remind us all that for newbies this process of learning typically
> results in frustration, because;
>
> I) XProc's basic operation sometimes works differently than my preconceptions
>
> II) I have to learn many concepts before I get something running
>
> III) and/or I have to learn a few things about execution environment
> (commandline options, oXygenXML setup)
>
> All of us, being lifelong autodidacts, are not afraid of learning, but
> there should be symmetry in the learning process ... all we are trying
> to do is run an xslt transform and save its output.
>
> As it stands with XProc v1, we are asking people to do a lot more than
> what they can do today with some other easier-to-comprehend tool/utility.
>
> Stepping back, I think XProc v1 gets the hairy things right (hence the
> previous caution of hacking away at it) because the WG worked through
> many serious issues with much thoughtful debate underpinning design
> decisions.
>
> So, what might be a better first five minute experience for the newbie user ?
>
> I) Thought experiment #1
>
> <p:pipeline>
>     <p:xslt stylesheet-href="test.xsl"/>
> </p:pipeline>
>
>> xproc -p mypipeline.xpl data.xml
> * we could consider some kind of alt port mechanism where a p:document
> href could be represented by a specially named option (uggg...)
> *  a shell script, called xproc, where we put the data flowing through
> the pipeline 'front and centre'
> * default scenario should not require setting something to empty (like params)
>
>
> II) Thought experiment #2
>
> <p:pipeline>
>     <p:xslt stylesheet-href ="test.xsl" result-href="step1out.xml"/>
>     <p:xslt stylesheet-href ="test1.xsl"/>
>     <p:xslt stylesheet-href ="test2.xsl" result-href ="step2out.xml"/>
>     <p:xslt stylesheet-href ="test3.xsl"/>
> </p:pipeline>
>
>> xproc -p mypipeline.xpl data.xml data2.xml
> * we could do some kind of syntax sugar by allowing p:document href to
> be set with an option
> * we let data continue flowing through the pipeline as a default posture
> (multiple result output bindings) which would lessen confusion caused
> by using p:store
> * let users easily 'dip' into the data stream and save intermediate
> steps to make the process transparent and easy to debug
>
> III) Caveats
>
> CAVEAT #1 - I am not strongly advocating specifically doing I) or II),
> this is 'shooting from the hip' type thinking and not fully baked.
>
> CAVEAT #2 - The WG is well aware of some of the problems (like
> parameters) and some parts of v2 requirements hopefully will address
> those shortcomings
>
> CAVEAT #3 - To repeat, I think XProc v1 just needs the 'final mile' to
> be carefully constructed and communicated, not wholesale changes.
>
> IV) Summary
>
> I am trying to convey how important it is to cater for the 'first five
> minute' scenario. If we get this wrong in v2, then there is no 'first
> day', 'first month' or 'first year' scenario.
>
> Any additional examples that illustrate the newbie's plight would be
> most useful, as well as any additional comment.
>
> Jim Fuller
>

Received on Monday, 17 February 2014 14:43:48 UTC