- From: Jostein Austvik Jacobsen <josteinaj@gmail.com>
- Date: Tue, 18 Feb 2014 10:53:40 +0100
- To: Paul Mensonides <pmenso57@comcast.net>
- Cc: XProc Dev <xproc-dev@w3.org>
- Message-ID: <CAOCxfQcZ5Pc_Z_xJarJG+XouN_=uCURQBZE+m5wBmFL5XCtqBg@mail.gmail.com>
> > For the real world projects that I need something like this for, I fear it
> > will potentially be a very large problem, and it may be that I have to have
> > small partial pipelines being invoked via a makefile.

As a side note: I find it is generally a good idea to modularize your XProc
as much as possible anyway (also, don't inline your XSLTs; make them
separate files). It makes testing and debugging easier. Some Linux grep'ing
tells me that I have 390 XProc files on my computer, with an average of 126
lines per file (ranging from 3 to 1350, std. dev. 103).

Jostein

On 17 February 2014 23:21, Paul Mensonides <pmenso57@comcast.net> wrote:
> (Accidentally sent this reply to James instead of the list--sorry, James!)
>
> On 2/17/2014 5:40 AM, James Fuller wrote:
>
>> The point of going through this evolution of XProc scripts is to
>> remind us all that, for newbies, this process of learning typically
>> results in frustration, because:
>>
>> I) XProc's basic operation sometimes works differently from my
>> preconceptions
>>
>> II) I have to learn many concepts before I get something running
>>
>> III) and/or I have to learn a few things about the execution
>> environment (command-line options, oXygenXML setup)
>>
>> All of us, being lifelong autodidacts, are not afraid of learning, but
>> there should be symmetry in the learning process... all we are trying
>> to do is run an XSLT transform and save its output.
>>
>> As it stands with XProc v1, we are asking people to do a lot more than
>> what they can do today with some other, easier-to-comprehend
>> tool/utility.
>
> 2c
>
> As someone who just recently started using XProc (and learning XML
> Schema 1.1 and XSLT/XPath 2.0 at the same time), I have not had a
> particularly hard time figuring it out.
>
> -----
>
> There are some things that I have found bizarre. In particular, why are
> "documents" flowing through pipes rather than just (e.g.) XPath data
> values (of which documents are just one form)?
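[Jostein's advice above about keeping XSLTs in separate files rather than inlining them might look like the following in XProc 1.0 -- a minimal sketch, with `transform.xsl` standing in for whatever stylesheet would otherwise be inlined:]

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:xslt>
    <!-- the stylesheet lives in its own file, so it can be
         tested and debugged independently of the pipeline -->
    <p:input port="stylesheet">
      <p:document href="transform.xsl"/>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>
</p:declare-step>
```

[The pipeline's primary output binds to the last step's primary result by default, so no explicit connection is needed there.]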
> Related to that, documents and parameters (for e.g. XSLT) are all more
> or less XPath data values, and it would be nice if there were a uniform
> way that such data could be passed around. But that hasn't been
> difficult to figure out, just noted.
>
> -----
>
> The thing that has frustrated me in particular is the way that
> versioning is handled throughout many of the XML-related standards.
> XProc does some things right: the ability to specify a required XPath
> version and a required XSLT version. But no version requirement for XML
> Schema? No schema-aware XSLT requirements? What about branching off of
> these values? What I ended up doing (since I needed a pipeline that
> could be executed with and without schema support) was create an option
> for this AND pass around Saxon configuration files. E.g., this is my
> invocation:
>
> java com.xmlcalabash.drivers.Main --saxon-configuration=saxon-ee.xml
> pipeline.xpl schema-aware=yes
>
> and
>
> java com.xmlcalabash.drivers.Main --saxon-configuration=saxon-he.xml
> pipeline.xpl schema-aware=no
>
> The only thing that the Saxon configuration files are doing is
> specifying the XML Schema version and turning schema-aware XSLT on and
> off.
>
> AFAIK, this stuff is unspecifiable in the pipeline itself (at least
> without extensions). I would like to have:
>
> java com.xmlcalabash.drivers.Main pipeline.xpl
>
> -----
>
> So far I have found the default piping stuff more bewildering than
> helpful--mostly because it hides what is actually happening. I have
> just started using XProc. The reason for doing so is that I have
> processing tasks that aren't just simple linear pipelines. If I just
> wanted to take some XML data through XSLT, I wouldn't bother to set up
> a pipeline. So, at this point, I just explicitly specify all
> connections between ports.
>
> -----
>
> I haven't run across it yet, but I am worried about the lack of the
> ability to cache intermediate results in a direct way.
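[The `saxon-ee.xml` file Paul passes to Calabash above might look roughly like this -- a sketch based on Saxon's configuration-file format; the exact element and attribute names should be checked against the documentation for the Saxon version in use:]

```xml
<!-- saxon-ee.xml (sketch): enable XSD 1.1 and schema-aware XSLT.
     Element/attribute names assumed from Saxon's configuration-file
     format; verify against your Saxon release. -->
<configuration xmlns="http://saxon.sf.net/ns/configuration"
               edition="EE">
  <xsd version="1.1"/>
  <xslt schemaAware="true"/>
</configuration>
```

[A corresponding `saxon-he.xml` would presumably say `edition="HE"` and omit the schema-aware settings, since Saxon-HE has no schema processing at all.]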
> Viewing a pipeline as a sort of makefile, running the pipeline is
> equivalent to a complete rebuild. For the project that I am using to
> learn all of this stuff, this doesn't matter that much. For the
> real-world projects that I need something like this for, I fear it will
> potentially be a very large problem, and it may be that I have to have
> small partial pipelines being invoked via a makefile. The potential
> benefits of streaming over serialization and infosets (or whatever they
> are called) versus re-parsing are unrealized in this sort of scenario.
>
> -----
>
> Overall, my impression has been positive. I am far from an expert, but
> I personally don't really care about the first five minutes. I care
> about comprehensiveness and viability in real scenarios, not toys. I
> don't want to bother with a technology that can handle simple toy
> examples but that might eventually hit a brick wall as those examples
> grow into bigger things, all because I was a "newbie" and couldn't be
> bothered to learn.
>
> Regards,
> Paul Mensonides
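[The makefile workaround Paul describes -- small partial pipelines invoked per stage, so that make's timestamp checks stand in for caching intermediate results -- might be sketched like this. File and pipeline names are hypothetical, and the `-i`/`-o` port-binding options are assumed from XML Calabash's command line:]

```make
# Hypothetical: split one big pipeline into stage1.xpl / stage2.xpl
# so that make only re-runs stages whose inputs have changed.
CALABASH = java com.xmlcalabash.drivers.Main

out/final.xml: out/stage1.xml stage2.xpl
	$(CALABASH) -i source=out/stage1.xml -o result=$@ stage2.xpl

# stage1.xml is the cached intermediate result; it is rebuilt only
# when input.xml or stage1.xpl is newer.
out/stage1.xml: input.xml stage1.xpl
	$(CALABASH) -i source=input.xml -o result=$@ stage1.xpl
```

[The trade-off Paul notes applies here: each stage serializes and re-parses its documents, so any streaming benefit within a single pipeline run is lost at the stage boundaries.]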
Received on Tuesday, 18 February 2014 09:54:29 UTC