Re: result documents in an XSLT step?

Dear Geert (cc xproc-dev),

Thank you for this suggestion! Gerrit's advice about URI expectations and
storing result documents resolves the issues I reported, but it also feels
more direct to do the grouping (even if indirectly, by way of filtering)
inside XProc, since that puts the XSLT in charge only of transformation,
and lets XProc oversee the file management details. I will try your
filtering suggestion, as well, and report the results, probably tomorrow.
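
For the record, here is a rough, untested sketch of how I understand the
filter-per-paradigm idea in XProc 3.0. The step name "main", the stylesheet
name filter-paradigm.xsl, its "paradigm" parameter, and the output path
pattern are all placeholders of mine, not anything from the real pipeline.
Rather than looping over the attribute values themselves, the sketch loops
over one representative <item> per distinct @paradigm value and reads the
value back from it; that is just one way to drive the loop.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0" name="main">
  <!-- The single input document that contains all of the <item> elements -->
  <p:input port="source" primary="true"/>
  <!-- p:store passes each stored document through, so allow a sequence -->
  <p:output port="result" sequence="true"/>

  <p:for-each>
    <!-- Select one representative <item> per distinct @paradigm value -->
    <p:with-input
      select="for $p in distinct-values(//item/@paradigm)
              return (//item[@paradigm = $p])[1]"/>
    <!-- Inside the loop, the current document is that single <item> -->
    <p:variable name="paradigm" select="string(/item/@paradigm)"/>

    <!-- Run the (hypothetical) filtering stylesheet over the full input -->
    <p:xslt>
      <p:with-input port="source" pipe="source@main"/>
      <p:with-input port="stylesheet" href="filter-paradigm.xsl"/>
      <p:with-option name="parameters" select="map{'paradigm': $paradigm}"/>
    </p:xslt>

    <!-- Assumed naming pattern; relative to the pipeline's base URI -->
    <p:store href="output/verb-{$paradigm}.xml"/>
  </p:for-each>
</p:declare-step>
```

This assumes that the stylesheet declares <xsl:param name="paradigm"/> and
keeps only the <item> elements whose @paradigm matches it, and that every
@paradigm value is safe to use as part of a file name.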

Best,

David

On Sat, Oct 31, 2020 at 3:20 PM Geert Bormans <geert@gbormans.telenet.be>
wrote:

> Hi David,
>
> Have you considered the following?
> - a p:for-each over the distinct values of item/@paradigm in the source XML;
> - a p:xslt inside the p:for-each that takes the paradigm as a filter
>   parameter (so don't group, but filter);
> - a p:store of the result, also inside the p:for-each.
>
> Kind regards,
>
> Geert Bormans
>
> ----- On 31 Oct 2020 at 20:02, David Birnbaum <djbpitt@gmail.com> wrote:
>
> Dear xproc-dev,
> I would be grateful for advice about how best to manage a pipeline that
> requires me to generate and then continue to process multiple output
> documents from a single input. The input contains 110k <item> elements that
> are distinguished by a @paradigm attribute on the <item> element; there are
> about 150 different @paradigm values in the input. I would like to group
> the <item> elements by their @paradigm values, process each group, and
> write the outputs for each group separately to disk. I would also like to
> run another transformation over those outputs and write the results of that
> transformation to disk, as well. I have poked at the following approaches
> and run into trouble with both of them, probably because (or, at least,
> partially because) I am not (yet, I hope!) very adept at XProc:
>
> 1. Within the XProc, I run an XSLT step that uses <xsl:for-each-group> and
> <xsl:result-document> to create separate output for each group, with
> constructed output @href values. This errors out with:
>
> <c:errors xmlns:c="http://www.w3.org/ns/xproc-step">
>   <c:error code="err:XC0121" name="generate" type="p:xslt"
>            href="file:///Users/djb/repos/cz/pos/verb/verb.xpl"
>            line="64" column="27"
>            xmlns:p="http://www.w3.org/ns/xproc"
>            xmlns:err="http://www.w3.org/ns/xproc-error">
>     <message>URI '/Users/djb/repos/cz/output/verb-1a.xml' of secondary result is not valid or not absolute.</message>
>   </c:error>
> </c:errors>
>
> I had first tried a relative path for the @href on the
> <xsl:result-document>, and I thought the error message meant that there was
> no base URI within the pipeline, so I specified an absolute path
> instead, but, as seen above, that raises the same error. I did specify a
> secondary port in the XProc with:
>
> <p:output port="secondary" sequence="true"/>
>
> but that seems to have no effect on the outcome (perhaps I specified it in
> the wrong place?). I think I should be able to write multiple result
> documents, and that I have misunderstood something about how to set that
> up. For what it's worth, I also think I may need a <p:store> step to save
> the multiple result documents, and although I've used <p:store>
> successfully with single outputs, I don't know what it should look like to
> save a set of result documents. But if I've understood the error correctly,
> I'm stalled on the XSLT step, and need to get past that first.
>
> 2. As an alternative to <xsl:for-each-group> inside the XSLT stylesheet, I
> considered doing the grouping in XProc, but I don't see anything within
> XProc comparable to <xsl:for-each-group>. If I am reading the description
> correctly, a <p:for-each> step might let me loop over <item> elements, but
> it does not appear to have the ability to form the <item> elements into
> groups according to shared @paradigm values and loop over those groups. I
> could run an XSLT pre-processing step to do the grouping, all within our
> document, creating an intermediate hierarchical level (called, say,
> <group>) and then use <p:for-each> to loop over those, but that extra step
> feels to me like a hack, that is, as if there should be a more direct way
> to do what I need. Should I ignore that feeling?
>
> Assuming I can get the individual result documents written to disk, I
> think I can do the subsequent transformation with a <p:for-each> step.
>
> I am using MorganaXProc-IIIse 0.9.4.2-beta and Saxon EE 10.0, and running
> from the command line under MacOS 10.15.7. Thanks in advance for any
> pointers in The Right Direction.
>
> Best,
>
> David
> djbpitt@gmail.com
>
>

Received on Saturday, 31 October 2020 20:30:53 UTC