- From: Imsieke, Gerrit, le-tex <gerrit.imsieke@le-tex.de>
- Date: Sat, 31 Oct 2020 20:17:59 +0100
- To: xproc-dev@w3.org
Hi David,

Have you tried writing the result to file:///Users/djb/repos/cz/output/verb-1a.xml? XProc 3 has become a bit stricter about conformant URIs.

Also, when you said you provided an absolute path as the base URI, did you provide an /os/path or an actual file:///os/path URI? You might use the function p:urify() on OS paths in order to get URIs, but I’m not sure about the implementation status in Morgana.

And when you specified <p:output port="secondary" sequence="true"/>, did you connect the output to the secondary output of your XSLT step?

If you want to store the secondary outputs of the XSLT step named 'generate' to disk, you need to do something like

  <p:for-each>
    <p:with-input pipe="secondary@generate"/>
    <p:store href="{base-uri()}"/>
  </p:for-each>

Grouping in XProc isn't that powerful. There’s a group-adjacent option in p:wrap-sequence, but grouping XML by a common value is best done in XSLT.

Gerrit

On 31.10.2020 20:02, David Birnbaum wrote:
> Dear xproc-dev,
>
> I would be grateful for advice about how best to manage a pipeline that
> requires me to generate and then continue to process multiple output
> documents from a single input. The input contains 110k <item> elements
> that are distinguished by a @paradigm attribute on the <item> element;
> there are about 150 different @paradigm values in the input. I would
> like to group the <item> elements by their @paradigm values, process
> each group, and write the outputs for each group separately to disk. I
> would also like to run another transformation over those outputs and
> write the results of that transformation to disk, as well. I have poked
> at the following approaches and run into trouble with both of them,
> probably because (or, at least, partially because) I am not (yet, I
> hope!) very adept at XProc:
>
> 1. Within the XProc, I run an XSLT step that uses <xsl:for-each-group>
> and <xsl:result-document> to create separate output for each group, with
> constructed output @href values. This errors out with:
>
> <c:errors xmlns:c="http://www.w3.org/ns/xproc-step"><c:error
> code="err:XC0121" name="generate" type="p:xslt"
> href="file:///Users/djb/repos/cz/pos/verb/verb.xpl" line="64"
> column="27" xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:err="http://www.w3.org/ns/xproc-error"><message>URI
> '/Users/djb/repos/cz/output/verb-1a.xml' of secondary result is not
> valid or not absolute.</message></c:error></c:errors>
>
> I had first tried a relative path for the @href on the
> <xsl:result-document>, and I thought the error message meant that there
> was no base URI within the pipeline, so I specified an absolute path
> instead, but, as seen above, that raises the same error. I did specify a
> secondary port in the XProc with:
>
> <p:output port="secondary" sequence="true"/>
>
> but that seems to have no effect on the outcome (perhaps I specified it
> in the wrong place?). I think I should be able to write multiple result
> documents, and that I have misunderstood something about how to set that
> up. For what it's worth, I also think I may need a <p:store> step to
> save the multiple result documents, and although I've used <p:store>
> successfully with single outputs, I don't know what it should look like
> to save a set of result documents. But if I've understood the error
> correctly, I'm stalled on the XSLT step, and need to get past that first.
>
> 2. As an alternative to <xsl:for-each-group> inside the XSLT stylesheet,
> I considered doing the grouping in XProc, but I don't see anything
> within XProc comparable to <xsl:for-each-group>. If I am reading the
> description correctly, a <p:for-each> step might let me loop over <item>
> elements, but it does not appear to have the ability to form the <item>
> elements into groups according to shared @paradigm values and loop over
> those groups. I could run an XSLT pre-processing step to do the
> grouping, all within our document, creating an intermediate hierarchical
> level (called, say, <group>) and then use <p:for-each> to loop over
> those, but that extra step feels to me like a hack, that is, as if there
> should be a more direct way to do what I need. Should I ignore that feeling?
>
> Assuming I can get the individual result documents written to disk, I
> think I can do the subsequent transformation with a <p:for-each> step.
>
> I am using MorganaXProc-IIIse 0.9.4.2-beta and Saxon EE 10.0, and
> running from the command line under MacOS 10.15.7. Thanks in advance for
> any pointers in The Right Direction.
>
> Best,
>
> David
> djbpitt@gmail.com
>
>

--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@le-tex.de, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930
Geschäftsführer / Managing Directors: Gerrit Imsieke, Svea Jelonek, Thomas Schmidt
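
For reference, the p:for-each/p:store suggestion in the reply could sit in a complete XProc 3.0 pipeline roughly as sketched below. This is only an illustration under assumptions, not David's actual verb.xpl: the stylesheet name group-by-paradigm.xsl is hypothetical, and the sketch assumes that the stylesheet's <xsl:result-document> instructions use absolute file:/// URIs in @href, since the err:XC0121 message in the thread rejects bare OS paths.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:input port="source"/>
  <!-- The loop below yields one document per group, so allow a sequence. -->
  <p:output port="result" sequence="true"/>

  <!-- Hypothetical stylesheet: groups <item> by @paradigm and emits one
       xsl:result-document per group with an absolute file:/// @href. -->
  <p:xslt name="generate">
    <p:with-input port="stylesheet" href="group-by-paradigm.xsl"/>
  </p:xslt>

  <!-- Read the secondary port of the step named 'generate' explicitly;
       it is not the primary output, so it is never connected by default. -->
  <p:for-each>
    <p:with-input pipe="secondary@generate"/>
    <!-- Each secondary document carries its result-document URI as base URI. -->
    <p:store href="{base-uri()}"/>
  </p:for-each>
</p:declare-step>
```

Because p:store passes the stored document through on its result port, a second transformation over the per-group outputs could be added inside the same p:for-each after the p:store, followed by another p:store with a different href.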
Received on Saturday, 31 October 2020 19:18:19 UTC