- From: Paul Tyson <phtyson@sbcglobal.net>
- Date: Tue, 11 Aug 2015 08:20:10 -0500
- To: Dan Vint <dvint@dvint.com>
- Cc: xproc-dev@w3.org
One other suggestion: if at the end all you want is a zip file, you
might be able to do it all in the pipeline stream without serializing
anything to files. Just arrange for a zip manifest and a sequence of
documents to appear on the appropriate input ports of a pxp:zip step.
The hrefs of the documents must match the hrefs in the manifest entries,
and the entry names in the manifest will determine the directory
structure of the zip archive. (At least that's what I guess based on the
bit of documentation for pxp:zip at
http://exproc.org/proposed/steps/other.html, and my understanding of
Java zip methods.)

Regards,
--Paul

On Mon, 2015-08-10 at 19:59 -0700, Dan Vint wrote:
> At 07:31 PM 8/10/2015, Paul Tyson wrote:
> >Take courage, Dan!
> >
> >I'm several months past a few complex xproc jobs, during which I learned
> >quite a bit, starting from very little. I won't venture to offer many
> >specifics because I don't recall all the details, and haven't worked
> >with xproc since then. But a few general comments below.
> >
> >On Mon, 2015-08-10 at 16:32 -0700, Dan Vint wrote:
> > > Ok, moving the serialization information to the
> > > store makes better sense to me. I was also able
> > > to work around my issue with generating text
> > > instead of XML. I set up the stylesheet to wrap
> > > the text output in a dummy element. Then setting
> > > the store output to be text then strips out the
> > > new element, so I get the text the way I want it.
> > >
> > > In working through these issues I think I may not
> > > be using the correct tools. Is XPROC just meant
> > > for passing XML from one task to another and not
> > > a more general scripting tool that understands typical XML processes?
> > >
> >
> >XProc 1.0 is specifically focused on XML pipelines, and the spec only
> >expects XML to go between steps. (Xproc 2 may relax this restriction.)
> >But as you've already discovered, all it takes to turn text into xml are
> >element start and end tags.
> >
> >>>Can't wait that long ;-)
> >
> > > Here is what I'm trying to do overall:
> > >
> > > 1) Take a DITA Map file
> > > and process the structure and links to
> > > make a new flattened single file map.
> >
> >No problem, that's in xproc's wheelhouse, as you've probably already
> >learned.
>
> >>>Yep with the serialization trick I'm now
> getting what I wanted from this step.
>
> > >
> > > 2) I want to store that flattened map to an XML
> > > file that I will keep around, but I also want to
> > > pass it to another stylesheet that creates a
> > > batch file based upon the information in the XML.
> > > I don't want to process this result anymore.
> >
> >If you have the freedom to reimplement the batch file, you might be able
> >to recast it into xproc.
> >
> >>>The batch file needs to create folders and
> copy files into different places. I suppose that
> could be done with exec statements in XPROC. I
> just had the XSLT to produce the batch file already in place.
>
> >
> >When you talk about "keeping the flattened xml file around", you can of
> >course write it to file if needed, but you can also just reuse it in the
> >pipeline by piping the output to more than one subsequent step as
> >needed.
> >
> >>>It is not used by other steps in this process
> but I'm finding it handy as a reference to what
> is in our content. I've extracted some key
> information and the structure from the maps and have it all in one place.
> >
> > > 3) I want to execute the batch file.
> > > 4) I also want to process all the XML files in a
> > > given directory with the following:
> > > 4a) Run an ant script against all of the files.
> > > 4b) Run the resulting files through an
> > > additional stylesheet to process the XML file further
> >
> >I don't have any experience calling ant scripts from xproc, but p:exec
> >should work. Or, if xproc can do what the ant scripts are doing, you
> >could reimplement the ant script. Are these scripts from the DITA-OT by
> >any chance?
> >
> >>>No this is all unique stuff we are doing to
> support the process of getting the files into the translation process.
> >
> > > 5) Zip all the resulting files together.
> >
> >You want an extension step, pxp:zip
> >(http://xmlcalabash.com/docs/reference/pxp-zip.html).
> >
> > >
> > > Below is the XPROC file I have. It works properly
> > > now for step 1 and 2. When I add the processing
> > > for step 3 I start getting new errors. I've also
> > > noticed that it appears that XPROC wants the file
> > > to exist before it executes, so I can't pass the
> > > file that I store to the second stylesheet. Looks
> > > like I have to somehow allow the pipeline of the
> > > XML generation just work and somehow fork an
> > > additional store/file create operation.
> >
> >Looks like you have the right ideas, but you're not exploiting the
> >pipeline capability. Is step "store-flat" necessary, or can you just use
> >the output of "flatten-map" as input to "build-batfile"?
> >
> >In general, it appears you are using shell script idioms instead of
> >pipeline idioms. In particular, it's never necessary to write a file
> >from one pipeline step and read it (using document()) from another step.
> >(Of course, if you need the file for other reasons, that's ok, but even
> >so, within the pipeline you can just connect the ports rather than read
> >the file in the consuming step).
>
> >>>Yes that is part of the realization I had
> today and why I asked these questions. I was also
> trying to take advantage of some stuff I had
> already written for use as transformations in oXygen.
>
> >>>The ANT is there to unroll the oXygen change
> tracking PIs. I had started with XSLT but then
> came across the character escaping that is done
> to handle the elements and attributes that can be
> deleted. It actually was this step that stopped
> the approach I was taking. There were getting to
> be too many transformations that I was running
> and I was looking for a way to script this process
> using oXygen as the framework for doing this.
> >
> >Depending on what your batch files are doing, xproc might be a natural
> >fit, or you might be better off just orchestrating a few xslt
> >conversions from a batch file. It took me a while to really "think in
> >pipelines", but it can lead to very elegant solutions for complex xml
> >processing.
> >
> >>>Well this is encouraging that I can continue
> down this path with some possible course corrections.
>
> >>>Thanks
> >>>..dan
>
> >
> >Regards,
> >--Paul
> > >
> > > <?xml version="1.0" encoding="UTF-8"?>
> > > <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> > >     name="Preprocess-translations" version="1.0">
> > >
> > >     <p:documentation>
> > >         This XPROC script runs all the steps
> > >         required to take an Ixiasoft Localization kit
> > >         and modify the source files and organize
> > >         them into a directory structure and zip
> > >         that content to be sent on for localization.
> > >     </p:documentation>
> > >
> > >
> > >
> > >     <p:input port="source">
> > >         <p:document
> > >             href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_mobile-dan1370798002064.ditamap"/>
> > >     </p:input>
> > >     <p:output port="result" sequence="true">
> > >         <!-- Write the result to "store-flat" -->
> > >         <p:pipe port="result" step="store-flat"/>
> > >         <p:pipe port="result" step="store-batchfile"/>
> > >     </p:output>
> > >
> > >
> > >
> > >     <!-- Process the DITAMAP file source to
> > >     travel all the included maps and topics
> > >     with stylesheet to flatten the
> > >     structure and gather information for later processing.
> > >     -->
> > >     <p:xslt name="flatten-map">
> > >         <p:input port="source"/>
> > >         <p:input port="stylesheet">
> > >             <p:document
> > >                 href="file:/C:/work/svn/scripts/translations/make-translations-map.xsl"/>
> > >         </p:input>
> > >         <p:input port="parameters" sequence="true">
> > >             <p:empty/>
> > >         </p:input>
> > >
> > >     </p:xslt>
> > >
> > >     <p:store
> > >         href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_flatmap2.xml"
> > >         name="store-flat"
> > >         encoding="utf-8"
> > >         method="xml"
> > >         indent="true"
> > >         omit-xml-declaration="false"/>
> > >
> > >     <!-- Process the flattened map file to
> > >     create a batch file that copies topics/maps/images into a
> > >     new file structure for zipping to send to translators
> > >     -->
> > >
> > >     <!-- XPROC seems to check for the
> > >     existence of the input file before the previous step creates it -->
> > >     <p:xslt name="build-batfile">
> > >         <p:input port="source">
> > >             <!-- <p:document
> > >                 href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_flatmap2.xml"/>
> > >             -->
> > >         </p:input>
> > >
> > >
> > >         <p:input port="stylesheet">
> > >             <p:document
> > >                 href="file:/C:/work/svn/scripts/translations/fileorg.xsl"/>
> > >         </p:input>
> > >         <p:input port="parameters" sequence="true">
> > >             <p:empty/>
> > >         </p:input>
> > >
> > >     </p:xslt>
> > >
> > >     <p:store
> > >         href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_copy-files.bat"
> > >         name="store-batchfile"
> > >         method="text"/>
> > >
> > >     <!-- Run Ant script to remove the
> > >     oXy_delete processing instructions -->
> > >
> > >     <!-- Process all the topics and maps in
> > >     the source folder and write to a new folder -->
> > >
> > >     <!-- Run the copy.bat file against processed files -->
> > >     <p:exec name="execute-batchfile"
> > >         command="C:\Users\dan.vint\Desktop\DITAsource\fr-fr-loc-v1-source-maptesting\_copy-files.bat">
> > >         <p:input port="source">
> > >             <p:empty/>
> > >         </p:input>
> > >     </p:exec>
> > >
> > >     <p:sink/>
> > >
> > >
> > >
> > >     <!-- Zip the files for translation -->
> > >
> > >
> > >
> > > </p:declare-step>
> > >
> > > I'm thinking I need to switch to some other
> > > scripting language and then possibly call a few
> > > steps in XPROC. I started down this path because
> > > the people using what I build will have oXygen
> > > already installed. I was trying to avoid having
> > > to configure a lot of other tools to make this
> > > process work. So this is why I'm trying to stay in XPROC.
> > >
> > > ..dan
> > >
> > >
> > >
> > > At 10:44 AM 8/9/2015, Imsieke, Gerrit, le-tex wrote:
> > > >Serialization options in p:serialization refer to a port with the given
> > > >name. In your pipeline, there is a commented-out step that your
> > > >serialization settings should probably pertain to. You can add the same
> > > >serialization options (indent, etc.) to p:store as to p:serialization.
> > > >Instead of p:storing XML, you could output it on a named port with
> > > >individual p:serialization settings and store it to the respective files
> > > >in your invocation, e.g., -o flat=/path/to/flatmap2.xml
> > > >
> > > >Gerrit
> > > >
> > > >On 09.08.2015 19:35, Dan Vint wrote:
> > > > > thanks.
> > > > >
> > > > > The error I get in oXygen is:
> > > > >
> > > > > err:XS0039 : A p:serialization specifies a non-existant port. It is a
> > > > > static error if the port specified on the p:serialization is not the
> > > > > name of an output port on the pipeline in which it appears or if more
> > > > > than one p:serialization element is applied to the same port.
> > > > >
> > > > > Here is the code:
> > > > >
> > > > > <?xml version="1.0" encoding="UTF-8"?>
> > > > > <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> > > > >     name="Preprocess-translations" version="1.0">
> > > > >
> > > > >     <p:documentation>
> > > > >         This XPROC script runs all the steps required to take an
> > > > >         Ixiasoft Localization kit
> > > > >         and modify the source files and organize them into a directory
> > > > >         structure and zip
> > > > >         that content to be sent on for localization.
> > > > >     </p:documentation>
> > > > >
> > > > >     <p:serialization
> > > > >         port="store-flat"
> > > > >         encoding="utf-8"
> > > > >         method="xml"
> > > > >         indent="true"
> > > > >         omit-xml-declaration="false"/>
> > > > >
> > > > >     <p:input port="source">
> > > > >         <p:document
> > > > >             href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_mobile-dan1370798002064.ditamap"/>
> > > > >
> > > > >     </p:input>
> > > > >     <p:output port="result ">
> > > > >         <!-- Write the result to "store-flat" -->
> > > > >         <!-- <p:pipe port="result" step="store-flat"></p:pipe>
> > > > >         -->
> > > > >     </p:output>
> > > > >
> > > > >
> > > > >
> > > > >     <!-- Process the DITAMAP file source to travel all the included
> > > > >     maps and topics
> > > > >     with stylesheet to flatten the structure and gather
> > > > >     information for later processing.
> > > > >     -->
> > > > >     <p:xslt name="flatten-map">
> > > > >         <p:input port="source"/>
> > > > >         <p:input port="stylesheet">
> > > > >             <p:document
> > > > >                 href="file:/C:/work/svn/scripts/translations/make-translations-map.xsl"/>
> > > > >         </p:input>
> > > > >         <p:input port="parameters" sequence="true">
> > > > >             <p:empty/>
> > > > >         </p:input>
> > > > >
> > > > >     </p:xslt>
> > > > >
> > > > >     <!-- <p:store
> > > > >         href="file:/C:/Users/dan.vint/Desktop/DITAsource/fr-fr-loc-v1-source-maptesting/_flatmap2.xml"
> > > > >         name="store-flat" /> -->
> > > > >
> > > > >
> > > > > </p:declare-step>
> > > > >
> > > > > Ultimately I want to write the output to a specific file. I took out the
> > > > > store information as I thought it might be contributing to the problem
> > > > >
> > > > >
> > > > > At 04:46 PM 8/7/2015, Imsieke, Gerrit, le-tex wrote:
> > > > >> I think it would help us help you if we saw the actual pipeline.
> > > > >>
> > > > >> On 08.08.2015 01:42, Dan Vint wrote:
> > > > >> > Thanks for the pointer. So I tried to make use of this based upon some
> > > > >> > examples I found and I'm getting a message about the port not being
> > > > >> > bound. I'm using the result port, but nothing seems to be making this
> > > > >> > work. I need to do more digging, thanks.
> > > > >> >
> > > > >> > .dan
> > > > >
> > > > >
> > > > > ---------------------------------------------------------------------------
> > > > > Danny Vint
> > > > >
> > > > > Panoramic Photography
> > > > > http://www.dvint.com
> > > > >
> > > > > voice: 619-647-5780
> > > > >
> > > >
> > > >--
> > > >Gerrit Imsieke
> > > >Geschäftsführer / Managing Director
> > > >le-tex publishing services GmbH
> > > >Weissenfelser Str. 84, 04229 Leipzig, Germany
> > > >Phone +49 341 355356 110, Fax +49 341 355356 510
> > > >gerrit.imsieke@le-tex.de, http://www.le-tex.de
> > > >
> > > >Registergericht / Commercial Register: Amtsgericht Leipzig
> > > >Registernummer / Registration Number: HRB 24930
> > > >
> > > >Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
> > > >Thomas Schmidt, Dr. Reinhard Vöckler
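
P.S. For reference, the in-stream zip idea from the top of this message
might look roughly like the sketch below. It is untested and rests on
several assumptions: that the processor is XML Calabash (hence the
extension library import), that pxp:zip takes its documents on "source"
and a c:zip-manifest on "manifest" as the exproc.org page describes, and
that setting output-base-uri on p:xslt is enough to make the document's
URI match the manifest entry. The file:/C:/tmp/loc/ paths, the step names
"zip-in-stream" and "make-zip", and the entry name are all hypothetical;
only the stylesheet path comes from Dan's pipeline.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:pxp="http://exproc.org/proposed/steps"
                name="zip-in-stream" version="1.0">

  <!-- The DITA map is supplied by the caller on the source port. -->
  <p:input port="source"/>
  <p:output port="result">
    <p:pipe port="result" step="make-zip"/>
  </p:output>

  <!-- Assumption: XML Calabash; this import declares pxp:zip. -->
  <p:import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl"/>

  <!-- Flatten the map. output-base-uri (hypothetical path) gives the
       result a URI that the manifest entry below can refer to. Nothing
       is written to that location. -->
  <p:xslt name="flatten-map"
          output-base-uri="file:/C:/tmp/loc/flatmap.xml">
    <p:input port="source">
      <p:pipe port="source" step="zip-in-stream"/>
    </p:input>
    <p:input port="stylesheet">
      <p:document
          href="file:/C:/work/svn/scripts/translations/make-translations-map.xsl"/>
    </p:input>
    <p:input port="parameters">
      <p:empty/>
    </p:input>
  </p:xslt>

  <!-- Zip straight from the pipeline. The entry's href must match the
       document's base URI; its name sets the path inside the archive. -->
  <pxp:zip name="make-zip" href="file:/C:/tmp/loc/translation-kit.zip">
    <p:input port="source">
      <p:pipe port="result" step="flatten-map"/>
    </p:input>
    <p:input port="manifest">
      <p:inline>
        <c:zip-manifest>
          <c:entry name="maps/flatmap.xml"
                   href="file:/C:/tmp/loc/flatmap.xml"/>
        </c:zip-manifest>
      </p:inline>
    </p:input>
  </pxp:zip>

</p:declare-step>
```

With more than one document, the "source" port would carry the whole
sequence and the manifest would list one c:entry per document. In
practice the manifest would probably be generated by a stylesheet from
the flattened map rather than written inline.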
Received on Tuesday, 11 August 2015 13:21:53 UTC