- From: Alex Muir <alex.g.muir@gmail.com>
- Date: Tue, 1 Dec 2009 09:31:00 +0000
- To: Stefanie Haupt <st.haupt@gmail.com>
- Cc: xproc-dev@w3.org
Hi,

Would reading the files in as unparsed text in an XSLT stylesheet work for this case? I'm not sure how that would play out with tidying. You could maybe wrap the input in an element.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
    xmlns:c="http://www.w3.org/ns/xproc-step"
    xmlns:cx="http://xmlcalabash.com/ns/extensions"
    name="ReadUnparsed">
  <p:input port="source">
    <p:document href="blank.xml"/>
  </p:input>
  <p:output port="result" sequence="true"/>

  <p:variable name="source-folder" select="'../HTML/'"/>
  <p:variable name="output-folder" select="'../XML/'"/>

  <p:directory-list>
    <p:with-option name="path" select="$source-folder">
      <p:empty/>
    </p:with-option>
  </p:directory-list>

  <p:for-each name="forEachFile">
    <p:iteration-source select="//c:file"/>
    <p:variable name="fileName" select="c:file/@name"/>

    <p:xslt name="ReadUnparsedText">
      <p:input port="source"/>
      <p:input port="stylesheet">
        <p:document href="../XSLT/ReadUnparsedText.xsl"/>
      </p:input>
      <p:with-param name="input_uri" select="concat($source-folder,$fileName)"/>
      <p:input port="parameters">
        <p:empty/>
      </p:input>
    </p:xslt>
  </p:for-each>
</p:declare-step>

Then the XSLT will need to do something with unparsed-text:

<xsl:for-each select="tokenize(unparsed-text($input_uri,'ISO-8859-1'), '\r?\n')">

Alex

On Mon, Nov 30, 2009 at 6:35 PM, Stefanie Haupt <st.haupt@gmail.com> wrote:
> Hi list,
>
> I'm using xproc to loop over a directory with various file types from
> which I only want to process the html files into a pipeline to tidy them
> up in a first step. The thing is, I don't know how to address them with
> p:data while being in the loop. There's no problem in running tidy for a
> single file where there is an absolute path given in p:data.
>
> What I'm using: Calabash 0.9.15 from within Oxygen 11, nothing changed
> from the start.
>
> This would be a typical directory:
> Directory
>   file1.htm
>   file2.htm
>   someotherfile.htm
>   picture.jpg
>   anotherpicture.jpg
>
> This is my xproc (further below are some error-messages posted):
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
>     xmlns:c="http://www.w3.org/ns/xproc-step"
>     xmlns:cx="http://xmlcalabash.com/ns/extensions" name="myPipeline">
>   <p:input port="source" sequence="true"/>
>   <p:output port="result" sequence="true">
>     <p:pipe port="result" step="fileloop"/>
>   </p:output>
>
>   <!-- declare path -->
>   <p:variable name="path"
>       select="'file:///home/stefanie/Magisterarbeit/quellcode/xproc/mini-test-db/gba/80tage'">
>     <p:empty/>
>   </p:variable>
>
>   <!-- list directory -->
>   <p:directory-list name="directories">
>     <p:with-option name="path" select="$path">
>       <p:empty/>
>     </p:with-option>
>   </p:directory-list>
>
>   <!-- show complete path -->
>   <p:make-absolute-uris match="c:file/@name" name="uri">
>     <p:with-option name="base-uri"
>         select="p:resolve-uri(concat($path, '/', c:file/@name))"/>
>   </p:make-absolute-uris>
>
>   <!-- excluding other filetypes works great -->
>   <p:filter select="//c:file[matches(@name, 'htm')]" name="filter"/>
>
>   <!-- loop over files and do some magic html tidy. The only problem
>        is: I can't get the files to tidy because I'm probably doing
>        something wrong on p:data -->
>   <p:for-each name="fileloop">
>     <p:output port="result" sequence="true"/>
>
>     <p:variable name="file"
>         select="p:resolve-uri(concat($path, '/', c:file/@name))"/>
>
>     <p:identity>
>       <p:input port="source">
>         <!-- <p:data href="file:///$file"/>-->
>         <!-- <p:data href="file:///c:directory/c:file/@name"/>-->
>         <!-- <p:data href="concat($path,'/',c:file/@name)" content-type="string"/>-->
>         <!-- <p:data href="p:resolve-uri(concat($path,'/',c:file/@name))"/> -->
>         <!-- <p:data href="file:///p:resolve-uri(concat($path,'/',c:file/@name))"/>-->
>         <!-- <p:data href="file://$file"/>-->
>         <!-- <p:data href="file:///c:file/@name"/>-->
>       </p:input>
>     </p:identity>
>
>     <p:exec command="/usr/bin/tidy"
>         source-is-xml="false"
>         result-is-xml="true"
>         wrap-result-lines="false">
>       <p:with-option name="args"
>           select="'--quiet yes --show-warnings no --doctype omit --numeric-entities yes --output-xml yes'"/>
>     </p:exec>
>
>     <p:unwrap match="c:result"/>
>
>     <p:identity/>
>   </p:for-each>
>
> </p:declare-step>
>
>
> As you can see I tried some things to get some input to p:data but all
> without success.
>
> This would be a typical error message for some attempt using file:/// in
> combination with a variable:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as specified. /$file (No such file or
> directory)
>
> The same error is thrown with file:///c:file/@name:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as specified. /c:file/@name (No such file
> or directory)
>
> An attempt using p:resolve-uri without file protocol:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as specified. unknown protocol: p
>
> The same attempt using p:resolve-uri with file protocol:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as
> specified. /p:resolve-uri(concat($path,'/',c:file/@name)) (No such file
> or directory)
>
> The 'best' result brings the attempt starting with concat because it
> contains the working directory (at least):
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as
> specified. /home/stefanie/Magisterarbeit/quellcode/xproc/concat($path,'/',c:file/@name) (No such file or directory)
>
> I hope this is enough information to help, I really don't know where I'm
> going wrong. Please let me know if you need further information. Many
> thanks,
>
> Stefanie
>
>
> --
> Stefanie Haupt
>
>
>

--
Alex
https://sites.google.com/a/utg.edu.gm/alex
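
The reply names a stylesheet, ../XSLT/ReadUnparsedText.xsl, but shows only its xsl:for-each line. What follows is a minimal sketch of how such a stylesheet might look, assuming an XSLT 2.0 processor and the input_uri parameter passed by p:with-param. The element names wrapper, line, and error are hypothetical placeholders and do not come from the thread. It illustrates the "wrap the input in an element" idea and is not necessarily the stylesheet the author had in mind.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- URI of the text file to read; supplied by the pipeline's p:with-param -->
  <xsl:param name="input_uri"/>

  <xsl:template match="/">
    <!-- "wrapper", "line" and "error" are hypothetical element names -->
    <wrapper href="{$input_uri}">
      <xsl:choose>
        <!-- Check that the file can be read with the assumed encoding -->
        <xsl:when test="unparsed-text-available($input_uri, 'ISO-8859-1')">
          <!-- Split the raw text on line breaks, one <line> per line -->
          <xsl:for-each
              select="tokenize(unparsed-text($input_uri, 'ISO-8859-1'), '\r?\n')">
            <line><xsl:value-of select="."/></line>
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
          <error>Could not read the file as ISO-8859-1 text</error>
        </xsl:otherwise>
      </xsl:choose>
    </wrapper>
  </xsl:template>

</xsl:stylesheet>
```

One point to keep in mind: unparsed-text resolves a relative URI against the base URI of the stylesheet, not of the pipeline, so a relative value such as '../HTML/' only points at the intended folder if the stylesheet and the pipeline sit at the same directory depth. The ISO-8859-1 encoding is taken from the reply and would need to match the actual files.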
Received on Tuesday, 1 December 2009 09:31:41 UTC