- From: Alex Muir <alex.g.muir@gmail.com>
- Date: Tue, 1 Dec 2009 09:31:00 +0000
- To: Stefanie Haupt <st.haupt@gmail.com>
- Cc: xproc-dev@w3.org
Hi,
Would reading the files in as unparsed text from an XSLT stylesheet
work for this case? I'm not sure how that would play out with tidying,
but you could maybe wrap the input in an element. Something like:
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
    xmlns:c="http://www.w3.org/ns/xproc-step"
    xmlns:cx="http://xmlcalabash.com/ns/extensions"
    name="ReadUnparsed">
  <p:input port="source">
    <p:document href="blank.xml"/>
  </p:input>
  <p:output port="result" sequence="true"/>
  <p:variable name="source-folder" select="'../HTML/'"/>
  <p:variable name="output-folder" select="'../XML/'"/>
  <!-- list the entries in the source folder -->
  <p:directory-list>
    <p:with-option name="path" select="$source-folder">
      <p:empty/>
    </p:with-option>
  </p:directory-list>
  <!-- run the stylesheet once per c:file entry -->
  <p:for-each name="forEachFile">
    <p:iteration-source select="//c:file"/>
    <p:variable name="fileName" select="c:file/@name"/>
    <p:xslt name="ReadUnparsedText">
      <p:input port="source"/>
      <p:input port="stylesheet">
        <p:document href="../XSLT/ReadUnparsedText.xsl"/>
      </p:input>
      <!-- hand the file location to the stylesheet as a parameter -->
      <p:with-param name="input_uri" select="concat($source-folder,$fileName)"/>
      <p:input port="parameters">
        <p:empty/>
      </p:input>
    </p:xslt>
  </p:for-each>
</p:declare-step>
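Since only the HTML files are wanted, the loop could be narrowed to them. This is an untested sketch: it assumes the processor evaluates XPath 2.0 (matches() already works in the original pipeline under Calabash), and the pattern and the 'i' flag are my own choice rather than anything from the original message.

```xml
<!-- would replace the p:iteration-source line inside p:for-each -->
<!-- assumption: XPath 2.0 matches() is available, as it is in Calabash -->
<p:iteration-source select="//c:file[matches(@name, '\.html?$', 'i')]"/>
```

Anchoring the pattern at the end of the name keeps it from matching a file that merely contains "htm" somewhere in its name.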
Then the XSLT will need to do something with unparsed-text(), for
example split the file into lines:
<xsl:for-each select="tokenize(unparsed-text($input_uri, 'ISO-8859-1'), '\r?\n')">
  <!-- process each line here -->
</xsl:for-each>
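A complete ReadUnparsedText.xsl might look roughly like the sketch below. It is untested and assumes an XSLT 2.0 processor; the wrapper element names "text" and "line" are made up for illustration, and the ISO-8859-1 encoding is just carried over from the snippet, so adjust it to whatever the files really use.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- set by p:with-param name="input_uri" in the pipeline -->
  <xsl:param name="input_uri"/>

  <xsl:template match="/">
    <!-- "text" and "line" are hypothetical wrapper elements -->
    <text source="{$input_uri}">
      <!-- read the file as plain text and split it into lines -->
      <xsl:for-each select="tokenize(unparsed-text($input_uri, 'ISO-8859-1'), '\r?\n')">
        <line><xsl:value-of select="."/></line>
      </xsl:for-each>
    </text>
  </xsl:template>

</xsl:stylesheet>
```

Note that a relative $input_uri is resolved by unparsed-text() against the stylesheet's base URI, not the pipeline's, so '../HTML/' only works here because the XSLT and HTML folders are assumed to be siblings.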
Alex
On Mon, Nov 30, 2009 at 6:35 PM, Stefanie Haupt <st.haupt@gmail.com> wrote:
> Hi list,
>
> I'm using xproc to loop over a directory with various file types from
> which I only want to process the html files into a pipeline to tidy them
> up in a first step. The thing is, I don't know how to address them with
> p:data while being in the loop. There's no problem in running tidy for a
> single file where there is an absolute path given in p:data.
>
> What I'm using: Calabash 0.9.15 from within Oxygen 11, nothing changed
> from the start.
>
> This would be a typical directory:
> Directory
> file1.htm
> file2.htm
> someotherfile.htm
> picture.jpg
> anotherpicture.jpg
>
> This is my xproc (further below are some error-messages posted):
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
> xmlns:cx="http://xmlcalabash.com/ns/extensions" name="myPipeline">
> <p:input port="source" sequence="true"/>
> <p:output port="result" sequence="true">
> <p:pipe port="result" step="fileloop"/>
> </p:output>
>
> <!-- declare path -->
> <p:variable name="path"
> select="'file:///home/stefanie/Magisterarbeit/quellcode/xproc/mini-test-db/gba/80tage'">
> <p:empty/>
> </p:variable>
>
> <!-- list directory -->
> <p:directory-list name="directories">
> <p:with-option name="path" select="$path">
> <p:empty/>
> </p:with-option>
> </p:directory-list>
>
> <!-- show complete path -->
>
> <p:make-absolute-uris match="c:file/@name" name="uri">
> <p:with-option name="base-uri"
> select="p:resolve-uri(concat($path, '/',
> c:file/@name))"/>
> </p:make-absolute-uris>
>
> <!-- excluding other filetypes works great -->
> <p:filter select="//c:file[matches(@name, 'htm')]"
> name="filter"/>
>
> <!-- loop over files and do some magic html tidy. The only problem
> is: I can't get the files to
> tidy because I'm probably doing something wrong on p:data -->
>
> <p:for-each name="fileloop">
> <p:output port="result" sequence="true"/>
>
> <p:variable name="file" select="p:resolve-uri(concat($path, '/',
> c:file/@name))"/>
>
> <p:identity>
> <p:input port="source">
>
> <!-- <p:data href="file:///$file"/>-->
> <!-- <p:data href="file:///c:directory/c:file/@name"/>-->
> <!-- <p:data href="concat($path,'/',c:file/@name)"
> content-type="string"/>-->
> <!-- <p:data
> href="p:resolve-uri(concat($path,'/',c:file/@name))"/> -->
> <!-- <p:data
> href="file:///p:resolve-uri(concat($path,'/',c:file/@name))"/>-->
> <!-- <p:data href="file://$file"/>-->
> <!-- <p:data href="file:///c:file/@name"/>-->
> </p:input>
> </p:identity>
>
> <p:exec command="/usr/bin/tidy"
> source-is-xml="false"
> result-is-xml="true"
> wrap-result-lines="false">
>
> <p:with-option name="args" select="'--quiet yes
> --show-warnings no --doctype omit --numeric-entities yes --output-xml
> yes'"/>
> </p:exec>
>
> <p:unwrap match="c:result"/>
>
> <p:identity/>
> </p:for-each>
>
> </p:declare-step>
>
>
> As you can see I tried some things to get some input to p:data but all
> without success.
>
> This would be a typical error message for some attempt using file:/// in
> combination with a variable:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as specified. /$file (No such file or
> directory)
> The same error is thrown with file:///c:file/@name:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as specified. /c:file/@name (No such file
> or directory)
> An attempt using p:resolve-uri without file protocol:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as specified. unknown protocol: p
> The same attempt using p:resolve-uri with file protocol:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as
> specified. /p:resolve-uri(concat($path,'/',c:file/@name)) (No such file
> or directory)
>
> The 'best' result comes from the attempt starting with concat, because
> the error at least contains the working directory:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as
> specified. /home/stefanie/Magisterarbeit/quellcode/xproc/concat($path,'/',c:file/@name) (No such file or directory)
>
> I hope this is enough information to help, I really don't know where I'm
> going wrong. Please let me know if you need further information. Many
> thanks,
>
> Stefanie
>
>
> --
> Stefanie Haupt
--
Alex
https://sites.google.com/a/utg.edu.gm/alex
Received on Tuesday, 1 December 2009 09:31:41 UTC