
Re: p:data in p:for-each

From: Alex Muir <alex.g.muir@gmail.com>
Date: Tue, 1 Dec 2009 09:31:00 +0000
Message-ID: <88b533b90912010131p729e0af2wb564c0a3d5fcd4cd@mail.gmail.com>
To: Stefanie Haupt <st.haupt@gmail.com>
Cc: xproc-dev@w3.org
Hi,

Would reading the files in as unparsed text from an XSLT stylesheet
work for this case? I'm not sure how that would play out with tidying.
You could maybe wrap the input in an element.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:c="http://www.w3.org/ns/xproc-step"
  xmlns:cx="http://xmlcalabash.com/ns/extensions"
  name="ReadUnparsed">


  <p:input port="source">
    <p:document href="blank.xml"/>
  </p:input>


  <p:output port="result" sequence="true"/>


  <p:variable name="source-folder" select="'../HTML/'"/>
  <p:variable name="output-folder" select="'../XML/'"/>

  <p:directory-list>
    <p:with-option name="path" select="$source-folder">
      <p:empty/>
    </p:with-option>
  </p:directory-list>

  <p:for-each name="forEachFile">

    <p:iteration-source select="//c:file"/>

    <p:variable name="fileName" select="c:file/@name"/>


    <p:xslt name="ReadUnparsedText">
      <p:input port="source"/>
      <p:input port="stylesheet">
        <p:document href="../XSLT/ReadUnparsedText.xsl"/>
      </p:input>
      <p:with-param name="input_uri" select="concat($source-folder,$fileName)"/>
      <p:input port="parameters">
        <p:empty/>
      </p:input>
    </p:xslt>

  </p:for-each>

</p:declare-step>


Then, in the XSLT, declare input_uri as an xsl:param and do something
with unparsed-text(), for example:
<xsl:for-each select="tokenize(unparsed-text($input_uri, 'ISO-8859-1'), '\r?\n')">
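A minimal sketch of what ReadUnparsedText.xsl could look like, assuming
an XSLT 2.0 processor (unparsed-text() and tokenize() are not available
in XSLT 1.0). The wrapper element names "text" and "line" are
placeholders chosen for illustration, not something from the original
pipeline. Note that a relative input_uri is resolved against the
stylesheet's base URI, so '../HTML/' here would be a sibling of the
XSLT folder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Set by the pipeline through p:with-param name="input_uri" -->
  <xsl:param name="input_uri"/>

  <!-- The source document only triggers the transform; its content is ignored -->
  <xsl:template match="/">
    <!-- "text" and "line" are hypothetical wrapper element names -->
    <text href="{$input_uri}">
      <!-- Read the file as plain text and split it into lines -->
      <xsl:for-each select="tokenize(unparsed-text($input_uri, 'ISO-8859-1'), '\r?\n')">
        <line><xsl:value-of select="."/></line>
      </xsl:for-each>
    </text>
  </xsl:template>

</xsl:stylesheet>
```

Since xsl:value-of writes each line as a text node, any markup in the
HTML file ends up escaped in the result, so the wrapped text would
still need to be handed to tidy in a later step.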


Alex




On Mon, Nov 30, 2009 at 6:35 PM, Stefanie Haupt <st.haupt@gmail.com> wrote:
> Hi list,
>
> I'm using xproc to loop over a directory with various file types from
> which I only want to process the html files into a pipeline to tidy them
> up in a first step. The thing is, I don't know how to address them with
> p:data while being in the loop. There's no problem in running tidy for a
> single file where there is an absolute path given in p:data.
>
> What I'm using: Calabash 0.9.15 from within Oxygen 11, nothing changed
> from the start.
>
> This would be a typical directory:
> Directory
>        file1.htm
>        file2.htm
>        someotherfile.htm
>        picture.jpg
>        anotherpicture.jpg
>
> This is my xproc (further below are some error-messages posted):
>
> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
> xmlns:c="http://www.w3.org/ns/xproc-step"
>  xmlns:cx="http://xmlcalabash.com/ns/extensions" name="myPipeline">
>  <p:input port="source" sequence="true"/>
>  <p:output port="result" sequence="true">
>    <p:pipe port="result" step="fileloop"/>
>  </p:output>
>
> <!-- declare path -->
>  <p:variable name="path"
>    select="'file:///home/stefanie/Magisterarbeit/quellcode/xproc/mini-test-db/gba/80tage'">
>    <p:empty/>
>  </p:variable>
>
>  <!-- list directory -->
>  <p:directory-list name="directories">
>    <p:with-option name="path" select="$path">
>  <p:empty/>
>    </p:with-option>
>  </p:directory-list>
>
>      <!-- show complete path  -->
>
>      <p:make-absolute-uris match="c:file/@name" name="uri">
>        <p:with-option name="base-uri"
> select="p:resolve-uri(concat($path, '/',
>        c:file/@name))"/>
>      </p:make-absolute-uris>
>
>       <!-- excluding other filetypes works great -->
>       <p:filter select="//c:file[matches(@name, 'htm')]"
> name="filter"/>
>
>      <!-- loop over files and do some magic html tidy. The only problem
> is: I can't get the files to
>        tidy because I'm probably doing something wrong on p:data  -->
>
>      <p:for-each name="fileloop">
>        <p:output port="result" sequence="true"/>
>
>        <p:variable name="file" select="p:resolve-uri(concat($path, '/',
> c:file/@name))"/>
>
>        <p:identity>
>          <p:input port="source">
>
>           <!-- <p:data href="file:///$file"/>-->
>          <!--  <p:data href="file:///c:directory/c:file/@name"/>-->
> <!--           <p:data href="concat($path,'/',c:file/@name)"
> content-type="string"/>-->
>        <!--    <p:data
> href="p:resolve-uri(concat($path,'/',c:file/@name))"/>        -->
>        <!--    <p:data
> href="file:///p:resolve-uri(concat($path,'/',c:file/@name))"/>-->
>  <!--          <p:data href="file://$file"/>-->
>    <!--        <p:data href="file:///c:file/@name"/>-->
>          </p:input>
>        </p:identity>
>
>        <p:exec command="/usr/bin/tidy"
>          source-is-xml="false"
>          result-is-xml="true"
>          wrap-result-lines="false">
>
>          <p:with-option name="args" select="'--quiet yes
> --show-warnings no --doctype omit --numeric-entities yes --output-xml
> yes'"/>
>        </p:exec>
>
>        <p:unwrap match="c:result"/>
>
>        <p:identity/>
>      </p:for-each>
>
> </p:declare-step>
>
>
> As you can see I tried some things to get some input to p:data but all
> without success.
>
> This would be a typical error message for some attempt using file:/// in
> combination with a variable:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as specified. /$file (No such file or
> directory)
> The same error is thrown with file:///c:file/@name:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as specified. /c:file/@name (No such file
> or directory)
> An attempt using p:resolve-uri without file protocol:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as specified. unknown protocol: p
> The same attempt using p:resolve-uri with file protocol:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as
> specified. /p:resolve-uri(concat($path,'/',c:file/@name)) (No such file
> or directory)
>
> The 'best' result comes from the attempt starting with concat, because
> it at least contains the working directory:
> E [Calabash XProc] XD0029 : XProc error err:XD0029 It is a dynamic error
> if the document referenced by a p:data element does not exist, cannot be
> accessed, or cannot be encoded as
> specified. /home/stefanie/Magisterarbeit/quellcode/xproc/concat($path,'/',c:file/@name) (No such file or directory)
>
> I hope this is enough information to help, I really don't know where I'm
> going wrong. Please let me know if you need further information. Many
> thanks,
>
> Stefanie
>
>
> --
> Stefanie Haupt
>
>
>
>



-- 

Alex
https://sites.google.com/a/utg.edu.gm/alex
Received on Tuesday, 1 December 2009 09:31:41 GMT
