
Dealing with encoding

From: Stefanie Haupt <st.haupt@gmail.com>
Date: Wed, 17 Feb 2010 11:07:30 +0100
To: xproc-dev@w3.org
Message-Id: <1266401250.3976.110.camel@stefanie-laptop>
Hi all,

I have some HTML files with messy, inconsistent encodings. As a first
step I want to clean them up with HTML Tidy, and then run some further
operations on them, all controlled by an XProc pipeline. Since there is
more than one file, my understanding is that I should use
p:http-request together with the file: protocol (the data is local).

Because the files are not all in the same encoding, I thought of using
p:try/p:catch: try windows-1252 first and fall back to UTF-8 if that
fails. However, the p:group part of the p:try is either ignored or
always fails, because the p:catch part is invoked for every file. Could
you please have a look and tell me what I'm doing wrong here?

I'm using Calabash from within <oXygen/> XML Editor 11.1, build
2009121712 on Linux (Ubuntu).

<p:try>
  <p:group>
    <p:http-request encoding="windows-1252"/>
    <p:exec command="/usr/bin/tidy" source-is-xml="false"
            result-is-xml="true" wrap-result-lines="false"
            encoding="windows-1252">
      <p:with-option name="args" select="'--quiet yes --show-warnings no --output-xml yes --bare yes --doctype omit --numeric-entities yes --char-encoding win1252'"/>
    </p:exec>
    <p:exec name="iconv" command="/usr/bin/iconv" result-is-xml="true"
            source-is-xml="true" wrap-result-lines="false"
            encoding="windows-1252">
      <p:with-option name="args" select="'-f WINDOWS-1252 -t UTF-8'"/>
    </p:exec>
  </p:group>

  <p:catch>
    <p:http-request/>
    <p:exec command="/usr/bin/tidy" source-is-xml="false"
            result-is-xml="true" wrap-result-lines="false">
      <p:with-option name="args" select="'--quiet yes --show-warnings no --output-xml yes --bare yes --doctype omit --numeric-entities yes --char-encoding utf8'"/>
    </p:exec>
  </p:catch>
</p:try>
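
To find out why the catch branch runs for every file, it may help to
look at the error that triggers it. The following is only a minimal
sketch, assuming plain XProc 1.0 semantics: the step names "main" and
"fallback" are hypothetical, and the p:identity inside p:group merely
stands in for the real p:http-request and p:exec steps above. It is not
a fix for the pipeline, just a way to make the error visible.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                version="1.0" name="main">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:try>
    <p:group>
      <!-- Placeholder: put the steps that may fail here. -->
      <p:identity/>
    </p:group>

    <!-- Naming the p:catch makes its implicit "error" port readable. -->
    <p:catch name="fallback">
      <!-- Instead of retrying, emit the error document so that the
           reason the p:group failed shows up on the result port. -->
      <p:identity>
        <p:input port="source">
          <p:pipe step="fallback" port="error"/>
        </p:input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

If the group fails, the result port should carry the error document
describing the failure rather than the processed file, which would show
which of the steps in the group is raising the error.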

Many thanks for your help!
Stefanie
Received on Wednesday, 17 February 2010 10:08:05 GMT
