Dealing with encoding

From: Stefanie Haupt <st.haupt@gmail.com>
Date: Wed, 17 Feb 2010 11:07:30 +0100
To: xproc-dev@w3.org
Message-Id: <1266401250.3976.110.camel@stefanie-laptop>
Hi all,

I have some HTML data with messy encodings which I want to process, in a
first step with HTML Tidy, and then with some further operations controlled
by an XProc pipeline. Since there is more than one file, I understand I should use
p:http-request in combination with file protocol (since it's local
So I thought of using try/catch, but the try group is either ignored or
never succeeds, as the catch part is invoked for all files. Can you please
have a look and tell me what I'm doing wrong here?
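
For context, a try/catch in XProc 1.0 generally has the shape sketched below: p:try wraps a p:group holding the steps to attempt and a p:catch holding the fallback. The p:identity steps are placeholders assumed for illustration only, not the steps from the actual pipeline.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:try>
    <p:group>
      <!-- Placeholder (assumption): the steps that may fail go here.
           If any of them raises a dynamic error, the whole group
           is abandoned and the catch branch runs instead. -->
      <p:identity/>
    </p:group>
    <p:catch>
      <!-- Placeholder (assumption): the fallback steps go here.
           This branch runs only when the group above has failed. -->
      <p:identity/>
    </p:catch>
  </p:try>
</p:declare-step>
```

Both branches must expose the same outputs; here each ends in p:identity, so each gets the same implicit primary output. If the catch branch runs for every input, reading the error port that p:catch provides may help show why the group failed.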

I'm using Calabash from within <oXygen/> XML Editor 11.1, build
2009121712 on Linux (Ubuntu).

    <p:http-request encoding="windows-1252"/>
    <p:exec command="/usr/bin/tidy" source-is-xml="false"
          result-is-xml="true" wrap-result-lines="false">
      <p:with-option name="args" select="'--quiet yes --show-warnings no
--output-xml yes --bare yes --doctype omit --numeric-entities yes
--char-encoding win1252'"/>
    </p:exec>
    <p:exec name="iconv" command="/usr/bin/iconv" result-is-xml="true"
          source-is-xml="true" wrap-result-lines="false">
      <p:with-option name="args" select="'-f WINDOWS-1252 -t UTF-8'"/>
    </p:exec>

    <p:exec command="/usr/bin/tidy" source-is-xml="false"
          result-is-xml="true" wrap-result-lines="false">
       <p:with-option name="args" select="'--quiet yes --show-warnings
no --output-xml yes --bare yes --doctype omit --numeric-entities yes
--char-encoding utf8'"/>
    </p:exec>

Many thanks for your help!
Received on Wednesday, 17 February 2010 10:08:05 UTC