
Re: Loop over URLs with http-request...how?

From: Philip Fennell <Philip.Fennell@marklogic.com>
Date: Tue, 9 Nov 2010 23:25:22 -0800
To: "'tony@gonk.net'" <tony@gonk.net>
CC: "'xproc-dev@w3.org'" <xproc-dev@w3.org>
Message-ID: <D20C296D14127D4EBD176AD949D8A75A44298F9A@EXCHG-BE.marklogic.com>
Tony,

I'll be without internet access for most of this morning. I'll have a look again this afternoon.

However, looking at your example, I can see the error message is correct. The way you construct the c:request won't work because the p:with-option is inside the p:inline, where it is treated as literal content and therefore won't be evaluated. You do need to construct the c:request before the p:http-request, as I did, and use the p:add-attribute step to add the href attribute.
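
For illustration, a minimal sketch of that fix as a loop body. It assumes the link elements are in no namespace, as in the original pipeline quoted below, and it has not been run against Calabash 0.9.23:

```xml
<p:for-each xmlns:p="http://www.w3.org/ns/xproc"
            xmlns:c="http://www.w3.org/ns/xproc-step">
    <p:iteration-source select="/links/link"/>

    <!-- Evaluated by the processor: a child of p:for-each, not of p:inline -->
    <p:variable name="uri" select="/link/@href"/>

    <!-- Literal template: everything inside p:inline is copied as-is -->
    <p:identity>
        <p:input port="source">
            <p:inline>
                <c:request method="GET" detailed="true"/>
            </p:inline>
        </p:input>
    </p:identity>

    <!-- p:with-option is attached to a step here, so $uri is evaluated -->
    <p:add-attribute match="/c:request" attribute-name="href">
        <p:with-option name="attribute-value" select="$uri"/>
    </p:add-attribute>

    <!-- Reads the completed c:request from the step above -->
    <p:http-request/>
</p:for-each>
```

The key point is that p:with-option only has an effect as a direct child of a step; placed inside c:request it is just another element in the request document.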

When I said I'd tested it on the first link, I found my pipeline was able to retrieve that page. However, when I ran it against all the links it was taking a very long time to finish.

Are you sure that all the pages can be retrieved? Have a look at the Calabash extensions. There's a timeout attribute for http-request. Try that and see if the pipeline completes.
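
One way to try that, as a sketch only. The cx:timeout attribute name and its millisecond unit are assumptions to check against the documentation for your Calabash version. The p:try wrapper and its placeholder failed element are hypothetical additions, so that one unreachable page does not stop the whole loop. The fragment is meant to replace the bare p:http-request inside the p:for-each:

```xml
<p:try xmlns:p="http://www.w3.org/ns/xproc"
       xmlns:cx="http://xmlcalabash.com/ns/extensions">
    <p:group>
        <!-- cx:timeout is assumed; the value is assumed to be in milliseconds -->
        <p:http-request cx:timeout="30000"/>
    </p:group>
    <p:catch>
        <!-- Hypothetical placeholder emitted if the request raises an error -->
        <p:identity>
            <p:input port="source">
                <p:inline>
                    <failed/>
                </p:inline>
            </p:input>
        </p:identity>
    </p:catch>
</p:try>
```

With this in place, the final p:wrap-sequence would collect a mix of c:response and failed elements, which makes it easier to see which pages are the slow or unreachable ones.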

Regards

Philip

________________________________
From: Tony Rogers
To: Philip Fennell
Cc: XProc Dev
Sent: Tue Nov 09 10:40:26 2010
Subject: Re: Loop over URLs with http-request...how?

Alas, I tried this code and it errors.

I'm using Oxygen 12 with its embedded Calabash, which turns out to be version 0.9.23.

So, originally you gave me the following:

On Nov 3, 2010, at 4:28 PM, Philip Fennell wrote:

Tony,

This should do the trick:

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                name="for-each-url"
                version="1.0">
    <p:input port="source">
        <p:inline exclude-inline-prefixes="c p">
            <links xmlns="" xml:base="http://us.battle.net/sc2/en/">
                <link title="general" href="forum/40568/"/>
                <link title="wol-campaign" href="forum/13432/"/>
                <link title="terran" href="forum/13433/"/>
                <link title="protoss" href="forum/13434/"/>
                <link title="zerg" href="forum/13435/"/>
                <link title="multiplayer-and-esports" href="forum/13436/"/>
                <link title="custom-maps" href="forum/13437/"/>
                <link title="blizzcon" href="forum/692681/"/>
            </links>
        </p:inline>
    </p:input>
    <p:output port="result"/>

    <p:make-absolute-uris match="/links/link/@href"/>

    <p:for-each>
        <p:iteration-source select="/links/link"/>

        <p:variable name="uri" select="/link/@href"/>

        <p:identity name="request-template">
            <p:input port="source">
                <p:inline>
                    <c:request method="GET" detailed="true"/>
                </p:inline>
            </p:input>
        </p:identity>

        <p:add-attribute name="set-uri" match="/c:request" attribute-name="href">
            <p:with-option name="attribute-value" select="$uri"/>
        </p:add-attribute>

        <p:http-request name="request"/>

    </p:for-each>

    <p:wrap-sequence wrapper="c:results"/>
</p:declare-step>



Originally you said you tried it with the first request, but that didn't work for me.  I figured it was probably something small making the difference, so I tried to modify it to make it work.  In the interest of reducing possible failure points, I tried to simplify the code.  The result is below.

However I still don't have a working pipeline, and I kinda need this to work in order to complete a school project.  I'm starting to get nervous.  I humbly beg for some help!  0:-/



<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet type="text/css" href="/Users/amrogers/Developer/Applications/oxygen/frameworks/xproc/css/xproc.css"?>
<p:declare-step
    name="for-each-url"
    xmlns:p="http://www.w3.org/ns/xproc"
    xmlns:c="http://www.w3.org/ns/xproc-step"
    xmlns:cx="http://xmlcalabash.com/ns/extensions"
    xmlns:local="#empty"
    xmlns="#empty"
    version="1.0">

    <p:input port="source">
        <p:inline exclude-inline-prefixes="c">
            <links
                xmlns="#empty"
                xml:base="http://us.battle.net/sc2/en/">
                <link title="general" href="forum/40568/" />
                <link title="wol-campaign" href="forum/13432/" />
                <link title="terran" href="forum/13433/" />
                <link title="protoss" href="forum/13434/" />
                <link title="zerg" href="forum/13435/" />
                <link title="multiplayer-and-esports" href="forum/13436/" />
                <link title="custom-maps" href="forum/13437/" />
                <link title="blizzcon" href="forum/692681/" />
            </links>
        </p:inline>
    </p:input>
    <p:output port="result">
        <p:pipe step="store-results" port="result" />
    </p:output>

    <p:make-absolute-uris match="/links/link/@href" />

    <p:for-each>
        <p:iteration-source select="/links/link" />

        <p:variable name="uri" select="/link/@href" />

        <p:http-request>
            <p:input port="source">
                <p:inline>
                    <c:request
                        method="GET"
                        detailed="true">
                        <!-- for some reason Calabash complains that @href is not set for the c:request -->
                        <p:with-option name="href" select="$uri" />
                    </c:request>
                </p:inline>
            </p:input>
        </p:http-request>
    </p:for-each>

    <p:wrap-sequence wrapper="c:results" />

    <p:store
        name="store-results"
        href="./results/results.xml"
        encoding="UTF-8"
        indent="true"
    />

</p:declare-step>


Received on Wednesday, 10 November 2010 07:25:54 GMT
