
RE: Loop over URLs with http-request...how?

From: Philip Fennell <Philip.Fennell@marklogic.com>
Date: Wed, 3 Nov 2010 13:28:59 -0700
To: Tony Rogers <tony@gonk.net>, XProc Dev <xproc-dev@w3.org>
Message-ID: <D20C296D14127D4EBD176AD949D8A75A46D85383@EXCHG-BE.marklogic.com>
Tony,

This should do the trick:

<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                name="for-each-url"
                version="1.0">
    <p:input port="source">
        <p:inline exclude-inline-prefixes="c p">
            <links xmlns="" xml:base="http://us.battle.net/sc2/en/">
                <link title="general" href="forum/40568/"/>
                <link title="wol-campaign" href="forum/13432/"/>
                <link title="terran" href="forum/13433/"/>
                <link title="protoss" href="forum/13434/"/>
                <link title="zerg" href="forum/13435/"/>
                <link title="multiplayer-and-esports" href="forum/13436/"/>
                <link title="custom-maps" href="forum/13437/"/>
                <link title="blizzcon" href="forum/692681/"/>
            </links>
        </p:inline>
    </p:input>
    <p:output port="result"/>

    <p:make-absolute-uris match="/links/link/@href"/>

    <p:for-each>
        <p:iteration-source select="/links/link"/>

        <p:variable name="uri" select="/link/@href"/>

        <p:identity name="request-template">
            <p:input port="source">
                <p:inline>
                    <c:request method="GET" detailed="true"/>
                </p:inline>
            </p:input>
        </p:identity>

        <p:add-attribute name="set-uri" match="/c:request" attribute-name="href">
            <p:with-option name="attribute-value" select="$uri"/>
        </p:add-attribute>

        <p:http-request name="request"/>

    </p:for-each>

    <p:wrap-sequence wrapper="c:results"/>
</p:declare-step>


I've only tried it on the first two links; the whole lot might take some time. I ran it in oXygen 12 with Calabash.
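
If the aim is to store each response in its own file rather than wrap them all in one document, a variant along the following lines might work. This is an untested sketch, not part of the pipeline above: the step name "store-each-url", the "file" variable, and the choice to build each file name from the link's title attribute are assumptions made for illustration. It is cut down to two links for brevity. Because p:store has no primary output, the p:output declaration and the p:wrap-sequence step are dropped here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                name="store-each-url"
                version="1.0">
    <p:input port="source">
        <p:inline exclude-inline-prefixes="c p">
            <links xmlns="" xml:base="http://us.battle.net/sc2/en/">
                <link title="general" href="forum/40568/"/>
                <link title="wol-campaign" href="forum/13432/"/>
            </links>
        </p:inline>
    </p:input>

    <!-- Resolve each relative href against the xml:base on <links> -->
    <p:make-absolute-uris match="/links/link/@href"/>

    <p:for-each>
        <p:iteration-source select="/links/link"/>

        <!-- Both variables are evaluated against the current <link> -->
        <p:variable name="uri" select="/link/@href"/>
        <!-- Assumption: name each output file after the link's title -->
        <p:variable name="file" select="concat(/link/@title, '.xml')"/>

        <p:identity>
            <p:input port="source">
                <p:inline>
                    <c:request method="GET" detailed="true"/>
                </p:inline>
            </p:input>
        </p:identity>

        <p:add-attribute match="/c:request" attribute-name="href">
            <p:with-option name="attribute-value" select="$uri"/>
        </p:add-attribute>

        <p:http-request/>

        <!-- Write this iteration's c:response document to its own file -->
        <p:store>
            <p:with-option name="href" select="$file"/>
        </p:store>
    </p:for-each>
</p:declare-step>
```

A relative href on p:store should resolve against the base URI of the pipeline document, so the files would be expected to land next to the .xpl file. Pass an absolute URI if they need to go elsewhere.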




Philip Fennell
Consultant
MarkLogic Corporation

88 Wood Street, London. EC2V 7RS

Mobile: +44 (0) 7824 830 866

email  Philip.Fennell@marklogic.com
web    www.marklogic.com



From: xproc-dev-request@w3.org [mailto:xproc-dev-request@w3.org] On Behalf Of Tony Rogers
Sent: 03 November, 2010 7:20 PM
To: XProc Dev
Subject: Loop over URLs with http-request...how?

Hey everybody,

I'm working on a class project and I need to collect data for it.  I'm trying to use XProc to loop over a bunch of URLs and store the responses, but I am practically banging my head against the wall trying to figure out how to do this simple thing.  Could anybody give me a hint?  :)

Here's my pipeline:

<?xml version='1.0' encoding='UTF-8'?>
<pipeline
    xmlns="http://www.w3.org/ns/xproc"
    xmlns:local="#empty"
    xmlns:p="http://www.w3.org/ns/xproc"
    xmlns:c="http://www.w3.org/ns/xproc-step"
    xmlns:cx="http://xmlcalabash.com/ns/extensions"
    name="INFM298I-data-collection"
    version="1.0">


    <import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl" />


    <group>
        <variable name="forum-url-prefix"   select="'http://us.battle.net/sc2/en/'"/>

        <variable name="general"            select="concat($forum-url-prefix,'forum/40568/')"/>
        <variable name="wol-campaign"       select="concat($forum-url-prefix,'forum/13432/')"/>
        <variable name="terran"             select="concat($forum-url-prefix,'forum/13433/')"/>
        <variable name="protoss"            select="concat($forum-url-prefix,'forum/13434/')"/>
        <variable name="zerg"               select="concat($forum-url-prefix,'forum/13435/')"/>
        <variable name="multiplayer-and-esports"
                                            select="concat($forum-url-prefix,'forum/13436/')"/>
        <variable name="custom-maps"        select="concat($forum-url-prefix,'forum/13437/')"/>
        <variable name="blizzcon"           select="concat($forum-url-prefix,'forum/692681/')"/>

        <for-each>
            <iteration-source
                select="
                    $general,
                    $wol-campaign,
                    $terran,
                    $protoss,
                    $zerg,
                    $multiplayer-and-esports,
                    $custom-maps,
                    $blizzcon
                    "
            />

            <http-request>
                <input port="source">
                    <inline>
                        <c:request method="GET" detailed="true"
                            href="http://us.battle.net/sc2/en/forum/40568/">

                        </c:request>
                    </inline>
                </input>
            </http-request>
        </for-each>
    </group>

    <sink />

</pipeline>
Received on Wednesday, 3 November 2010 20:29:29 GMT
