- From: Philip Fennell <Philip.Fennell@marklogic.com>
- Date: Wed, 3 Nov 2010 13:28:59 -0700
- To: Tony Rogers <tony@gonk.net>, XProc Dev <xproc-dev@w3.org>
- Message-ID: <D20C296D14127D4EBD176AD949D8A75A46D85383@EXCHG-BE.marklogic.com>
Tony,
This should do the trick:
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                name="for-each-url"
                version="1.0">
  <p:input port="source">
    <p:inline exclude-inline-prefixes="c p">
      <links xmlns="" xml:base="http://us.battle.net/sc2/en/">
        <link title="general" href="forum/40568/"/>
        <link title="wol-campaign" href="forum/13432/"/>
        <link title="terran" href="forum/13433/"/>
        <link title="protoss" href="forum/13434/"/>
        <link title="zerg" href="forum/13435/"/>
        <link title="multiplayer-and-esports" href="forum/13436/"/>
        <link title="custom-maps" href="forum/13437/"/>
        <link title="blizzcon" href="forum/692681/"/>
      </links>
    </p:inline>
  </p:input>
  <p:output port="result"/>
  <p:make-absolute-uris match="/links/link/@href"/>
  <p:for-each>
    <p:iteration-source select="/links/link"/>
    <p:variable name="uri" select="/link/@href"/>
    <p:identity name="request-template">
      <p:input port="source">
        <p:inline>
          <c:request method="GET" detailed="true"/>
        </p:inline>
      </p:input>
    </p:identity>
    <p:add-attribute name="set-uri" match="/c:request" attribute-name="href">
      <p:with-option name="attribute-value" select="$uri"/>
    </p:add-attribute>
    <p:http-request name="request"/>
  </p:for-each>
  <p:wrap-sequence wrapper="c:results"/>
</p:declare-step>
I've tried it on the first two links; the whole lot might take some time. I ran it in oXygen 12 with Calabash.
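Since the original question also asks about storing the responses, here is a possible variation on the pipeline above that writes each response to its own file instead of wrapping them all in c:results. This is an untested sketch: the $file variable and the title-based file names are assumptions made for illustration, only the first two links are included to keep it short, and relative href values on p:store should resolve against the pipeline document's own base URI.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                name="store-each-url"
                version="1.0">
  <p:input port="source">
    <p:inline exclude-inline-prefixes="c p">
      <links xmlns="" xml:base="http://us.battle.net/sc2/en/">
        <link title="general" href="forum/40568/"/>
        <link title="wol-campaign" href="forum/13432/"/>
      </links>
    </p:inline>
  </p:input>
  <!-- No p:output: every response is written to disk by p:store. -->
  <p:make-absolute-uris match="/links/link/@href"/>
  <p:for-each>
    <p:iteration-source select="/links/link"/>
    <p:variable name="uri" select="/link/@href"/>
    <!-- Assumption: name each output file after the link's title. -->
    <p:variable name="file" select="concat(/link/@title, '.xml')"/>
    <p:identity>
      <p:input port="source">
        <p:inline>
          <c:request method="GET" detailed="true"/>
        </p:inline>
      </p:input>
    </p:identity>
    <p:add-attribute match="/c:request" attribute-name="href">
      <p:with-option name="attribute-value" select="$uri"/>
    </p:add-attribute>
    <p:http-request/>
    <!-- Write the c:response document for this link to its own file. -->
    <p:store>
      <p:with-option name="href" select="$file"/>
    </p:store>
  </p:for-each>
</p:declare-step>
```

Both variables are declared at the top of the p:for-each so that they are evaluated against the current link element, before p:http-request replaces it as the default readable port.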
Philip Fennell
Consultant
MarkLogic Corporation
88 Wood Street, London. EC2V 7RS
Mobile: +44 (0) 7824 830 866
email Philip.Fennell@marklogic.com
web www.marklogic.com
From: xproc-dev-request@w3.org [mailto:xproc-dev-request@w3.org] On Behalf Of Tony Rogers
Sent: 03 November, 2010 7:20 PM
To: XProc Dev
Subject: Loop over URLs with http-request...how?
Hey everybody,
I'm working on a class project and I need to collect data for it. I'm trying to use XProc to loop over a bunch of URLs and store the responses, but I am practically banging my head against the wall trying to figure out how to do this simple thing. Could anybody give me a hint? :)
Here's my pipeline:
<?xml version='1.0' encoding='UTF-8'?>
<pipeline
    xmlns="http://www.w3.org/ns/xproc"
    xmlns:local="#empty"
    xmlns:p="http://www.w3.org/ns/xproc"
    xmlns:c="http://www.w3.org/ns/xproc-step"
    xmlns:cx="http://xmlcalabash.com/ns/extensions"
    name="INFM298I-data-collection"
    version="1.0">
  <import href="http://xmlcalabash.com/extension/steps/library-1.0.xpl" />
  <group>
    <variable name="forum-url-prefix" select="'http://us.battle.net/sc2/en/'"/>
    <variable name="general" select="concat($forum-url-prefix,'forum/40568/')"/>
    <variable name="wol-campaign" select="concat($forum-url-prefix,'forum/13432/')"/>
    <variable name="terran" select="concat($forum-url-prefix,'forum/13433/')"/>
    <variable name="protoss" select="concat($forum-url-prefix,'forum/13434/')"/>
    <variable name="zerg" select="concat($forum-url-prefix,'forum/13435/')"/>
    <variable name="multiplayer-and-esports"
              select="concat($forum-url-prefix,'forum/13436/')"/>
    <variable name="custom-maps" select="concat($forum-url-prefix,'forum/13437/')"/>
    <variable name="blizzcon" select="concat($forum-url-prefix,'forum/692681/')"/>
    <for-each>
      <iteration-source
          select="
            $general,
            $wol-campaign,
            $terran,
            $protoss,
            $zerg,
            $multiplayer-and-esports,
            $custom-maps,
            $blizzcon
          "
      />
      <http-request>
        <input port="source">
          <inline>
            <c:request method="GET" detailed="true"
                       href="http://us.battle.net/sc2/en/forum/40568/">
            </c:request>
          </inline>
        </input>
      </http-request>
    </for-each>
  </group>
  <sink />
</pipeline>
Received on Wednesday, 3 November 2010 20:29:29 UTC