
Re: Loop over URLs with http-request...how?

From: Tony Rogers <tony@gonk.net>
Date: Tue, 9 Nov 2010 18:52:00 -0500
Cc: Philip Fennell <Philip.Fennell@marklogic.com>, XProc Dev <xproc-dev@w3.org>
Message-Id: <A1C98829-5A78-4D03-8E6C-CAED382F1204@gonk.net>
To: Tony Rogers <tony@gonk.net>
Whoa.  

Hacked at the XProc until I got something that works.  It's messy, but I got the complete set of results all wrapped up in a single XML document.  The result was nearly 30K lines long.  :)

Here's the (messy but functional) XProc script I ended up with:

<?xml version='1.0' encoding='UTF-8'?>
<p:declare-step 
	name="for-each-url" 
	xmlns:p="http://www.w3.org/ns/xproc" 
	xmlns:c="http://www.w3.org/ns/xproc-step"
	xmlns:cx="http://xmlcalabash.com/ns/extensions"
	xmlns:local="#empty"
	xmlns="#empty"
	version="1.0">

	<p:input port="source" primary="true" kind="document">
		<p:inline exclude-inline-prefixes="p c cx #default">
			<local:links xml:base="http://us.battle.net/sc2/en/">
				<local:link title="general" href="forum/40568/"/>
				<local:link title="wol-campaign" href="forum/13432/"/>
				<local:link title="terran" href="forum/13433/"/>
				<local:link title="protoss" href="forum/13434/"/>
				<local:link title="zerg" href="forum/13435/"/>
				<local:link title="multiplayer-and-esports" href="forum/13436/"/>
				<local:link title="custom-maps" href="forum/13437/"/>
				<local:link title="blizzcon" href="forum/692681/"/>
			</local:links>
		</p:inline>
	</p:input>
	<p:output port="result" primary="true" sequence="false">
		<!--<p:pipe step="store-results" port="result" />-->
	</p:output>
	
	
	<p:make-absolute-uris match="//@href">
		<p:with-option name="base-uri" select="//@xml:base" />
	</p:make-absolute-uris>
	
	

	<p:for-each>
		<p:iteration-source select="/local:links/local:link" />
		
		<p:rename match="/local:link" new-name="request" new-prefix="c" new-namespace="http://www.w3.org/ns/xproc-step"  />
		<p:add-attribute 
			match="c:request"
			attribute-name="method" 
			attribute-value="GET"
		/>
		<p:add-attribute 
			match="c:request"
			attribute-name="detailed" 
			attribute-value="true" 
		/>
		<p:namespace-rename apply-to="all" from="#empty" to="" />
		<p:delete match="//@title" />
		
		<p:http-request  />
	</p:for-each>
	
	
	<p:wrap-sequence wrapper="c:results" />
	<p:identity />
	
	<p:store 
		name="store-results"
		href="./results/results.xml"  
		encoding="UTF-8" 
		indent="true"  
	/>
</p:declare-step>
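
For anyone who wants to sanity-check what the p:for-each body should be handing to p:http-request, here is a rough Python sketch of the same transformation: resolve each href against xml:base, then build one c:request per link with method="GET" and detailed="true", leaving @title out. It is only an illustration, not part of the pipeline. The function name build_requests and the two-link sample are made up for the example, and it assumes the no-namespace links/link markup from Philip's version rather than my "#empty" namespace. It makes no HTTP calls.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# Namespace of c:request, as declared in the pipeline.
C_NS = "http://www.w3.org/ns/xproc-step"
ET.register_namespace("c", C_NS)

# xml:base as ElementTree exposes it (the xml prefix is predefined).
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

# Shortened sample input (assumption: no-namespace links/link elements).
LINKS_XML = """
<links xml:base="http://us.battle.net/sc2/en/">
  <link title="general" href="forum/40568/"/>
  <link title="terran" href="forum/13433/"/>
</links>
""".strip()


def build_requests(links_xml):
    """Return one c:request element per link, with an absolute href."""
    root = ET.fromstring(links_xml)
    base = root.get(XML_BASE, "")
    reqs = []
    for link in root.findall("link"):
        req = ET.Element(f"{{{C_NS}}}request")
        req.set("method", "GET")
        req.set("detailed", "true")
        # Same job as p:make-absolute-uris: resolve href against the base.
        req.set("href", urljoin(base, link.get("href")))
        # @title is deliberately not copied (the pipeline deletes it).
        reqs.append(req)
    return reqs


reqs = build_requests(LINKS_XML)
for req in reqs:
    print(ET.tostring(req, encoding="unicode"))
```

Each printed element should carry an absolute href plus the method and detailed attributes, which is the shape p:http-request expects on its source port.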


Thanks to Philip for getting me 90% of the way there! :)

—Tony


On Nov 9, 2010, at 1:40 PM, Tony Rogers wrote:

> Alas, I tried this code and got errors.  
> 
> I'm using Oxygen 12 with whatever the embedded Calabash is.  Actually, let me check the version: 0.9.23!
> 
> So, originally you gave me the following:
> 
> On Nov 3, 2010, at 4:28 PM, Philip Fennell wrote:
> 
>> Tony,
>>  
>> This should do the trick:
>>  
>> <?xml version="1.0" encoding="UTF-8"?>
>> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
>>                         xmlns:c="http://www.w3.org/ns/xproc-step" 
>>                         name="for-each-url"
>>                         version="1.0">
>>             <p:input port="source">
>>                         <p:inline exclude-inline-prefixes="c p">
>>                                     <links xmlns="" xml:base="http://us.battle.net/sc2/en/">
>>                                                 <link title="general" href="forum/40568/"/>
>>                                                 <link title="wol-campaign" href="forum/13432/"/>
>>                                                 <link title="terran" href="forum/13433/"/>
>>                                                 <link title="protoss" href="forum/13434/"/>
>>                                                 <link title="zerg" href="forum/13435/"/>
>>                                                 <link title="multiplayer-and-esports" href="forum/13436/"/>
>>                                                 <link title="custom-maps" href="forum/13437/"/>
>>                                                 <link title="blizzcon" href="forum/692681/"/>
>>                                     </links>
>>                         </p:inline>
>>             </p:input>
>>             <p:output port="result"/>
>>             
>>             <p:make-absolute-uris match="/links/link/@href"/>
>>             
>>             <p:for-each>
>>                         <p:iteration-source select="/links/link"/>
>>                         
>>                         <p:variable name="uri" select="/link/@href"/>
>>                         
>>                         <p:identity name="request-template">
>>                                     <p:input port="source">
>>                                                 <p:inline>
>>                                                             <c:request method="GET" detailed="true"/>
>>                                                 </p:inline>
>>                                     </p:input>
>>                         </p:identity>
>>                         
>>                         <p:add-attribute name="set-uri" match="/c:request" attribute-name="href">
>>                                     <p:with-option name="attribute-value" select="$uri"/>
>>                         </p:add-attribute>
>>                         
>>                         <p:http-request name="request"/>
>>                         
>>             </p:for-each>
>>             
>>             <p:wrap-sequence wrapper="c:results"/>
>> </p:declare-step>
> 
> 
> 
> Originally you said you tried it with the first request, but that didn't work for me.  I figured it was probably something small making the difference, so I tried to modify it to make it work.  In the interest of reducing possible failure points, I tried to simplify the code.  The result is below.
> 
> However, I still don't have a working pipeline, and I kinda need this to work in order to complete a school project.  I'm starting to get nervous.  I humbly beg for some help!  0:-/
> 
> 
> 
> <?xml version='1.0' encoding='UTF-8'?>
> <?xml-stylesheet type="text/css" href="/Users/amrogers/Developer/Applications/oxygen/frameworks/xproc/css/xproc.css"?>
> <p:declare-step 
>  	name="for-each-url" 
>  	xmlns:p="http://www.w3.org/ns/xproc" 
>  	xmlns:c="http://www.w3.org/ns/xproc-step"
>  	xmlns:cx="http://xmlcalabash.com/ns/extensions"
>  	xmlns:local="#empty"
>  	xmlns="#empty"
>  	version="1.0">
>  	
>  	
>  	<p:input port="source">
>  		<p:inline exclude-inline-prefixes="c">
>  			<links 
>  				xmlns="#empty"		
>  				xml:base="http://us.battle.net/sc2/en/"
>  			>
>  				<link title="general"		href="forum/40568/" />
>  				<link title="wol-campaign"	href="forum/13432/" />
>  				<link title="terran"		href="forum/13433/" />
>  				<link title="protoss"		href="forum/13434/" />
>  				<link title="zerg"			href="forum/13435/" />
>  				<link title="multiplayer-and-esports"	href="forum/13436/" />
>  				<link title="custom-maps"	href="forum/13437/" />
>  				<link title="blizzcon"		href="forum/692681/" />
>  			</links>
>  		</p:inline>
>  	</p:input>
>  	<p:output port="result" >
>  		<p:pipe step="store-results" port="result" />
>  	</p:output>
>  		
>  	<p:make-absolute-uris match="/links/link/@href" />
>  	
> 
>  	<p:for-each>
>  		<p:iteration-source select="/links/link" />
>  		
>  		<p:variable name="uri" select="/link/@href" />
>  		
>  		<p:http-request>
>  			<p:input port="source">
>  				<p:inline>
>  					<c:request 
>  						method="GET" 
>  						detailed="true"
>  						>
> 						<!-- for some reason Calabash complains that @href is not set for the c:request -->
> 						<p:with-option name="href" select="$uri" /> 
>  					</c:request>
>  				</p:inline>
>  			</p:input>
>  		</p:http-request>
>  	</p:for-each>
>  	
> 
>  	<p:wrap-sequence wrapper="c:results" />
>  	
>  	
>  	<p:store 
>  		name="store-results"
>  		href="./results/results.xml"  
>  		encoding="UTF-8" 
>  		indent="true" 
>  	/>
>  		
>  	
> </p:declare-step>
> 
> 
Received on Tuesday, 9 November 2010 23:52:30 GMT
