
Re: PHP RDF fetching code

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Thu, 28 Jan 2010 02:08:14 +0000
To: Tom Heath <tom.heath@talis.com>, Ian Davis <lists@iandavis.com>
CC: "nathan@webr3.org" <nathan@webr3.org>, Mischa Tuffield <mmt04r@ecs.soton.ac.uk>, "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <EMEW3|4e9a192f2eed46e4d968f9d4fff9364cm0R28H02hg|ecs.soton.ac.uk|C786A20E.FC18%hg@ecs.soton.ac.uk>
On 27/01/2010 09:49, "Tom Heath" <tom.heath@talis.com> wrote:

> +1 for Moriarty, whether you're working with the Platform or not. Ian
> and the other contributors have done a great job - personally I'd
> start here before writing any new code.
Too true mate.

Now my next bit of pissing about.
Before writing it (if I can find the gumption).
Don't think this is in Moriarty, as the Talis Platform is, of course, well-behaved.

I run cURL, using an amended version of what was described before (as at the end of this message).

So now I need to deal with what comes back.
I actually hand it over to rapper, so would sort of like to know what the data is to improve the reliability by setting the rapper type parameter.
I am trying to avoid looking inside the file, although am happy to if someone can provide the code :-).
The Content-Type is unreliable; for example, it could be (and is likely to be) text/plain for a Turtle file that someone has put on a standard web server.
So it is the usual problem of messing about with extensions, modified by extra information from the Content-Type.
Of course we need to worry about the final URL (curl_getinfo($ch)['url'], or equivalently curl_getinfo($ch, CURLINFO_EFFECTIVE_URL)), possibly as well as the requesting URI, as either might be where there is an extension.
So perhaps something that sets the Content-Type in curl_getinfo($ch) as best it can?
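One way the extension-plus-Content-Type idea could be sketched is below. The function name guess_rapper_format and both lookup tables are hypothetical, not taken from Moriarty or PEAR. The format names are the ones rapper accepts for -i as far as I know (rdfxml, turtle, ntriples), and treating the N3 media types as turtle is an assumption that only holds for N3 that stays within the Turtle subset.

```php
<?php
// Hypothetical helper: choose a rapper input format from the Content-Type,
// falling back to the file extension of each candidate URL in turn.
function guess_rapper_format(?string $contentType, string ...$urls): ?string
{
    // Media types that say something specific about the serialisation.
    $byType = [
        'application/rdf+xml'  => 'rdfxml',
        'text/turtle'          => 'turtle',
        'application/x-turtle' => 'turtle',
        'application/turtle'   => 'turtle',
        'text/n3'              => 'turtle', // assumption: N3 within the Turtle subset
        'text/rdf+n3'          => 'turtle', // same assumption
    ];
    // Extensions commonly used for RDF files (illustrative, not exhaustive).
    $byExtension = [
        'rdf' => 'rdfxml',
        'owl' => 'rdfxml',
        'ttl' => 'turtle',
        'n3'  => 'turtle',
        'nt'  => 'ntriples',
    ];

    // Drop parameters such as "; charset=utf-8" and normalise case.
    $type = strtolower(trim(explode(';', (string) $contentType)[0]));
    if (isset($byType[$type])) {
        return $byType[$type];
    }

    // Generic types (text/plain, application/xml, ...) tell us little,
    // so look at the extension of each URL, in the order given.
    foreach ($urls as $url) {
        $path = parse_url($url, PHP_URL_PATH);
        if (!is_string($path)) {
            continue;
        }
        $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (isset($byExtension[$ext])) {
            return $byExtension[$ext];
        }
    }

    return null; // nothing conclusive; the caller decides what to do
}
```

With the fetching code at the end of this message, a call might look like guess_rapper_format($info['content_type'], $info['url'], $_REQUEST['uri']), passing the final URL first and the requesting URI second. A null result could be handed to rapper's own guessing (I believe -i guess does this) rather than treated as an error.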

Any offers? (Pretty please!)
And maybe we can feed back to Moriarty, PEAR, etc, unless already there and I missed it.

On another worry: if the requesting URI does a 302 to a new URI, which then does a 303, it looks an interesting challenge to capture the new URI as expected. I don't intend to do this at the moment, but if anyone has done that, ...
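A possible approach to the 302-then-303 case, sketched under some assumptions: if CURLOPT_HEADER is set to true alongside CURLOPT_FOLLOWLOCATION, cURL returns the header block of every response in the chain ahead of the body, and those blocks can be walked to see which URI answered with which status. The function name parse_redirect_chain is hypothetical, and the sketch assumes every Location value is an absolute URI (relative ones would need resolving against the current URL).

```php
<?php
// Hypothetical helper: turn the concatenated header blocks of a redirect
// chain into a list of hops: which URL answered, with what status, and
// where it pointed next.
function parse_redirect_chain(string $rawHeaders, string $startUrl): array
{
    $hops = [];
    $url = $startUrl;

    // Each response's headers end with a blank line.
    $blocks = preg_split('/\r?\n\r?\n/', trim($rawHeaders));
    foreach ($blocks as $block) {
        // Status line, e.g. "HTTP/1.1 303 See Other" or "HTTP/2 200".
        if (!preg_match('#^HTTP/\S+\s+(\d{3})#', $block, $m)) {
            continue;
        }
        $status = (int) $m[1];
        if ($status < 200) {
            continue; // skip interim responses such as 100 Continue
        }

        // Assumption: Location is an absolute URI.
        $location = null;
        if (preg_match('/^Location:\s*(.+?)\s*$/mi', $block, $lm)) {
            $location = $lm[1];
        }

        $hops[] = ['url' => $url, 'status' => $status, 'location' => $location];
        if ($location !== null) {
            $url = $location; // the next block belongs to this URL
        }
    }

    return $hops;
}
```

To use it, the headers would first be split off the response with something like substr($data, 0, curl_getinfo($ch, CURLINFO_HEADER_SIZE)), since that size covers all header blocks in the chain. The hop whose status is 303 then carries, in 'url', the URI that the 302 led to.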

Enjoy.
Hugh

PHP much preferred.

Fetching code:
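// Fetch the submitted URI, following redirects and asking for RDF serialisations.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $_REQUEST['uri']);
curl_setopt($ch, CURLOPT_USERAGENT, "http://void.rkbexplorer.com/ submission agent 1.0");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow 3xx redirects
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Accept: application/rdf+xml, text/n3, text/rdf+n3, text/turtle, application/x-turtle, application/turtle, text/plain"));
$data = curl_exec($ch);    // body as a string, or false on failure
$info = curl_getinfo($ch); // includes 'url' (final URL) and 'content_type'
if ($data === false) {
    error_log('Fetch failed: ' . curl_error($ch));
}
curl_close($ch);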

>
> My 2p worth :)
>
> Tom.
>
>
> 2010/1/26 Ian Davis <lists@iandavis.com>:
>> You may find something useful in my Moriarty project:
>>
>> http://code.google.com/p/moriarty/
>>
>> It's geared towards the Talis Platform but there is a lot of code in
>> there that has no dependencies on the platform, e.g.:
>>
>> http://code.google.com/p/moriarty/source/browse/trunk/httprequest.class.php
>>
>> some documentation for that class here:
>>
>> http://code.google.com/p/moriarty/wiki/HttpRequest
>>
>> Ian
>>
>
>
Received on Thursday, 28 January 2010 02:08:54 UTC
