Re: 303 vs. Content-Location / Was Re: .htaccess a major bottleneck to Semantic Web adoption

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Sun, 5 Jul 2009 22:47:54 +0100
To: Pierre-Antoine Champin <swlists-040405@champin.net>
CC: "public-lod@w3.org" <public-lod@w3.org>
Message-ID: <EMEW3|10693a6ee34440ac649e77e2978106ddl64Mm502hg|ecs.soton.ac.uk|C0EC%hg@ecs.soton.ac.uk>
Well done finding that.
Useful for people to know.
Glad you didn't find that the script failed when it was called :-)

On 05/07/2009 22:41, "Pierre-Antoine Champin" <swlists-040405@champin.net> wrote:

My bad! It seems that the MultiViews option is active on our Apache, so your
script was not even called :-/ It was Apache doing the trick on its own...

I just altered your script so that it maps 'voc1' to 'voc1_.ext' (notice
the '_') in order not to trigger MultiViews. Your script works as
expected, issuing a 303.
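
For illustration, here is a rough Python sketch of why the '_' rename keeps MultiViews out of the way. It assumes a deliberately simplified model of MultiViews (when the requested name does not exist, look for files called name.*); the function name and the file lists are hypothetical, and Apache's real negotiation does considerably more than this.

```python
import fnmatch


def multiviews_candidates(requested, directory_listing):
    """Return the files a MultiViews-style lookup would consider.

    Simplified assumption: if `requested` is not itself a file, any file
    matching `requested.*` is a candidate for negotiation.
    """
    if requested in directory_listing:
        return [requested]
    return sorted(fnmatch.filter(directory_listing, requested + ".*"))


# With voc1.html and voc1.rdf present, the server can answer 'voc1' itself.
print(multiviews_candidates("voc1", ["voc1.html", "voc1.rdf"]))

# With voc1_.html and voc1_.rdf, nothing matches 'voc1.*', so the request
# falls through to the 404 handler, which can then issue the 303.
print(multiviews_candidates("voc1", ["voc1_.html", "voc1_.rdf"]))
```

Under this model the second call finds no candidates, which would be consistent with the script being reached only after the rename.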


On 05/07/2009 21:38, Hugh Glaser wrote:
> On 05/07/2009 20:39, "Pierre-Antoine Champin"<swlists-040405@champin.net>  wrote:
>> Nice one :)
>> works fine for me (tested on http://liris.cnrs.fr/~pchampin/tmp/ns).
> Thanks - great to get the feedback.
>> An interesting thing, though, is that
>>     http://champin.net/tmp/ns/voc1
>> does *not* redirect with a 303 to voc1.html or voc1.rdf, but instead
>> returns a 200 with a "Content-Location" header field containing the
>> appropriate location (i.e. voc1.html or voc1.rdf). This therefore does
>> not comply with the letter of [HttpRange-14].
>> Note that I do believe that it is complying with the spirit of
>> [HttpRange-14], though. I interpret Content-Location as a kind of
>> shortcut-redirection, which spares the client and the server the burden
>> of a 2nd query whenever the server is able to produce the queried
>> content anyway. So it can be considered in a sense as an optimized 303
>> redirection. (I actually discussed that very matter a month ago with
>> Harry Halpin at ESWC).
>> So, is this actually a common practice in the LOD community? Does this
>> look like a good practice (it does to me)? If so, should HttpRange-14 be
>> amended to acknowledge that?
>>     pa
>> [HttpRange-14]
>> http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14
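
To make the "optimized 303" reading concrete, here is a minimal Python sketch of how a client might treat the two kinds of response alike. It encodes only the interpretation proposed above and is not something HttpRange-14 itself states; the function name described_by is hypothetical, and the header values mirror those seen in this thread.

```python
from urllib.parse import urljoin


def described_by(request_uri, status, headers):
    """Return the URI of the document describing `request_uri`, or None.

    Assumption: a 200 carrying Content-Location is read as a shortcut for
    a 303 whose Location would have held the same value.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    if status == 303 and "location" in lowered:
        return urljoin(request_uri, lowered["location"])
    if status == 200 and "content-location" in lowered:
        # Content-Location may be relative, so resolve it against the request.
        return urljoin(request_uri, lowered["content-location"])
    return None


print(described_by(
    "http://users.ecs.soton.ac.uk/hg/pierre-test/voc1", 303,
    {"Location": "http://users.ecs.soton.ac.uk/hg/pierre-test/voc1.rdf"}))
print(described_by(
    "http://liris.cnrs.fr/~pchampin/tmp/ns/voc1", 200,
    {"Content-Location": "voc1.rdf"}))
```

A client written this way ends up with a document URI distinct from the vocabulary URI in both cases, which is the sense in which the two responses could be considered equivalent.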
> Interesting indeed.
> Is it possible that your web server is doing the shortcut?
> Here are the two curls I have just done; one to a newly-created dir on my department server, and the other to your files.
> Mine does a 303 and yours delivers a 200 straight off:
> [rover:~] hg% curl -i -H "Accept: application/rdf+xml" http://users.ecs.soton.ac.uk/hg/pierre-test/voc1
> HTTP/1.1 303 See Other
> Date: Sun, 05 Jul 2009 21:38:25 GMT
> Server: Apache/2.2.6 (Unix) DAV/2 PHP/5.2.5 mod_perl/2.0.3 Perl/v5.8.8
> X-Powered-By: PHP/5.2.5
> Location: http://users.ecs.soton.ac.uk/hg/pierre-test/voc1.rdf
> Content-Length: 0
> Content-Type: text/html; charset=UTF-8
> [rover:~] hg% curl -i -H "Accept: application/rdf+xml" http://liris.cnrs.fr/~pchampin/tmp/ns/voc1
> HTTP/1.1 200 OK
> Date: Sun, 05 Jul 2009 20:33:36 GMT
> Server: Apache/2.0.54 (Unix) mod_ssl/2.0.54 OpenSSL/0.9.7g DAV/2 PHP/5.2.9 SVN/1.1.4 mod_wsgi/2.0c4 Python/2.5.2
> Content-Location: voc1.rdf
> Vary: negotiate,accept,accept-charset
> TCN: choice
> Last-Modified: Sun, 05 Jul 2009 18:58:18 GMT
> ETag: "8a4064-1b8-f8f60680;a114c40"
> Accept-Ranges: bytes
> Content-Length: 440
> Content-Type: application/rdf+xml
> <rdf:RDF
> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
> xmlns:owl="http://www.w3.org/2002/07/owl#"
> xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
> xmlns:foaf="http://xmlns.com/foaf/0.1/"
> xmlns:dc="http://purl.org/dc/terms/">
> <rdf:Description rdf:about="">
>    <rdfs:label>An RDF representation</rdfs:label>
>    <rdfs:seeAlso rdf:resource="../easypub" />
> </rdf:Description>
> </rdf:RDF>
> Best
> Hugh
>> On 05/07/2009 16:16, Hugh Glaser wrote:
>>> OK, I'll have a go :-)
>>> Why did I think this would be fun to do on a sunny Sunday morning that has
>>> turned into afternoon?
>>> Here are the instructions:
>>>    1.  Create a web-accessible directory, let's say foobar, with all your
>>> .rdf, .ttl, .ntriples and .html files in it.
>>>    2.  Copy lodpub.php and path.php into it.
>>>    3.  Access path.php from your web server.
>>>    4.  Follow the instructions it gives and paste the text it displays into .htaccess
>>>    5.  You can remove path.php if you like, it was only there to help you get
>>> the .htaccess right.
>>> That should be it.
>>> The above text and files are at
>>> http://www.rkbexplorer.com/blog/?p=11
>>> Of course, I expect that you can tell me all sorts of problems/better ways,
>>> but I am hoping it works for many.
>>> Some explanation:
>>> We use a different method, and I have tried to extract the essence, and keep
>>> the code very simple.
>>> We trap all 404 (File not Found) in the directory, and then any requests
>>> coming in for non-existent files will generate a 303 with an extension added,
>>> depending on the Accept header.
>>> Note that you probably need the leading "/" followed by the full path from
>>> the domain root; otherwise it will just print out the text "lodpub.php".
>>> (That is not what the Apache specs seem to say, but it is what seems to
>>> happen.)
>>> If you get "Additionally, a 404 Not Found error was encountered while trying
>>> to use an ErrorDocument to handle the request.", then it means that the web
>>> server is not finding your ErrorDocument.
>>> Put the file path.php in the same directory and point your browser at it -
>>> this will tell you what the path should be.
>>> Note that the httpd.conf (in /etc/httpd/conf) may not let you override, if
>>> your admins have tied things down really tight.
>>> Mine says:
>>>       AllowOverride All
>>> Finally, note that at the moment I think the Apache default does not put
>>> the correct MIME type on RDF files, but that is a separate issue, and it
>>> makes no difference to whether the 303 happens.
>>> Best
>>> Hugh
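
The 404-trapping idea described above can be sketched in Python roughly as follows. This is not the actual lodpub.php (which is not reproduced in this thread); the MIME-type table, the function names and the fallback to .html are all assumptions made for illustration.

```python
# Assumed mapping from media type to file extension; the real script may differ.
EXTENSIONS = {
    "application/rdf+xml": ".rdf",
    "text/turtle": ".ttl",
    "application/n-triples": ".ntriples",
    "text/html": ".html",
}


def pick_extension(accept_header, default=".html"):
    """Pick the extension for the known media type with the highest q-value."""
    best_ext, best_q = default, 0.0
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        ext = EXTENSIONS.get(media_type)
        if ext is not None and q > best_q:
            best_ext, best_q = ext, q
    return best_ext


def see_other(missing_uri, accept_header):
    """Build the 303 that a 404 handler would send for a non-existent file."""
    location = missing_uri + pick_extension(accept_header)
    return "303 See Other", [("Location", location)]


print(see_other("http://users.ecs.soton.ac.uk/hg/pierre-test/voc1",
                "application/rdf+xml"))
```

The handler only ever sees requests for names that do not exist as files, so existing .rdf and .html files are still served directly by the web server.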
>>> On 05/07/2009 01:52, "Pierre-Antoine Champin"<swlists-040405@champin.net>
>>> wrote:
>>>> On 03/07/2009 15:14, Danny Ayers wrote:
>>>>> 2009/7/2 Bill Roberts<bill@swirrl.com>:
>>>>>> I thought I'd give the .htaccess approach a try, to see what's involved in
>>>>>> actually setting it up.  I'm no expert on Apache, but I know the basics of
>>>>>> how it works, I've got full access to a web server and I can read the
>>>>>> online
>>>>>> Apache documentation as well as the next person.
>>>>> I've tried similar, even stuff using PURLs - incredibly difficult to
>>>>> get right. (My downtime overrides all, so I'm not even sure if I got
>>>>> it right in the end)
>>>>> I really think we need a (copy & paste) cheat sheet.
>>>>> Volunteers?
>>>> (raising my hand) :)*
>>>> Here is a quick python script that makes it easier (if not completely
>>>> immediate). It may still require a one-liner .htaccess, but one that (I
>>>> think) is authorized by most webmasters.
>>>> I guess a PHP version would not even require that .htaccess, but sorry,
>>>> I'm not fluent in PHP ;)
>>>> So, assuming you want to publish a vocabulary with an RDF and an HTML
>>>> description at http://example.com/mydir/myvoc, you need to:
>>>> 1. Make `myvoc` a directory at the place where your HTTP server will
>>>>       serve it at the desired URI.
>>>> 2. Copy the script in this directory as 'index.cgi' (or 'index.wsgi' if
>>>>       your server has WSGI support).
>>>> 3. In the same directory, put two files named 'index.html' and
>>>>       'index.rdf'
>>>> If it does not work now (it didn't for me), you have to tell your HTTP
>>>> server that the directory index is index.cgi. In Apache, this is done
>>>> by creating (if not present) a `.htaccess` file in the `myvoc`
>>>> directory, and adding the following line::
>>>>        DirectoryIndex index.cgi
>>>> (or `index.wsgi`, accordingly)
>>>> There are more docs in the script itself. I think the more recipes
>>>> (including for other httpds) we can provide with the script, the more
>>>> useful it will be. So feel free to propose other ones.
>>>>     enjoy
>>>>      pa
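
The script itself is not included in this archive, so the following is only a guess at the general shape of such an index.wsgi: a minimal WSGI application that redirects with a 303 to index.rdf or index.html depending on the Accept header. The negotiation rule (RDF only when explicitly requested) and the choice to redirect rather than serve the file directly are assumptions.

```python
from urllib.parse import urljoin
from wsgiref.util import request_uri


def choose_variant(accept_header):
    """Naive negotiation: RDF if explicitly requested, otherwise HTML."""
    if "application/rdf+xml" in accept_header.lower():
        return "index.rdf"
    return "index.html"


def application(environ, start_response):
    """WSGI entry point: answer every request with a 303 to a sibling file."""
    variant = choose_variant(environ.get("HTTP_ACCEPT", ""))
    # Resolve the file name against the current request to get an absolute URI.
    base = request_uri(environ, include_query=False)
    start_response("303 See Other", [
        ("Location", urljoin(base, variant)),
        ("Vary", "Accept"),
        ("Content-Length", "0"),
    ])
    return [b""]
```

Placed in the `myvoc` directory next to index.html and index.rdf, a script along these lines would depend on the DirectoryIndex line above to be invoked for the bare directory URI.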
Received on Sunday, 5 July 2009 21:49:03 UTC
