Re: RDF.rb and format discovery

Hi Hellekin,

On Wed, Jun 30, 2010 at 11:03 AM, Hellekin O. Wolf
<hellekin@cepheide.org> wrote:
> Hi,
>
> I was looking into supporting more formats for FOAFSSL-ruby, including
> the recently released rdf-rdfa and rdf-n3 gems.
>
> But what I found looks like hell:
>
>  - there doesn't seem to be a reliable way of discovering the FOAF
> file format,
>  - different formats will fail with different errors,
>  - when no format is given, RDF::Graph won't detect the right one (and
> gives unpredictable results)
>
> The original way of doing it in FOAFSSL-ruby is to try it, and
> fall back to a different format on failure.  It works, but it's so ugly
> my grand-mother died.  When I tried to add new formats, I had to find
> another solution.
>
> I went for the following (ugly) algorithm (now, my grand-mother is
> already dead):
>
>  1. look up the file extension in the given WebID
>  2. look up the Content-Type after an HTTP HEAD to the WebID
>  3. GET the file and identify it from its contents
>  4. fail if the format isn't known by now.
>
> That gives a pretty good image of a house of cards, if any.
>
> Any idea how to deal properly with auto-discovery of formats?

Perhaps you might want to take a look at the code in Rack::LinkedData
[1], which enumerates available RDF.rb serializers (enumerating
parsers, which you need for your use case, is done much the same way)
and implements server-side Linked Data content negotiation based on
that.

More broadly, though, what you need is a Linked Data client, and
RDF.rb does not (at least as yet) provide that. If you're expecting
`RDF::Graph.load(filename)` to automagically do HTTP content
negotiation for you, you're expecting too much from it. The only
criteria it uses to select a parser are the file extension and an
explicit format specifier, if one is given. It supports URLs only
incidentally, through Ruby's open-uri standard library, so it has
zero awareness of any pertinent HTTP headers.
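
To make that concrete, here is a minimal sketch of picking the format
yourself before handing the data to a parser. It uses only Ruby's
standard library and is not RDF.rb's own API: the method name
`guess_format`, both lookup tables, and the format symbols are
assumptions made for illustration, so map them to whatever your
installed reader gems actually register. It prefers the Content-Type
header when you have one and falls back to the file extension in the
URL, which is a design choice rather than a requirement.

```ruby
require 'uri'

# Hypothetical lookup tables (not part of RDF.rb); extend them to match
# the reader gems you have installed.
FORMATS_BY_CONTENT_TYPE = {
  'application/rdf+xml'   => :rdfxml,
  'text/turtle'           => :turtle,
  'text/n3'               => :n3,
  'text/rdf+n3'           => :n3,
  'application/xhtml+xml' => :rdfa,
  'text/html'             => :rdfa,
}.freeze

FORMATS_BY_EXTENSION = {
  '.rdf'   => :rdfxml,
  '.ttl'   => :turtle,
  '.n3'    => :n3,
  '.nt'    => :ntriples,
  '.html'  => :rdfa,
  '.xhtml' => :rdfa,
}.freeze

# Returns a format symbol, or nil if neither the Content-Type header
# nor the file extension is recognized.
def guess_format(url, content_type = nil)
  if content_type
    # Drop parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type.split(';').first.to_s.strip.downcase
    format = FORMATS_BY_CONTENT_TYPE[media_type]
    return format if format
  end
  # URI#path excludes the fragment, so "alice.rdf#me" yields ".rdf".
  extension = File.extname(URI.parse(url).path.to_s).downcase
  FORMATS_BY_EXTENSION[extension]
end
```

A nil result means you are down to sniffing the content or giving up.
Anything else can be passed along as the explicit format specifier.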

If you were to implement and contribute code for client-side Linked
Data content negotiation, that would be a welcome feature. But until
such a feature is available, you need to do it all manually; see
Rack::LinkedData's README for references to the relevant Linked Data
recommendations, e.g. [2], that contain details on the algorithm you
should use.
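
As a starting point, the client side of that negotiation might look
roughly like the following. This is a sketch using only Net::HTTP from
the standard library, not an existing RDF.rb feature: the names
`ACCEPTED_TYPES`, `accept_header`, `build_request` and `fetch` are
hypothetical, and the q-values are arbitrary preferences you would
tune to the parsers you actually have.

```ruby
require 'net/http'
require 'uri'

# Media types to ask for, with illustrative preference weights.
ACCEPTED_TYPES = {
  'application/rdf+xml'   => 1.0,
  'text/turtle'           => 0.9,
  'text/n3'               => 0.8,
  'application/xhtml+xml' => 0.5,
}.freeze

# Builds an Accept header value, most preferred type first.
def accept_header(types = ACCEPTED_TYPES)
  types.sort_by { |_, q| -q }
       .map { |type, q| q >= 1.0 ? type : "#{type};q=#{q}" }
       .join(', ')
end

# Builds a GET request carrying the Accept header, without sending it.
def build_request(url)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri.request_uri)
  request['Accept'] = accept_header
  [uri, request]
end

# Sends the request (needs network access) and returns the media type
# the server chose, along with the response body.
def fetch(url)
  uri, request = build_request(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  media_type = response['Content-Type'].to_s.split(';').first.to_s.strip.downcase
  [media_type, response.body]
end
```

Note that Net::HTTP does not follow redirects on its own, so a real
client would also have to handle the 303 responses that Linked Data
servers commonly send.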

-- 
Arto Bendiken | @bendiken

[1] http://github.com/datagraph/rack-linkeddata
[2] http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/

Received on Wednesday, 30 June 2010 11:14:08 UTC