Re: Enumerating RDF::Format

Hi Nick!

On May 13, 2012, at 4:33 AM, Nicholas Humfrey wrote:

> Hello,
> 
> I have been using RDF.rb gem as part of dbpedialite.org for a couple of
> years now. I opened an issue about enumerating the registered list of
> RDF::Formats:
> 
> https://github.com/bendiken/rdf/issues/16

Sorry, I wasn't aware of the outstanding issue. As you probably know, the active gem is maintained on my fork (http://github.com/gkellogg/rdf), and Arto hasn't been very responsive about keeping the two in sync.

> Time has passed and this still isn't possible, so I thought I would try and
> come up with a patch. Specifically, I want to iterate through the registered
> list of formats and get:
> 
> * The format name (eg 'RDF/XML' or 'N-Triples')
> * The default (most official) content type (eg 'text/turtle')
> * The default file suffix (eg .rdf or .trix)

I do something like this in my RDF Distiller (http://rdf.greggkellogg.net/distiller), where I need to present possible input and output formats using basically the following:

    RDF::Format.each.to_a.map(&:reader).compact.map(&:to_sym)

When loaded up with the linked data gem, this generates the following:

    [:ntriples, :nquads, :jsonld, :json, :microdata, :n3, :n3, :rdfa, :rdfa, :rdfa, :rdfa, :rdfa, :rdfxml, :trig, :trix, :turtle, :turtle] 

You can do the same thing with &:writer to get the list of available writers:

    [:ntriples, :nquads, :jsonld, :json, :n3, :n3, :rdfa, :rdfa, :rdfa, :rdfa, :rdfa, :rdfxml, :trig, :trix, :turtle, :turtle] 

(basically, the same, but without a microdata writer).
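The repeated symbols presumably appear because more than one format class maps to the same reader or writer. A minimal sketch of tidying the two lists, using the literal arrays shown above so that it runs without any gems (the variable names are just for illustration):

```ruby
# Symbols copied from the reader and writer lists shown above.
readers = [:ntriples, :nquads, :jsonld, :json, :microdata, :n3, :n3,
           :rdfa, :rdfa, :rdfa, :rdfa, :rdfa, :rdfxml, :trig, :trix,
           :turtle, :turtle]
writers = [:ntriples, :nquads, :jsonld, :json, :n3, :n3,
           :rdfa, :rdfa, :rdfa, :rdfa, :rdfa, :rdfxml, :trig, :trix,
           :turtle, :turtle]

# Array#uniq keeps the first occurrence of each symbol, so order is preserved.
unique_readers = readers.uniq
unique_writers = writers.uniq

# Formats that can be read but not written.
read_only = unique_readers - unique_writers
```

With the real gem loaded, appending .uniq to the one-liner above should have the same effect.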

In the case of the Distiller, it uses either content negotiation or the file extension to figure out the appropriate format to use. The sinatra-linkeddata gem can do this for you, or the sparql gem if you want to have a SPARQL endpoint too:

    require 'sinatra-respond_to'
    require 'sinatra-linkeddata'

    register Sinatra::RespondTo
    register Sinatra::LinkedData

This will then respond based on either the Accept header or the file extension and format the RDF::Queryable results using the appropriate writer.
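To make the dispatch idea concrete, here is a rough, gem-free sketch of choosing a format from the file extension or the Accept header. It is not how sinatra-respond_to is implemented. The lookup tables, the choose_format name, the Turtle default and the "extension wins over Accept" precedence are all assumptions made for illustration:

```ruby
# Hypothetical lookup tables; a real application would derive these from
# the registered formats rather than hard-coding them.
BY_EXTENSION = { 'rdf' => :rdfxml, 'ttl' => :turtle, 'jsonld' => :jsonld }.freeze
BY_CONTENT_TYPE = {
  'application/rdf+xml' => :rdfxml,
  'text/turtle'         => :turtle,
  'application/ld+json' => :jsonld
}.freeze

# Pick a format symbol: the file extension wins (an assumption), then the
# Accept header ordered by q-value, then the default.
def choose_format(path, accept_header, default = :turtle)
  ext = File.extname(path).delete_prefix('.')
  return BY_EXTENSION[ext] if BY_EXTENSION.key?(ext)

  # Parse "type;q=0.8" entries into [type, q] pairs; q defaults to 1.0.
  ranked = accept_header.to_s.split(',').map do |entry|
    type, *params = entry.strip.split(';').map(&:strip)
    q = params.map { |param| param[/\Aq=(.+)\z/, 1] }.compact.first
    [type, q ? q.to_f : 1.0]
  end

  # Highest q first; ties keep the order in which the client listed them.
  ranked = ranked.sort_by.with_index { |(_, q), index| [-q, index] }

  ranked.each do |type, q|
    next if q <= 0
    return BY_CONTENT_TYPE[type] if BY_CONTENT_TYPE.key?(type)
  end
  default
end
```

For example, choose_format('/thing.ttl', 'application/rdf+xml') picks Turtle because of the extension, while choose_format('/thing', 'text/turtle;q=0.2, application/ld+json') picks JSON-LD because of the q-values.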

Check out http://github.com/gkellogg/rdf-distiller and http://github.com/gkellogg/github-lod for some examples of doing this.

> This is used to generate a <link rel="alternate"> and hyperlinks to other
> formats in the HTML page.

You could do this with a variation of the previous Format.each clause:

    RDF::Format.file_extensions.keys

This will give you the file extensions of all loaded formats, which RespondTo should dispatch on.
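For the <link rel="alternate"> elements themselves, something along these lines might work. It is only a sketch: the ALTERNATES table, the helper names, and the convention of appending ".suffix" to the page URL are assumptions, and the two entries are hard-coded rather than read from the registered formats:

```ruby
# Hypothetical per-format data: display name, content type and file suffix.
ALTERNATES = [
  { name: 'RDF/XML', content_type: 'application/rdf+xml', suffix: 'rdf' },
  { name: 'Turtle',  content_type: 'text/turtle',         suffix: 'ttl' }
].freeze

HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Minimal attribute escaping so the sketch needs no extra libraries.
def escape_html(text)
  text.gsub(/[&<>"]/, HTML_ESCAPES)
end

# Build one <link rel="alternate"> element per format for the given page URL.
def alternate_links(base_url)
  ALTERNATES.map do |format|
    href  = escape_html("#{base_url}.#{format[:suffix]}")
    type  = escape_html(format[:content_type])
    title = escape_html(format[:name])
    %(<link rel="alternate" type="#{type}" href="#{href}" title="#{title}" />)
  end
end

puts alternate_links('http://example.org/things/1')
```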

Also, note that RDF::Reader.open() looks at several things to find an appropriate reader: the content type, the file extension, any explicitly specified format and, if necessary, content sniffing. It's pretty good, but could be improved further, particularly for HTML serializations.

> Unfortunately the internal data structures currently make this difficult and
> there is no name for a format stored (other than deriving it from the class
> name). For the time being I have decided to resort to storing it in my own
> data structure:
> http://github.com/njh/dbpedialite/blob/master/lib/formats.rb

Yes, the internal structures don't make this easy. I'd certainly entertain a reasonable patch that made this easier. Perhaps just exposing the code I show here as RDF::Format class methods would be useful.
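As a starting point, the one-liners could be wrapped up roughly like this. The FormatListing module and its method names are hypothetical, not part of RDF.rb. The helpers accept any enumerable of objects that respond to #reader and #writer, so they can be exercised without the gem:

```ruby
# Hypothetical helpers, not part of RDF.rb.
module FormatListing
  module_function

  # Unique symbols for the formats that have a reader.
  def reader_symbols(formats)
    formats.map(&:reader).compact.map(&:to_sym).uniq
  end

  # Unique symbols for the formats that have a writer.
  def writer_symbols(formats)
    formats.map(&:writer).compact.map(&:to_sym).uniq
  end
end
```

With the rdf gem loaded, FormatListing.reader_symbols(RDF::Format.each) should correspond to the reader list shown earlier, minus the duplicates.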

Gregg

> Is this a problem for anybody else?
> 
> 
> nick.
> 
> 

Received on Sunday, 13 May 2012 20:13:24 UTC