
Re: scientific publishing process (was Re: Cost and access)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Mon, 06 Oct 2014 14:03:03 -0400
Message-ID: <5432D957.3020300@openlinksw.com>
To: public-lod@w3.org
CC: "semantic-web@w3.org" <semantic-web@w3.org>

On 10/6/14 12:48 PM, Peter F. Patel-Schneider wrote:
> It's not hard to query PDFs with SPARQL.  All you have to do is 
> extract the metadata from the document and turn it into RDF, if 
> needed. Lots of programs extract and display this metadata already. 


Having had 200+ {some-non-rdf-doc}-to-RDF document transformers built 
under my direct guidance, I see several issues with your claim above:

1. The extractors are platform specific -- AWWW (Architecture of the 
World Wide Web) is about platform agnosticism (I don't want to mandate 
an OS for experiencing the power of Linked Open Data transformers / 
rdfizers)

2. It isn't solely about metadata -- there is also raw data inside these 
documents, confined to tables and paragraphs of sentences

3. If querying a PDF were even marginally simple, I would be 
demonstrating that with a SPARQL results URL in response to this post :-)
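
To make the gap concrete, here is a minimal sketch of only the "turn the 
metadata into RDF" step from the quoted claim. It assumes the metadata has 
already been pulled out of the PDF into a plain dict (that extraction is 
the platform-specific part and is not shown). The function names, the 
example.org IRI, and the choice of Dublin Core terms as the vocabulary are 
all illustrative assumptions, not anything either poster proposed.

```python
# Hypothetical sketch: serialize already-extracted PDF metadata as N-Triples.
# Assumption: each metadata key maps directly onto a Dublin Core term name.
DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(value: str) -> str:
    """Escape characters that may not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def metadata_to_ntriples(doc_iri: str, metadata: dict) -> str:
    """Emit one triple per metadata field, with the document IRI as subject."""
    lines = []
    for key, value in sorted(metadata.items()):
        literal = escape_literal(value)
        lines.append(f'<{doc_iri}> <{DCTERMS}{key}> "{literal}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up metadata values, for illustration only.
    meta = {"title": "Example Paper", "creator": "A. Author"}
    print(metadata_to_ntriples("http://example.org/paper.pdf", meta))
```

Note that this covers only document-level metadata; it does nothing for 
the tables and paragraphs mentioned in point 2.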

Possible != Simple and Productive.

We want to leverage the productivity and simplicity that AWWW brings to 
data representation, access, interaction, and integration.


Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this

Received on Monday, 6 October 2014 18:03:25 UTC
