
Direct mapping for spreadsheets

From: Martynas Jusevičius <martynas@graphity.org>
Date: Fri, 27 Jul 2012 18:25:11 +0300
Message-ID: <CAE35Vmwx6bwy-kwCknxqVsY2vLfpsw8spkHY9gKtNerwToXWhg@mail.gmail.com>
To: Semantic Web <semantic-web@w3.org>

Hey all,

I have been looking into RDFizing spreadsheets (more specifically,
GRDDL on Microsoft's SpreadsheetML).

I know there are multiple tools, products, and mappings (such as
XLWrap, TopBraid Composer, Google Refine, etc.).
However, I need a generic mapping, and none of them seems to do the job.
I just need to lift the spreadsheet data to the RDF level, and from
there I will be able to map it to a higher-level vocabulary using
SPARQL CONSTRUCT queries (with or without user assistance).

The closest thing to what I'm thinking about is the RDB to RDF direct
mapping [1].
Obviously, spreadsheets do not have primary keys, column names, or
datatypes. They do, however, seem to be a more general case than the
relational one, and they still have table (worksheet) names as well
as row/column indices.

Trying to follow the direct mapping [1], I came up with this basic example:

  @base <http://foo.example/spreadsheet.xlsx> .

  <#sheet1/1> <#sheet1/A> "content of A1" .
  <#sheet1/1> <#sheet1/B> "content of B1" .
  <#sheet1/2> <#sheet1/A> "content of A2" .
  <#sheet1/2> <#sheet1/B> "content of B2" .

This has an issue with addressing resources within packages, which has
been widely discussed [2] but not solved, AFAIK.
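
To make the idea concrete, here is a rough Python sketch of how the
mapping above could be generated from an in-memory grid of cell values.
It is only an illustration under my own assumptions: the function names
(column_letters, direct_map, to_ntriples) are hypothetical, empty cells
are simply skipped, all values are emitted as plain literals, and the
fragment-based URIs carry the same package-addressing issue.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]


def column_letters(index: int) -> str:
    """Convert a 1-based column index to spreadsheet letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def direct_map(base: str, sheet: str,
               rows: List[List[Optional[str]]]) -> Iterator[Triple]:
    """Yield (subject, predicate, literal) triples for one worksheet.

    Assumption: rows become subjects, columns become predicates, and
    empty cells produce no triple.
    """
    for r, row in enumerate(rows, start=1):
        subject = f"{base}#{sheet}/{r}"
        for c, value in enumerate(row, start=1):
            if value is None or value == "":
                continue
            predicate = f"{base}#{sheet}/{column_letters(c)}"
            yield subject, predicate, value


def to_ntriples(triples: Iterator[Triple]) -> str:
    """Serialize triples as N-Triples lines with plain string literals."""
    lines = []
    for s, p, o in triples:
        escaped = (o.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    grid = [["content of A1", "content of B1"],
            ["content of A2", "content of B2"]]
    print(to_ntriples(direct_map("http://foo.example/spreadsheet.xlsx",
                                 "sheet1", grid)))
```

Run on the 2x2 grid, this should produce the same four statements as
the Turtle example, just with absolute URIs instead of @base.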

Has something like this already been attempted? I don't want to
reinvent the wheel.

[1] http://www.w3.org/TR/rdb-direct-mapping/
[2] http://lists.w3.org/Archives/Public/www-tag/2008Oct/0126.html

Martynas
graphity.org
Received on Friday, 27 July 2012 15:25:40 UTC
