
Re: CSV+ Direct Mapping candidate?

From: Richard Cyganiak <richard@cyganiak.de>
Date: Fri, 28 Feb 2014 19:44:13 +0000
Message-Id: <A5D2FF8A-BD6F-4234-A236-E9775BEC4886@cyganiak.de>
Cc: Niklas Lindström <lindstream@gmail.com>, Gregg Kellogg <gregg@greggkellogg.net>, "public-csv-wg@w3.org" <public-csv-wg@w3.org>
To: David Booth <david@dbooth.org>
David,

Yes, that's a perfect description of how Tarql works. One comment inline:

> On 28 Feb 2014, at 17:10, David Booth <david@dbooth.org> wrote:
> 
>> On 02/28/2014 10:15 AM, Richard Cyganiak wrote:
>>> On 26 Feb 2014, at 21:28, David Booth <david@dbooth.org> wrote:
>>> 
>>> [Excerpted from the public-linked-json@w3.org list]
>>> 
>>>> On 02/26/2014 12:07 PM, Niklas Lindström wrote:
>>>> Have you looked at TARQL [1]? [ . . . ] [1]:
>>>> https://github.com/cygri/tarql
>>> 
>>> Interesting!  It looks like a shortcut combination of an implied
>>> CSV Direct Mapping to RDF, followed by a SPARQL CONSTRUCT query to
>>> transform the native RDF model to the desired RDF model.
>> 
>> Not quite. There is no implied CSV Direct Mapping to RDF triples.
> 
> Right, directly mapped RDF triples are not materialized or specifically identified.  But what I meant was that tarql could be viewed as a shortcut over an implicit two-step process: instead of going in two steps from A to B to C, tarql takes a shortcut from A to C, where A is the CSV, B is direct-mapped RDF, and C is the transformed RDF produced by the CONSTRUCT clause.

The interesting question—and one that I myself cannot answer with absolute certainty—is whether there is any value in having B.

Tarql is designed to test a hypothesis: that no one is really interested in B, and that most users want to get to C as quickly as possible and while learning as few new concepts as possible. Tarql assumes that B is pretty much just useless triple soup, and that one really wants to skip that and get to properly modelled RDF, even if that requires some effort (the effort of writing a Tarql mapping).

So, does anyone need B? I'm not sure, but I think not. Nothing is won by going from A to B.
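
To make the A-to-C shortcut concrete, here is a minimal Python sketch of the idea. It is not Tarql's actual implementation, only an illustration under stated assumptions: the helper names csv_to_solutions and construct are hypothetical, triples are modelled as plain tuples of strings, and the ex:name triple is left out because the sample CSV has no Name column. Each CSV row becomes one solution (a set of variable bindings), and a CONSTRUCT-like template turns each solution into triples, without ever producing B.

```python
import csv
import io

# Sample data from the thread (companies.csv).
CSV_TEXT = """Stock_ticker,CIK,LEI
IBM,cik1,lei1
MSFT,cik2,lei2
"""

def csv_to_solutions(text):
    """Turn each CSV row into one solution: a dict of variable bindings."""
    reader = csv.DictReader(io.StringIO(text))
    solutions = []
    for rownum, row in enumerate(reader, start=1):
        binding = dict(row)
        # Assumption: mirrors the ?ROWNUM variable mentioned in the thread.
        binding["ROWNUM"] = rownum
        solutions.append(binding)
    return solutions

def construct(solutions):
    """Apply a CONSTRUCT-like template to every solution, yielding triples."""
    triples = []
    for s in solutions:
        # Plays the role of BIND (URI(CONCAT('companies/', ?Stock_ticker)) AS ?URI)
        uri = "companies/" + s["Stock_ticker"]
        triples.append((uri, "rdf:type", "ex:Organization"))
        triples.append((uri, "ex:CIK", s["CIK"]))
        triples.append((uri, "ex:LEI", s["LEI"]))
        triples.append((uri, "ex:ticker", s["Stock_ticker"]))
    return triples

solutions = csv_to_solutions(CSV_TEXT)
triples = construct(solutions)
for triple in triples:
    print(triple)
```

The only intermediate structure is the list of bindings, which is a table just like the CSV; no directly mapped triples exist at any point.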

Best,
Richard


> 
> For example, if companies.csv contains:
> 
>  Stock_ticker,CIK,LEI
>  IBM,cik1,lei1
>  MSFT,cik2,lei2
> 
> then, assuming the tarql --header option, the following example tarql query (taken from https://github.com/cygri/tarql )
> 
>  CONSTRUCT {
>    ?URI a ex:Organization;
>        ex:name ?NameWithLang;
>        ex:CIK ?CIK;
>        ex:LEI ?LEI;
>        ex:ticker ?Stock_ticker;
>  }
>  FROM <file:companies.csv>
>  WHERE {
>    BIND (URI(CONCAT('companies/', ?Stock_ticker)) AS ?URI)
>    BIND (STRLANG(?Name, "en") AS ?NameWithLang)
>  }
>  OFFSET 1
> 
> could be viewed as a shortcut equivalent to doing a direct mapping to RDF like this (following the style of the W3C relational Direct Mapping):
> 
>  <companies/ROWNUM=1>              # ROWNUM treated as primary key
>    <companies#ROWNUM> 1 ;          # ?ROWNUM is magic in tarql
>    <companies#Stock_ticker> "IBM" ;
>    <companies#CIK> "cik1" ;
>    <companies#LEI> "lei1" .
>  <companies/ROWNUM=2>
>    <companies#ROWNUM> 2 ;
>    <companies#Stock_ticker> "MSFT" ;
>    <companies#CIK> "cik2" ;
>    <companies#LEI> "lei2" .
> 
> followed by a regular SPARQL CONSTRUCT query like this:
> 
>  PREFIX ex: <companies#>
>  CONSTRUCT {
>    ?URI a ex:Organization;
>        ex:name ?NameWithLang;
>        ex:CIK ?CIK;
>        ex:LEI ?LEI;
>        ex:ticker ?Stock_ticker;
>  }
>  WHERE {
>    #####################################
>    _:r ex:ROWNUM ?ROWNUM ;             # This section is
>       ex:Stock_ticker ?Stock_ticker ;  # auto-generated
>       ex:CIK ?CIK ;                    #
>       ex:LEI ?LEI .                    #
>    #####################################
>    BIND (URI(CONCAT('companies/', ?Stock_ticker)) AS ?URI)
>    BIND (STRLANG(?Name, "en") AS ?NameWithLang)
>  }
> 
> tarql doesn't say anything about what the directly mapped RDF would look like, but by specifying the available bindings it does imply a certain information content that includes things like ?ROWNUM.  So although tarql isn't a ready-made candidate for a CSV+ Direct Mapping, I think it does give some good ideas and hints about what a CSV+ Direct Mapping might include, such as ?ROWNUM.
> 
> David
> 
>> 
>> Instead, the input CSV table is directly translated to a SPARQL
>> solution set, which is also a table. The solution set can then be
>> further manipulated using SPARQL filters, SPARQL bind, other SPARQL
>> constructs, and then finally turned into a set of triples using a
>> CONSTRUCT template.
>> 
>> Best, Richard
>> 
>> 
>> 
>>> Maybe the implied CSV Direct Mapping convention that it uses should
>>> be considered as a candidate for a CSV+ Direct Mapping.
>>> 
>>> David
>> 
>> 
>> 
>> 
Received on Friday, 28 February 2014 19:44:39 UTC
