Re: Spec review request: CSV on the Web

On 19 April 2015 at 09:06, Ivan Herman <ivan@w3.org> wrote:
>
>> On 19 Apr 2015, at 01:07 , ashok malhotra <ashok.malhotra@oracle.com> wrote:
>>
>> Shouldn't it be possible to create Relational tables from tabular data?
>> That is, after all, a popular use of tabular data.
>> There are probably existing tools and standards to do this but I would
>> think it was worth at least a mention.
>>
>
> Hi Ashok,
>
> I am not sure what you mean. At the moment, the group has produced (apart from the specification of the generic metadata for CSV files) a specification to convert a CSV file into simple JSON and into RDF. Do you mean to specify converting a CSV file into an RDB Table, essentially, to produce a Database Schema? I certainly see that this may be useful but (a) you seem to suggest that such tools and standards already exist for this, (b) the group certainly does not have the right expertise to do that (let alone not being chartered for it:-).
>
> What do you think is possible and worth doing under these circumstances? It is really not clear to me...

Perhaps Ashok is alluding to the RDB2RDF efforts that he co-chaired,
and that our charter refers to, i.e.
http://www.w3.org/2013/05/lcsv-charter "The output of the mapping
mechanism for RDF MUST be consistent with either the RDF Direct
Mapping or R2RML so that if a table from a relational database is
exported as CSV and then mapped it produces semantically identical
data." The draft at
https://github.com/w3c/csvw/wiki/Deviations-from-the-charter and re
direct mappings https://github.com/w3c/csvw/issues/455 are pertinent,
articulating how the CSVW csv2rdf work corresponds to a basic RDB2RDF
direct mapping and also explaining why we didn't try to push the idea of
using R2RML further, although we left in appropriate extensibility
hooks.
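
To give a flavour of what "basic direct mapping" means here, a toy
sketch in Python follows. It is only an illustration, not the csv2rdf
algorithm itself: the function name rows_to_triples, the example URL,
the use of one blank node per row, and the predicate naming (table URL
plus "#" plus the column name) are all simplifying assumptions made
for this example.

```python
import csv
import io
from urllib.parse import quote


def rows_to_triples(csv_text, table_url):
    """Yield (subject, predicate, value) tuples, one per non-empty cell.

    Hypothetical helper: every row becomes a blank node and every
    column becomes a predicate derived from the table URL.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        subject = f"_:row{row_number}"  # assumption: one blank node per row
        for column, value in row.items():
            if column is None or not value:
                continue  # skip surplus fields and empty cells
            predicate = f"<{table_url}#{quote(column, safe='')}>"
            yield subject, predicate, value


if __name__ == "__main__":
    sample = "id,name\n1,Alice\n2,\n"
    for triple in rows_to_triples(sample, "http://example.org/people.csv"):
        print(triple)
```

Anything richer than this (typed literals, stable subject URIs, links
between tables) is where the metadata, or an R2RML-style mapping,
would come in.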

There is btw a rough experiment at
https://github.com/w3c/csvw/blob/gh-pages/examples/tests/scenarios/chinook/attempts/attempt-1/chinook.rml.ttl
(and nearby) that shows one way of applying a nonstandardized variant
of R2RML ("RML") to this problem, mapping in that case several linked
relational tables (serialized to CSV) into RDF. The Wiki link above
explains why we didn't feel this line of enquiry was ready for the
REC-track, although the approach we have taken puts in place several
important pieces that would be necessary for R2RML-based mapping
approaches to be fruitfully explored.

Although conceptually a bundle of CSVs can seem very much like a
relational database, the workflows, skills, incentives and tooling
surrounding their publication can vary significantly from the RDB
world. I believe we've found a reasonable and attractive tradeoff
between simplicity and expressivity: the CSVW design maps similarly
into both colloquial JSON and into reasonably expressive RDF, while
providing hooks for other mapping approaches, both textual (e.g.
Mustache-style) and semweb (more complex RDF triple patterns via
R2RML/RML), to be attached when the expertise and interest are there.

But perhaps I am over-interpreting Ashok's "Shouldn't it be possible
to create Relational tables from tabular data?" question. Another
angle is to note that
http://www.w3.org/TR/tabular-data-model/#datatypes +
http://www.w3.org/TR/tabular-data-model/#dfn-table-foreign-keys are
probably the most important references to consider w.r.t. RDB/SQL
dumps, and that it might be useful to have an "at a glance" paragraph
(in our Wiki or blog if not in specs) for SQL-oriented readers that
could help encourage people with the appropriate expertise to create
tools that turn CSVs + W3C CSVW metadata back into e.g.
MySQL/Postgres/Oracle/etc. SQL database creation scripts. Perhaps
Gregg's recent post at
http://greggkellogg.net/2015/04/implementing-csv-on-the-web/ even does
this, i.e. a Postgres expert reading that blog post and building on
top of https://github.com/ruby-rdf/rdf-tabular might find it trivial
to write import/export tools.
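
As a rough idea of how small such a tool could start out, here is a
Python sketch that turns one table description from CSVW metadata into
a CREATE TABLE statement. It is a sketch under stated assumptions, not
a reference implementation: the SQL type mapping, the rule for deriving
a table name from the CSV file name, and the albums/artists example
data are all invented for illustration, and only foreign keys that
point at a "resource" are handled.

```python
# Assumed mapping from CSVW datatype names to generic SQL types.
SQL_TYPES = {
    "string": "TEXT",
    "integer": "INTEGER",
    "decimal": "NUMERIC",
    "double": "DOUBLE PRECISION",
    "boolean": "BOOLEAN",
    "date": "DATE",
    "dateTime": "TIMESTAMP",
}


def as_list(value):
    """The metadata allows a single name or a list of names."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def table_name(url):
    """Hypothetical convention: CSV file name without its extension."""
    return url.rsplit("/", 1)[-1].rsplit(".", 1)[0]


def ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def ident_list(names):
    return ", ".join(ident(n) for n in names)


def create_table_sql(table):
    schema = table["tableSchema"]
    lines = []
    for col in schema["columns"]:
        datatype = col.get("datatype", "string")
        if isinstance(datatype, dict):  # derived datatype: use its base
            datatype = datatype.get("base", "string")
        sql_type = SQL_TYPES.get(datatype, "TEXT")  # unknown types fall back to TEXT
        not_null = " NOT NULL" if col.get("required") else ""
        lines.append(f"{ident(col['name'])} {sql_type}{not_null}")
    primary_key = as_list(schema.get("primaryKey"))
    if primary_key:
        lines.append(f"PRIMARY KEY ({ident_list(primary_key)})")
    for fk in schema.get("foreignKeys", []):
        ref = fk["reference"]
        lines.append(
            "FOREIGN KEY (%s) REFERENCES %s (%s)" % (
                ident_list(as_list(fk["columnReference"])),
                ident(table_name(ref["resource"])),
                ident_list(as_list(ref["columnReference"])),
            )
        )
    body = ",\n  ".join(lines)
    return f"CREATE TABLE {ident(table_name(table['url']))} (\n  {body}\n);"


# Illustrative table description, not taken from any real dataset.
example = {
    "url": "albums.csv",
    "tableSchema": {
        "columns": [
            {"name": "AlbumId", "datatype": "integer", "required": True},
            {"name": "Title", "datatype": "string"},
            {"name": "ArtistId", "datatype": "integer"},
        ],
        "primaryKey": "AlbumId",
        "foreignKeys": [{
            "columnReference": "ArtistId",
            "reference": {"resource": "artists.csv",
                          "columnReference": "ArtistId"},
        }],
    },
}

if __name__ == "__main__":
    print(create_table_sql(example))
```

A real tool would also need to order the tables so that referenced
ones are created first, and to pick type names for a specific
database, which is exactly where SQL expertise would help.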

Dan

Received on Sunday, 19 April 2015 10:30:17 UTC