- From: Dan Brickley <danbri@google.com>
- Date: Wed, 5 Feb 2014 11:57:05 +0000
- To: "public-csv-wg@w3.org" <public-csv-wg@w3.org>

I thought I'd send this around; these are the notes I presented last week. They are mostly a neutral general intro, but also a bit about why I'm here. I will work on some more specifically Google-oriented use cases with colleagues, but in brief the Google interest here is in making it easier for data consumers to understand CSV data, and in making it easier for publishers to use a familiar format rather than having to worry about converting their data files into some fancy new format.

Dan

I work on structured data at Google, particularly schema.org (a partnership with Bing, Yahoo, Yandex), and at Google on various ways of getting data into the Knowledge Graph. For a high-level schema.org perspective on tabular data, see the position paper to last year's Open Data on the Web W3C workshop:
<http://www.w3.org/2013/04/odw/papers>
<http://www.w3.org/2013/04/odw/odw13_submission_53.pdf>

A few brief observations about our work here.

1. The "comma" in CSV

The core focus here is on tabular data for the Web: metadata about tabular data. What shape it should take, what it should express, how it can be found, packaged and interpreted. In a way, naming this group after CSV is a little misleading; in practice tab-separated is equally important. But it serves as a marker: we are practically minded, and driven by a concern to know more about the millions of real tabular files in circulation that are most stereotypically expressed as CSV. But tab-separated is of course also in scope.

2. The relationship to RDF

There are two roles RDF might play:

- as data model (and syntax, e.g. JSON-LD) for the metadata about a table
- as a target data model that tabular data might be mapped into, i.e. tables to graphs

It is far from a given that RDF is a perfect match for either. As a co-chair I'll say that these decisions will need to be grounded in practical use cases, and ideally running sample code.
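
To make the second role ("tables to graphs") a little more concrete, here is a minimal sketch of how rows of a delimited file might be turned into subject/predicate/object triples. It is only an illustration, not a mapping the group has agreed on: the `http://example.org/` namespace, the `id` key column, the sample data and the function name `rows_to_triples` are all assumptions invented for this example.

```python
import csv
import io

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/"

# Made-up sample data; the first line is the header row.
SAMPLE = "id,name,colour\n1,apple,red\n2,banana,yellow\n"


def rows_to_triples(text, delimiter=","):
    """Map each non-key cell of a delimited table to one (s, p, o) triple.

    Assumes a header row and an 'id' column that identifies each row.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    triples = []
    for row in reader:
        subject = f"{BASE}row/{row['id']}"
        for column, value in row.items():
            if column == "id":
                continue  # the key column names the subject, not a property
            triples.append((subject, f"{BASE}col/{column}", value))
    return triples


if __name__ == "__main__":
    for triple in rows_to_triples(SAMPLE):
        print(triple)
```

The same function handles tab-separated input by passing `delimiter="\t"`, which reflects the point above that the delimiter is incidental to the tabular structure. Note that everything the table expresses through row and column order is lost in this naive mapping.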
As a Google engineer I'll say that we consider RDF's basic graph data model key in both areas, but are wary of trying to squeeze too much into the graph data model. Sometimes the best representation of a table is a table. We'll come back to this in good time.

3. This could be an endless task

Note that describing "tabular data on the Web" could be considered a larger problem than describing the structure of relational databases. We could be here forever. However, Jeni, Ivan and I have no such intention. We need to do something useful, quickly and pragmatically, that addresses real-world problems. To this end we will put a lot of focus on use cases and scenarios that come with actual CSV data, and then on analyzing those situations and datasets to pull out their common features.

Received on Wednesday, 5 February 2014 11:57:37 UTC