Re: CSV2RDF redraft from Ivan Herman on 2014-03-26 (public-csv-wg@w3.org from March 2014)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 26 Mar 2014 18:44:54 +0100
To: Andy Seaborne <andy@apache.org>
Cc: Jeni Tennison <jeni@jenitennison.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-Id: <C2DF8822-FF0E-4166-992F-7831807E3C81@w3.org>

I think having a header (or not) is one thing; having metadata is another.

In other words, it would be possible to have a data set without any header, but with extra information in the metadata file. So the question may be: what is the minimum information that the metadata should contain to make any sort of meaningful conversion possible? I guess the answer is quite easy, but it also shows that there may be a very close relationship between the metadata and the conversion.

(Which puts CSV in a different situation than RDBs, hence we have to be careful not to adopt the RDB2RDF results too blindly.)
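
To make the question a bit more concrete, here is a rough sketch in Python of how column names might be picked in such a conversion. It is only an illustration under my own assumptions: the function names, the `column_names` parameter (standing in for whatever a metadata file would supply), the precedence of metadata over the header, and the `col_N` fallback are all hypothetical and not something the group has agreed on. It yields simple (row number, column name, value) tuples rather than real RDF.

```python
import csv
import io


def guess_value(text):
    """Guess a typed value from a cell: try int, then float, else keep the string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def rows_to_triples(csv_text, has_header=True, column_names=None):
    """Return (row_number, column_name, value) tuples, one per cell.

    column_names is a hypothetical stand-in for names coming from a
    metadata file; in this sketch it takes precedence over the header row.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows.pop(0) if has_header and rows else None
    triples = []
    for row_number, row in enumerate(rows, start=1):
        for index, cell in enumerate(row, start=1):
            if column_names and index <= len(column_names):
                name = column_names[index - 1]   # name from metadata
            elif header and index <= len(header):
                name = header[index - 1]         # name from header row
            else:
                name = f"col_{index}"            # nothing known about the column
            triples.append((row_number, name, guess_value(cell)))
    return triples
```

With neither a header nor `column_names`, all that is left is `col_1`, `col_2`, and so on, which is exactly the case where the reader has to bring their own interpretation.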

Ivan

On 26 Mar 2014, at 18:33 , Andy Seaborne <andy@apache.org> wrote:

> On 26/03/14 15:55, Jeni Tennison wrote:
>> Andy,
>> 
>> What about in the absence of headers (which aren’t in the core data model)?
> 
> Do we have examples of that?
> 
> I don't think that CSV files with neither headers nor annotation information are much use on the web.  To use the information, you need to know something.
> 
> Otherwise it's not "publishing", it's "data exchange" between agreeing parties.
> 
> The best you can do is "col_1", "col_2", ... (cf. http://shancarter.github.io/mr-data-converter/); then you have to add your own interpretation.
> 
> Should we include a header requirement, or at least a preference, in CDM?
> 
> 	Andy
> 
>> 
>> Jeni
>> 
>> ------------------------------------------------------
>> From: Andy Seaborne andy@apache.org
>> Reply: Andy Seaborne andy@apache.org
>> Date: 26 March 2014 at 14:58:40
>> To: CSV on the Web Working Group public-csv-wg@w3.org
>> Subject:  CSV2RDF redraft
>> 
>>> https://www.w3.org/2013/csvw/wiki/CSV2RDF
>>> 
>>> This is a conversion based on defining the triples produced, not the
>>> syntax used as output.
>>> 
>>> ------------
>>> Town,Population
>>> Southton,123000
>>> Northville,654000
>>> ------------
>>> 
>>> in the absence of any annotations (i.e. Core Data Model):
>>> 
>>> generates (if Turtle used - N-triples example in the wiki):
>>> 
>>> ------------
>>> @prefix : .
>>> @prefix csv: .
>>> 
>>> # Column information
>>> 
>>> csv:column [ csv:colName "Town" ;
>>> csv:colPredicate :Town ;
>>> csv:colIndex 1 ] ;
>>> csv:column [ csv:colName "Population" ;
>>> csv:colPredicate :Population ;
>>> csv:colIndex 2 ] ;
>>> .
>>> 
>>> # Data rows
>>> [ csv:row 1 ; :Town "Southton" ; :Population 123000 ] .
>>> [ csv:row 2 ; :Town "Northville" ; :Population 654000 ] .
>>> ------------
>>> 
>>> Population becomes a number by guessing from the data.
>>> 
>>> In that it uses one predicate per column, it is similar to CSV-LD in the
>>> absence of any @context.
>>> 
>>> If we can make the creation of the CSV-LD @context align to the minimal
>>> structure CSV2RDF uses, we will at least have a common baseline.
>>> 
>>> Gregg and I will discuss that as per the telecon.
>>> 
>>> Andy
>>> 
>>> 
>>> 
>>> 
>> 
>> --
>> Jeni Tennison
>> http://www.jenitennison.com/
>> 
> 
> 


----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
FOAF: http://www.ivan-herman.net/foaf

Received on Wednesday, 26 March 2014 17:45:23 UTC