Re: Call for Editors!

On Mar 20, 2014, at 10:53 AM, Ivan Herman <ivan@w3.org> wrote:

> Sorry if I sound like a broken record, but I would really like to see and understand the CSV->RDF use cases, also in terms of the people who are likely to use them. Both CSV-LD and R2RML-CSV come with a learning curve. The question is which of the two is steeper for the envisaged user base.

The CSV-LD use case involves the following steps:
* First, construct a CSV-LD mapping frame (basically a JSON-LD context that maps the column headings, used within the body of the JSON-LD as described in the Wiki).
* Then use the mapping frame along with the CSV to generate a JSON-LD document.
* For other JSON-LD expressions, run additional framing steps.
* For RDF, run the JSON-LD to RDF conversion steps.
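
As a rough illustration of the first two steps, here is a minimal Python sketch. The frame layout is an assumption made for this example only: values of the form "{Heading}" stand for the CSV column with that heading. It is not the syntax described in the Wiki, and the names FRAME, fill and csv_to_jsonld are hypothetical.

```python
import csv
import io
import json

# Hypothetical mapping frame: "@context" maps terms to IRIs, and string
# values such as "{Name}" are placeholders filled from the CSV column
# with that heading. This layout is assumed for illustration only.
FRAME = {
    "@context": {
        "name": "http://xmlns.com/foaf/0.1/name",
        "homepage": {"@id": "http://xmlns.com/foaf/0.1/homepage", "@type": "@id"},
    },
    "@id": "{id}",
    "name": "{Name}",
    "homepage": "{Homepage}",
}


def fill(template, row):
    """Recursively substitute "{Heading}" placeholders with values from one CSV row."""
    if isinstance(template, dict):
        return {key: fill(value, row) for key, value in template.items()}
    if isinstance(template, list):
        return [fill(value, row) for value in template]
    if isinstance(template, str):
        return template.format_map(row)
    return template


def csv_to_jsonld(csv_text, frame):
    """Apply the frame body to every record and collect the nodes under @graph."""
    body = {key: value for key, value in frame.items() if key != "@context"}
    rows = csv.DictReader(io.StringIO(csv_text))
    return {"@context": frame["@context"], "@graph": [fill(body, row) for row in rows]}


if __name__ == "__main__":
    sample = "id,Name,Homepage\nhttp://example.org/gregg,Gregg,http://greggkellogg.net/\n"
    print(json.dumps(csv_to_jsonld(sample, FRAME), indent=2))
```

The result is an ordinary JSON-LD document, so the framing and RDF conversion steps could be left to any existing JSON-LD processor.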

For extra credit, when generating RDF, we can collapse these steps so that RDF is streamed out as the CSV is processed, by running the JSON-LD to RDF algorithm on each record as it is mapped to JSON-LD.
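
A sketch of what the streamed variant might look like, in Python. This is a simplified stand-in and not the JSON-LD to RDF algorithm: it emits one N-Triples line per non-empty field and treats every value as a plain string. The column-to-predicate dictionary and the names BNodeIssuer and stream_ntriples are hypothetical. The one point it tries to show is that the blank node label map lives outside the per-record loop, so the same local identifier gets the same label in every record.

```python
import csv
import io


class BNodeIssuer:
    """Hands out blank node labels and remembers them across records."""

    def __init__(self):
        self._labels = {}

    def label(self, key):
        # Mint a new label only the first time a local identifier is seen.
        if key not in self._labels:
            self._labels[key] = "_:b%d" % len(self._labels)
        return self._labels[key]


def literal(value):
    """Serialize a string as an N-Triples literal with basic escaping."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    return '"%s"' % escaped


def stream_ntriples(csv_file, subject_column, predicates, issuer):
    """Yield N-Triples lines one record at a time, never holding the whole CSV.

    predicates maps a column heading to a predicate IRI (an assumed,
    much-reduced substitute for a real mapping frame).
    """
    for row in csv.DictReader(csv_file):
        subject = issuer.label(row[subject_column])
        for column, predicate in predicates.items():
            value = row.get(column)
            if value:
                yield "%s <%s> %s ." % (subject, predicate, literal(value))


if __name__ == "__main__":
    data = io.StringIO("key,Name\np1,Gregg\np2,Ivan\np1,Gregg Kellogg\n")
    for line in stream_ntriples(data, "key", {"Name": "http://xmlns.com/foaf/0.1/name"}, BNodeIssuer()):
        print(line)
```

Because the function is a generator, each line can be written out as soon as its record has been read.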

The Direct Mapping use case can be handled by recognizing the absence of a CSV-LD mapping frame and generating one based on the column headers, treating each field as a string and issuing a sequence of unnamed (or row-fragment-identifier-named) JSON-LD nodes, one for each record.
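
To make the default case concrete, here is a small Python sketch under several assumptions of mine: property IRIs are formed as the CSV's base IRI plus "#" plus the percent-encoded heading, row names use a "#row=N" fragment, and the header line is counted as row 1. None of these conventions has been decided in this thread, and the function names are hypothetical.

```python
import csv
import io
from urllib.parse import quote


def default_frame(headers, base):
    """Build a default context: each heading becomes a term for an IRI under base."""
    return {"@context": {h: base + "#" + quote(h, safe="") for h in headers}}


def direct_map(csv_text, base, name_rows=False):
    """Map a CSV with no mapping frame to a JSON-LD document.

    Every field is kept as a plain string. Nodes are unnamed (blank nodes)
    unless name_rows is set, in which case each gets a row-fragment @id.
    """
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader)
    doc = default_frame(headers, base)
    graph = []
    # Assumption: the header line is row 1, so data records start at row 2.
    for number, record in enumerate(reader, start=2):
        node = dict(zip(headers, record))
        if name_rows:
            node = {"@id": "%s#row=%d" % (base, number), **node}
        graph.append(node)
    doc["@graph"] = graph
    return doc
```

Calling direct_map(text, base) yields unnamed nodes, while direct_map(text, base, name_rows=True) names each node after its row.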

The CSV-LD use case is most appropriate for someone who is already familiar with JSON-LD and is looking to get CSV into that form. An understanding of JSON-LD is necessary to create a mapping frame, but is not required to process a CSV with a provided mapping frame.

Gregg

> (I do not have anything against either of the two, but we may have to make a choice at some point if we go down that route...)
> 
> Ivan
> 
> On 20 Mar 2014, at 18:47 , Gregg Kellogg <gregg@greggkellogg.net> wrote:
> 
>> On Mar 20, 2014, at 10:39 AM, Juan Sequeda <juanfederico@gmail.com> wrote:
>> 
>>> If there is going to be a CSV to RDF mapping, shouldn't it be relatively close (if not almost equal) to R2RML? I foresee users doing RDB2RDF mappings with R2RML and having a few (or many) CSV files that they would like to map to RDF too. They would want to continue using the same tool.
>>> 
>>> What we do is import the CSVs to a RDB, and then use R2RML. So as a user who needs to transform to RDF, I would want to have something almost equivalent to R2RML.
>> 
>> This certainly is a valid use case. I was considering what the impact on developers using these tools might be. If there is a single tool (and spec) which handles the relevant use cases, then it might simplify the life of developers. Nothing against R2RML, and if that's the chain a developer's working with, the same logic would indicate that having to use something like CSV-LD would be a burden.
>> 
>> Gregg
>> 
>>> Juan Sequeda
>>> +1-575-SEQ-UEDA
>>> www.juansequeda.com
>>> 
>>> 
>>> On Thu, Mar 20, 2014 at 12:08 PM, Gregg Kellogg <gregg@greggkellogg.net> wrote:
>>> On Mar 20, 2014, at 9:52 AM, Andy Seaborne <andy@apache.org> wrote:
>>> 
>>>> On 20/03/14 15:34, Ivan Herman wrote:
>>>>> 
>>>>> On 20 Mar 2014, at 16:03 , Juan Sequeda <juanfederico@gmail.com> wrote:
>>>>> 
>>>>>> I would say yes :)
>>>>>> 
>>>>>> 1) Direct Mapping is completely automatic
>>>>>> 2) R2RML is manual.
>>>>> 
>>>>> Correct. The question for me is: do the use cases around justify the extra (non-trivial) effort of defining an R2RML-CSV? Remember that the definition of R2RML took over two years:-(
>>>> 
>>>> Caution warranted - it needs to be scoped downwards.  I hope (but can not prove) that the CSV mapping is less of a mountain.
>>>> 
>>>> CSV-LD is an R2RML(-ish) mapping.  Gregg's already started, so not 2 years :-)
>>> 
>>> Yes, CSV-LD is much like R2RML, but I think we could complete a spec in fairly short order.
>>> 
>>> For the direct mapping, this could be a default mapping done by automatically constructing a context along the lines Andy had suggested, and could fall out of that spec as well.
>>> 
>>> One consideration is converting CSV files with a very large number of rows. The CSV-LD model would essentially create a document for each row, so converting to RDF could be streamed, but some provision for BNode identifiers would need to be made, so that if some value maps to a BNode, it is preserved across records and does not result in a new BNode being minted each time the same identifier appears. This isn't really a problem, but it would mean specifying an algorithm that extends the existing JSON-LD conversion algorithms so that the BNode identifier mapping persists across records and the conversion can be streamed.
>>> 
>>> Gregg
>>> 
>>>>      Andy
>>>> 
>>>>> 
>>>>> Ivan
>>>>> 
>>>>>> 
>>>>>> Direct Mapping bootstraps the R2RML.
>>>>>> 
>>>>>> Btw, I would be interested in participating in the CSV to RDF effort.
>>>>>> 
>>>>>> 
>>>>>> Juan Sequeda
>>>>>> +1-575-SEQ-UEDA
>>>>>> www.juansequeda.com
>>>>>> 
>>>>>> 
>>>>>> On Thu, Mar 20, 2014 at 6:44 AM, Andy Seaborne <andy@apache.org> wrote:
>>>>>> On 20/03/14 11:31, Ivan Herman wrote:
>>>>>> 
>>>>>> On 20 Mar 2014, at 11:40 , Andy Seaborne <andy@apache.org> wrote:
>>>>>> 
>>>>>> On 19/03/14 23:09, Jeni Tennison wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> Now that the first two of our documents are getting published as first public working drafts, we are moving on to the next stage of our work, namely looking at conversion from tabular data into other formats.
>>>>>> 
>>>>>> We have a wiki document here:
>>>>>> 
>>>>>>   https://www.w3.org/2013/csvw/wiki/Conversions
>>>>>> 
>>>>>> that describes in very broad terms what we need to do.
>>>>>> 
>>>>>> Specifically, we’re looking for volunteers to lead the efforts / edit four documents, specifying:
>>>>>> 
>>>>>>   * Conversion of CSV to RDF
>>>>>> 
>>>>>> RDF to RDF had two conversion documents.
>>>>>> 
>>>>>> I guess you meant RDB to RDF...
>>>>>> 
>>>>>> Yes.  Typo.  s/x42/x46/ -- only one bit out.
>>>>>> 
>>>>>>        Andy
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Ivan
>>>>>> 
>>>>>> 
>>>>>> (with no strong advocacy)
>>>>>> With hindsight, was it a good idea? Should we do the same?
>>>>>> 
>>>>>>   * Conversion of CSV to JSON and/or a browser API
>>>>>>   * Conversion of CSV to XML (possibly pending actually having a use case for this)
>>>>>>   * Conversion of CSV into a tabular data platform / framework / store (eg into a spreadsheet application or relational database or application like R)
>>>>>> 
>>>>>> Please step forward, by editing the wiki, to lead the work on one of these documents and/or volunteer to help someone else with the work that needs to go into it. Obviously everything will be discussed on the list, but lead editors are instrumental in framing those discussions.
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Jeni
>>>>>> --
>>>>>> Jeni Tennison
>>>>>> http://www.jenitennison.com/
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> ----
>>>>>> Ivan Herman, W3C
>>>>>> Digital Publishing Activity Lead
>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>> mobile: +31-641044153
>>>>>> GPG: 0x343F1A3D
>>>>>> FOAF: http://www.ivan-herman.net/foaf

Received on Thursday, 20 March 2014 23:48:40 UTC