Re: A draft outline for the CSV2RDF document

On 22/05/14 09:51, Christopher Gutteridge wrote:
> I've done quite a bit of work in this area using our home grown tool and
> this is fine IF you can prescribe the column headings; the catch is that
> there are terms which cause problems.
>
> You can have characters in a CSV heading which are not legal in a URI.
> The most obvious of which is \n .. you can have a valid CSV which
> contains a new line inside a cell.
>
> I've also been bitten by the fact that RDF+XML can't express predicates
> which have a decimal digit as the first character after the last [/#]
> eg.
>
> <http://example.org/resource/dataset1> <http://example.org/ns/5starRating> "3".
> To the best of my knowledge this will upset RDF+XML as you have to write
> <rdf:Description rdf:about="http://example.org/resource/dataset1">
>     <myns:5starRating xmlns:myns="http://example.org/ns/">3</myns:5starRating>
> </rdf:Description>
>
> and XML does not allow element names to start with 0-9. Sigh. That one
> really muddled me for a while.

:-|

> My point is that it will take some thought to take what's essentially a
> free text string and convert it to a part of a URI. I've also found that
> people who edit spreadsheets are pretty liberal about whitespace and
> capitalisation. My system turns the headers into camelCase, which covers
> some of these issues, but that may not work for all, especially
> non-latin headings.

Agreed - we do need a robust conversion from column name to URI in the 
case where no metadata is given.
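
A minimal sketch of what such a fallback conversion might look like, 
assuming camelCase folding as Christopher describes, percent-encoding 
for anything not legal in a URI, and a "_" prefix for a leading digit. 
The function name and the "_" prefix are illustrative assumptions, not 
something the group has agreed:

```python
import re
import unicodedata
from urllib.parse import quote


def header_to_local_name(header: str) -> str:
    """Turn a free-text CSV column heading into a URI-safe local name.

    Hypothetical helper for illustration only.
    """
    # Normalise, then split on runs of whitespace/punctuation. \w is
    # Unicode-aware, so non-Latin letters are kept as word characters
    # and embedded newlines are treated as separators.
    text = unicodedata.normalize("NFC", header)
    words = [w for w in re.split(r"[\W_]+", text) if w]
    if not words:
        return "_"  # assumption: placeholder for an empty heading

    # camelCase: lower-case the first letter, capitalise later words.
    first, rest = words[0], words[1:]
    name = first[0].lower() + first[1:]
    name += "".join(w[0].upper() + w[1:] for w in rest)

    # XML element names cannot start with a digit, which matters when
    # the predicate is serialised as RDF/XML.
    if name[0].isdigit():
        name = "_" + name

    # Percent-encode whatever is still not an unreserved URI character.
    return quote(name, safe="")


print(header_to_local_name("Star  Rating\n(out of 5)"))
print(header_to_local_name("5starRating"))
```

Percent-encoding keeps non-Latin headings usable, though it makes them 
unreadable; allowing IRIs instead would be a different design choice.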

The metadata is a place where an explicit predicate can be added to do 
better than the plain conversion:

e.g.
https://github.com/w3c/csvw/blob/gh-pages/examples/graph-templating.md

The other opportunity is to preprocess the CSV file, turning its original 
form into one suitable as input to the conversion ("5starRating" => 
"FiveStarRating").
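
As a rough sketch of that preprocessing step, assuming the header is the 
first row and the renames are supplied by hand (the function name and 
the rename table are hypothetical):

```python
import csv
import io


def rename_headers(src, dst, renames):
    """Copy CSV from src to dst, renaming headings found in `renames`."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to write
    # Headings without an entry in the table pass through unchanged.
    writer.writerow([renames.get(h, h) for h in header])
    # Data rows are copied as they are.
    writer.writerows(reader)


src = io.StringIO("5starRating,name\n3,dataset1\n")
dst = io.StringIO()
rename_headers(src, dst, {"5starRating": "FiveStarRating"})
print(dst.getvalue())
```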

	Andy

>
>
> On 22/05/2014 00:37, Gregg Kellogg wrote:
>> Gregg Kellogg
>> gregg@greggkellogg.net
>>
>> On May 19, 2014, at 7:14 AM, Andy Seaborne <andy@apache.org> wrote:
>>
>>> On 19/05/14 15:00, Ivan Herman wrote:
>>>>>> Generating a template, if none is provided, would keep the
>>>>>> user-template driven mechanism and the metadata-generated
>>>>>> template mechanism in step.  It would be clear that they aren't
>>>>>> alternatives, with (potentially) capabilities in the direct route
>>>>>> that are not in the template route.  You could get the generated
>>>>>> template and tweak it, for example.
>>>>>>
>>>> I would need an example to understand what you mean...
>>>>
>>> If the columns are "foo" and "bar" and no template is in the metadata
>>> then we define the process to be to create and use:
>>>
>>> -------------------------
>>> [
>>>    :foo "{foo}" .
>>>    :bar "{bar}" .
>>> ]
>>> -------------------------
>> +1; pretty much exactly what I came up with :)
>>
>> Gregg
>>
>>>     Andy
>>>
>>>
>>
>
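
For illustration, the default template quoted above for columns "foo" 
and "bar" could be generated along these lines. This is only a sketch: 
the function name is hypothetical, and it assumes the column names are 
already usable as local names after ":".

```python
def default_template(columns):
    """Build the default template text for the given column names."""
    lines = ["["]
    for col in columns:
        # One property per column; "{col}" is the placeholder that is
        # filled in from each row's cell value.
        lines.append(f'   :{col} "{{{col}}}" .')
    lines.append("]")
    return "\n".join(lines)


print(default_template(["foo", "bar"]))
```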

Received on Thursday, 22 May 2014 10:55:33 UTC