
Re: simple weather observation example illustrating complex column mappings (ACTION-11)

From: Andy Seaborne <andy@apache.org>
Date: Thu, 03 Apr 2014 14:31:36 +0100
Message-ID: <533D62B8.305@apache.org>
To: "Tandy, Jeremy" <jeremy.tandy@metoffice.gov.uk>, "public-csv-wg@w3.org" <public-csv-wg@w3.org>
On 03/04/14 14:04, Tandy, Jeremy wrote:
> Some related thoughts about JSON and RDF conversions:
>
> - Conversion to JSON(sans-LD) would work with the mapping frame as
> defined by Gregg - there would simply be no @context section in the
> template

I think this is not a technology question but a perception question.

1/ Whether the appearance of @graph is acceptable.  A simple final 
step to produce some other JSON is possible.

2/ Whether JSON-LD processing is then a requirement to get JSON(sans-LD).
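As a sketch of that "simple final step" (the function name is mine, and I'm assuming the converter emits a single top-level object with an @graph array):

```python
import json

def strip_graph(jsonld_text):
    # Unwrap a top-level "@graph" array and drop "@context", so that
    # plain-JSON consumers never see the JSON-LD keywords.
    doc = json.loads(jsonld_text)
    nodes = doc.get("@graph", [doc])
    return [{k: v for k, v in n.items() if k != "@context"} for n in nodes]

doc = '{"@context": {"temp": "http://example.org/def/temp"}, "@graph": [{"@id": "obs1", "temp": 12.5}]}'
print(strip_graph(doc))  # [{'@id': 'obs1', 'temp': 12.5}]
```

Whether such a step is enough depends on how deeply nested the target JSON shape is; this only addresses the top-level wrapper.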

> - If you want ntriples, ttl or other RDF encoding one could extend
> the processing pipeline methodology to convert the JSON-LD to the
> requisite form as defined in the JSON-LD Processing Algorithms and
> API
> <http://www.w3.org/TR/json-ld-api/#rdf-serialization-deserialization-algorithms>

My local requirement is being able to produce RDF without a JSON-LD 
stack involved. There's nothing wrong with JSON-LD - but is it to be 
the only way?  I want it to work on larger-than-RAM data - AKA database 
loading.  CSV is streamable.
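A minimal sketch of what "streamable" buys you (the base URI and column-to-predicate mapping here are invented for illustration): each CSV row can be turned into triples and emitted immediately, so memory use stays constant however large the file is.

```python
import csv, io

def csv_to_ntriples(lines, base="http://example.org/obs/"):
    # Hypothetical row-at-a-time conversion: one subject per row, one
    # predicate per column. Nothing requires seeing the end of the
    # input, so this runs in constant memory.
    for i, row in enumerate(csv.DictReader(lines)):
        for col, value in row.items():
            yield f'<{base}{i}> <{base}def/{col}> "{value}" .'

data = io.StringIO("air_temp,wind_speed\n12.5,9.0\n11.2,8.6\n")
for triple in csv_to_ntriples(data):
    print(triple)
```

The same structure works for database loading: the generator can feed a bulk loader directly without ever materialising the whole file.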

To parse a JSON object, a parser needs to see the closing "}".  All the 
regular JSON processors I use, except one, will scan to the end for the 
"}", so reading the whole object - and at the outer level that's the 
entire JSON document.

The exception is the Jena JSON SPARQL result processor.  It has its 
own JSON parser (c.f. XML and SAX) and, if it has already seen the 
declarations of the result set, it will stream the rows.  These are 
two object members of the top-level JSON object.

There is nothing in JSON to require the declarations to come before 
the rows of the results, so the parser has to fall back to reading in 
the whole result set before producing results for the application.  
This is a big deal to some users.
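The ordering point is easy to demonstrate: JSON attaches no meaning to the order of object members, so a results-before-head document is exactly as valid as the usual head-first layout, and a streaming reader cannot count on the declarations arriving first.

```python
import json

head_first = '{"head": {"vars": ["x"]}, "results": {"bindings": []}}'
results_first = '{"results": {"bindings": []}, "head": {"vars": ["x"]}}'

# Both documents are equivalent to a JSON parser: member order is not
# significant, so "head" may legally appear after the row data.
assert json.loads(head_first) == json.loads(results_first)
print("member order carries no meaning")
```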

If we can define the outcome of conversion, up to some level of 
complexity, in terms of RDF triples, then there can be different tools 
to get there, and CSV-LD can be one way of doing it.  To define the 
outcome by the algorithms of CSV/JSON-LD is a barrier to people not 
wanting that stack.  The algorithms of JSON-LD can require 
whole-document processing.

This is something for the WG to decide soon.  I don't want to invest 
time and effort on spec'ing something that will be rejected on 
principle.  I realise that the WG as a group may wish to work on a 
single conversion approach to cover as many use cases as possible.  If 
it does decide that, then I'll work with whatever that spec proposes.

	Andy


>
>  Jeremy
>
>
>
>
> -----Original Message----- From: Tandy, Jeremy
> [mailto:jeremy.tandy@metoffice.gov.uk] Sent: 03 April 2014 13:14 To:
> Andy Seaborne; public-csv-wg@w3.org Subject: RE: simple weather
> observation example illustrating complex column mappings (ACTION-11)
>
> Is JSON-LD acceptable in place of a normal JSON encoding? Probably -
> thanks to the "zero-edit" capability of JSON-LD you can make JSON-LD
> look identical to JSON(sans-LD) ... even the @context reference can
> be done in an HTTP header.
>
> I hadn't intended to imply that the conversion was driven by OWL;
> only that one can supplement these complex cases where you want to
> annotate _every_ field in a column with the same information (e.g.
> unit of measurement) by defining local object properties with the
> necessary axioms.
>
> I like your idea of the processing pipeline ... I'll modify the
> example on GitHub to incorporate this string-formatting
> pre-processing step.
>
> I'll mark ACTION-11 as complete too.
>
> Jeremy
>
> -----Original Message----- From: Andy Seaborne
> [mailto:andy@apache.org] Sent: 03 April 2014 13:00 To:
> public-csv-wg@w3.org Subject: Re: simple weather observation example
> illustrating complex column mappings (ACTION-11)
>
> On 02/04/14 19:04, Tandy, Jeremy wrote:
>> All,
>>
>> (related action #11
>> <https://www.w3.org/2013/csvw/track/actions/11>)
>>
>> I've created an "Example" directory in the github repo
>> <https://github.com/w3c/csvw/tree/gh-pages/examples>, within which
>> I have placed the example requested by AndyS et al in today's
>> teleconference:
>>
>> simple-weather-observation
>> <https://github.com/w3c/csvw/blob/gh-pages/examples/simple-weather-observation.md>
>>
>> It provides:
>> - CSV example
>> - RDF encoding (in TTL)
>> - JSON-LD encoding (assuming my manual conversion is accurate)
>> - CSV-LD mapping frame (or at least my best guess)
>>
>> In the mapping frame I couldn't figure out how to construct the @id
>> for the weather observation instances as I wanted to use a
>> simplified form of the ISO 8601 date-time syntax used in the
>> "Date-time" column.
>>
>> Would be happy for folks to correct/amend what I've done :)
>>
>> AndyS / Greg - if this meets your need could you close the action?
>> (I left it in "pending review" state)
>>
>> Jeremy
>>
>
> Jeremy - thank you.
>
> It more than meets the action item for the RDF part at least.
>
> A question it raises for the JSON(sans-LD) conversion is whether the
> JSON-LD form is acceptable.  I have no insight there but it is
> something to test before going along a particular spec'ing path.
>
> I wasn't thinking that the conversion would necessarily be driven by
> OWL, leaving that for tools beyond/better than the core spec.  It is
> nice to be aware of the possibility.
>
> If there are certain common conversions of ISO 8601 syntax for the
> date-time, we can include those conversions.  This one drops certain
> legal URI chars (":" and "-"; timezone other than Z?).
>
> My feeling is that a real-world issue is that datetime strings are
> all too often not valid in the first place (systematically or not)
> and so the error handling is important.
>
> As CSV is a text format, string processing can be done before CSV2RDF
> conversion.  Clean-up is best done at that point as well.
>
> Seems like a processing pipeline model is emerging.
>
> Andy
>
>
>
Received on Thursday, 3 April 2014 13:32:08 UTC
