- From: Andy Seaborne <andy@apache.org>
- Date: Fri, 21 Feb 2014 11:04:11 +0000
- To: public-csv-wg@w3.org
On 21/02/14 10:17, Ivan Herman wrote:
>
> Markus Lanthaler wrote:
>> +CC public-csv-wg(-comments)
>>
>> On Friday, February 21, 2014 6:33 AM, David Booth wrote:
>>> Personally, I would prefer not to overload JSON-LD this way. This
>>> essentially amounts to a transformation rule. And although
>>> transformation rules are useful and needed, I would prefer to have them
>>> specified as a separate layer.
>>
>> I've raised an issue regarding IRI templates 2 years ago [1] and we
>> concluded back then to "not add any normative language relating to IRI
>> templates or other transformations".
>>
>> As is, framing is not really suitable for this as it expects the input to
>> already be valid JSON-LD. I think what you want is a generic mechanism to
>> map CSV to RDF. You can then easily serialize it (and frame it) in JSON-LD.
>>
>> I haven't followed the work in the CSV WG at all till now but it appears to
>> me that there exists already an (almost complete) solution that you could
>> leverage: R2RML [2] (I'm sure this has already been discussed). If you add a
>> way to reference a JSON-LD context or frame, you are quite close to what you
>> want to achieve I think:
>> CSV -> RDF -> JSON-LD
>
> Just to reflect on this: the current thinking was actually the opposite (but we
> are still early in the process).

"some current thinking"

CSV -> JSON-LD -> RDF puts a JSON-LD processor in the pipeline.

Personally, I think we should define the CSV -> RDF mapping against the
abstract data model and not tie it to a concrete syntax.

JSON-LD is then one choice of concrete syntax. It should be particularly
nice for CSV from spreadsheets.

N-Triples/N-Quads is another concrete syntax, and could be produced in a
streaming fashion over very large CSV files from database dumps with
limited machine resources.

	Andy

> Indeed, there is a need for a CSV->JSON
> transformation, too, for users who simply want to use the data directly, without
> going through RDF. Defining that JSON mapping by, essentially, defining a
> CSV->JSON-LD mapping, and relying on a separate @context to yield RDF if
> necessary seems to be an attractive proposition...
>
> Ivan
>
>>
>> Cheers,
>> Markus
>>
>> [1] https://github.com/json-ld/json-ld.org/issues/108
>> [2] http://www.w3.org/TR/2012/REC-r2rml-20120927/
>>
>> --
>> Markus Lanthaler
>> @markuslanthaler
>>
>>> On 02/20/2014 08:50 PM, Gregg Kellogg wrote:
>>>> As part of work on the CSV WG, I've put forward the concept of CSV-LD
>>>> [1]. As I've discussed before, the idea is to use something like a
>>>> JSON-LD frame to map column values in a CSV to turn it into JSON-LD.
>>>>
>>>> I discussed the idea of IRI Templates (really @id templates) on the
>>>> mailing list [2]. The idea is that fields in a CSV may be used to
>>>> identify entities, but they may not explicitly include an identifier.
>>>> In some cases, it may take two columns to determine a unique
>>>> identifier, for example when a database dump has a composite primary
>>>> key.
>>>>
>>>> The idea I had is that one or more column values might be used to
>>>> create a template for an IRI or Blank Node. This concept might be
>>>> more generally useful for JSON-LD framing, but I wanted to get some
>>>> reaction from this list. From the email:
>>>>
>>>> [[[ I've been hand-waving around this, but one way to do this might
>>>> be to extend the context definition to describe identifier
>>>> templates:
>>>>
>>>> { "region_id": {"@id": "_:{Sales Region}", "@type": "@idTemplate"} }
>>>>
>>>> I'm sure we can do much better, but the basic idea is that column
>>>> values can be used within a template used to construct an IRI or
>>>> BNode identifier, using some suitable rules. We could then use
>>>> "region_id" in the frame, with the understanding that it will be
>>>> expanded using the template defined in the context.
>>>>
>>>> { "@id": "region_id", "@type": "ex:SalesRegion", "Sales Region":
>>>> null, "ex:period": { "@type": "ex:SalesPeriod", "Quarter": null,
>>>> "Sales": null } } ]]]
>>>>
>>>> The idea would be that if a term is of type @idTemplate, it could be
>>>> used as a key or value (in this case, the value of @id), and it would
>>>> be processed based on other properties of the associated node ("Sales
>>>> Region" here). Obviously, this would require some normalization as
>>>> well, so that the result would be legal. A more complete example
>>>> would be the following:
>>>>
>>>> { "@context": { "dc": "http://purl.org/dc/terms/", "rdf":
>>>> "http://www.w3.org/1999/02/22-rdf-syntax-ns#", "ex":
>>>> "http://example/", "Sales Region": "dc:title", "Quarter":
>>>> "dc:title", "Sales": "rdf:value", "region_id": {"@id": "_:{Sales
>>>> Region}", "@type": "@idTemplate"} }, "@id": "region_id", "Sales
>>>> Region": null, "ex:period": { "Quarter": null, "Sales": null } }
>>>>
>>>> I suppose that filling in the template term would be part of
>>>> compaction, and the @idTemplate would allow such a term to be used as
>>>> the value of @id. This could presumably be done in a CSV-LD spec, but
>>>> it might be more generally useful as part of JSON-LD Framing.
>>>>
>>>> Thoughts?
>>>>
>>>> Gregg Kellogg gregg@greggkellogg.net
>>>>
>>>> [1] https://www.w3.org/2013/csvw/wiki/CSV-LD [2]
>>>> http://lists.w3.org/Archives/Public/public-csv-wg/2014Feb/0119.html
>>>>
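
To make the streaming N-Triples route and the quoted "_:{Sales Region}" template idea a little more concrete, here is a minimal Python sketch. Nothing in it is an agreed mapping: the `{Column Name}` placeholder syntax, the crude label normalization, the per-row `_:period.N` blank nodes, and the function names are all assumptions made for illustration. The property IRIs follow the example context quoted above (dc:title, rdf:value, ex:period).

```python
import csv
import io
import re
from typing import Dict, Iterator, TextIO

# Property IRIs taken from the example context in the quoted message.
DC_TITLE = "<http://purl.org/dc/terms/title>"
RDF_VALUE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#value>"
EX_PERIOD = "<http://example/period>"

# Hypothetical template syntax, copied from the "_:{Sales Region}" example.
REGION_TEMPLATE = "_:{Sales Region}"


def expand_bnode_template(template: str, row: Dict[str, str]) -> str:
    """Replace each {Column Name} placeholder with a normalized cell value."""
    def substitute(match):
        column = match.group(1)
        value = row[column]
        if not value:
            raise ValueError(f"empty value for column {column!r}")
        # Crude normalization (an assumption): keep ASCII letters and digits,
        # map everything else to "_". Distinct values can collide this way.
        return re.sub(r"[^A-Za-z0-9]", "_", value)
    return re.sub(r"\{([^}]+)\}", substitute, template)


def literal(value: str) -> str:
    """Quote a string as a plain N-Triples literal with basic escaping."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def csv_to_ntriples(source: TextIO) -> Iterator[str]:
    """Yield N-Triples lines one CSV row at a time (only one row in memory)."""
    for n, row in enumerate(csv.DictReader(source), start=1):
        region = expand_bnode_template(REGION_TEMPLATE, row)
        # One fresh blank node per row (assumption). The "." keeps it distinct
        # from template-derived labels, which never contain a dot.
        period = f"_:period.{n}"
        yield f"{region} {DC_TITLE} {literal(row['Sales Region'])} ."
        yield f"{region} {EX_PERIOD} {period} ."
        yield f"{period} {DC_TITLE} {literal(row['Quarter'])} ."
        yield f"{period} {RDF_VALUE} {literal(row['Sales'])} ."


sample = io.StringIO(
    "Sales Region,Quarter,Sales\nNorth East,Q1,10\nNorth East,Q2,15\n"
)
for line in csv_to_ntriples(sample):
    print(line)
```

Because the generator emits triples row by row, memory use stays flat regardless of file size, which is the property that matters for large database dumps. The same triples could be handed to any RDF library and serialized as JSON-LD when that syntax is the better fit.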
Received on Friday, 21 February 2014 11:04:41 UTC