IRI Templates

As part of my work in the CSV WG, I've put forward the concept of CSV-LD [1]. As I've discussed before, the idea is to use something like a JSON-LD frame to map the column values of a CSV file into JSON-LD.

I discussed the idea of IRI Templates (really @id templates) on the mailing list [2]. The idea is that fields in a CSV may be used to identify entities, but they may not explicitly include an identifier. In some cases, it may take two columns to determine a unique identifier, for example when a database dump has a composite primary key.

The idea I had is that one or more column values might be used to fill in a template for an IRI or blank node identifier. This concept might be more generally useful for JSON-LD framing, so I wanted to get some reaction from this list. From the email:

[[[
I've been hand-waving around this, but one way to do this might be to extend the context definition to describe identifier templates:

{
 "region_id": {"@id": "_:{Sales Region}", "@type": "@idTemplate"}
}

I'm sure we can do much better, but the basic idea is that column values can be used within a template used to construct an IRI or BNode identifier, using some suitable rules. We could then use "region_id" in the frame, with the understanding that it will be expanded using the template defined in the context.

{
 "@id": "region_id",
 "@type": "ex:SalesRegion",
 "Sales Region": null,
 "ex:period": {
   "@type": "ex:SalesPeriod",
   "Quarter": null,
   "Sales": null
 }
}
]]]

The idea would be that if a term is of type @idTemplate, it could be used as a key or value (in this case, the value of @id), and it would be expanded using the values of other properties of the associated node ("Sales Region" here). Obviously, this would require some normalization as well, so that the resulting identifier is a legal IRI or blank node label. A more complete example would be the following:

{
  "@context": {
    "dc": "http://purl.org/dc/terms/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "ex": "http://example/",
    "Sales Region": "dc:title",
    "Quarter": "dc:title",
    "Sales": "rdf:value",
    "region_id": {"@id": "_:{Sales Region}", "@type": "@idTemplate"}
  },
  "@id": "region_id",
  "Sales Region": null,
  "ex:period": {
    "Quarter": null,
    "Sales": null
  }
}
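
To make the expansion and normalization step a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function name expand_id_template, the sample row values, and in particular the normalization rules (runs of non-alphanumeric characters become "_" in blank node labels, and values are percent-encoded in IRIs). None of this is defined by JSON-LD or CSV-LD.

```python
import re
from urllib.parse import quote

# Matches a {Column Name} placeholder in a template.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def expand_id_template(template, row):
    """Fill {Column Name} placeholders in template from the values in row.

    Hypothetical normalization: in a blank node label, each run of
    non-alphanumeric characters becomes one underscore; in an IRI,
    the value is percent-encoded.
    """
    is_bnode = template.startswith("_:")

    def substitute(match):
        value = str(row[match.group(1)]).strip()
        if is_bnode:
            return re.sub(r"[^A-Za-z0-9]+", "_", value)
        return quote(value, safe="")

    return PLACEHOLDER.sub(substitute, template)


# Made-up row; the column names follow the example above.
row = {"Sales Region": "North East", "Quarter": "Q1", "Sales": "1000"}

print(expand_id_template("_:{Sales Region}", row))
# A composite key built from two columns (hypothetical IRI layout).
print(expand_id_template("http://example/region/{Sales Region}/{Quarter}", row))
```

The second call shows the composite-key case, where two column values together determine the identifier.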

I suppose that filling in the template term would be part of compaction, and the @idTemplate would allow such a term to be used as the value of @id. This could presumably be done in a CSV-LD spec, but it might be more generally useful as part of JSON-LD Framing.
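
As a rough sketch of how filling in the frame might look for each CSV row, here is a small Python example. It is not a JSON-LD framing or compaction implementation: the fill helper, the ID_TEMPLATES table standing in for the context definition, and the sample CSV data are all hypothetical, and no normalization is applied to the substituted values.

```python
import csv
import io
import json
import re

# Sample data invented for illustration; column names match the frame above.
CSV_TEXT = """Sales Region,Quarter,Sales
North,Q1,10
North,Q2,15
South,Q1,7
"""

FRAME = {
    "@id": "region_id",
    "Sales Region": None,
    "ex:period": {"Quarter": None, "Sales": None},
}

# Stand-in for the @idTemplate terms declared in the context.
ID_TEMPLATES = {"region_id": "_:{Sales Region}"}


def fill(frame, row):
    """Return a copy of frame with nulls and template terms filled from row."""
    node = {}
    for key, value in frame.items():
        if isinstance(value, dict):
            node[key] = fill(value, row)
        elif value is None:
            # A null in the frame is replaced by the column of the same name.
            node[key] = row[key]
        elif key == "@id" and value in ID_TEMPLATES:
            # Substitute each {Column Name} placeholder with the row's value.
            node[key] = re.sub(
                r"\{([^{}]+)\}", lambda m: row[m.group(1)], ID_TEMPLATES[value]
            )
        else:
            node[key] = value
    return node


nodes = [fill(FRAME, row) for row in csv.DictReader(io.StringIO(CSV_TEXT))]
print(json.dumps(nodes, indent=2))
```

Rows with the same "Sales Region" value end up with the same @id, so they would describe the same node once the results are merged.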

Thoughts?

Gregg Kellogg
gregg@greggkellogg.net

[1] https://www.w3.org/2013/csvw/wiki/CSV-LD
[2] http://lists.w3.org/Archives/Public/public-csv-wg/2014Feb/0119.html

Received on Friday, 21 February 2014 01:51:12 UTC