Re: New metadata draft from Gregg Kellogg on 2014-05-21 (public-csv-wg@w3.org from May 2014)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Wed, 21 May 2014 13:07:31 -0700
To: Jeni Tennison <jeni@jenitennison.com>
Cc: "rufus.pollock@okfn.org" <rufus.pollock@okfn.org>, public-csv-wg@w3.org
Message-Id: <7CB6AB4D-8D42-4D9E-807E-83F0495841FA@greggkellogg.net>

On May 21, 2014, at 10:02 AM, Jeni Tennison <jeni@jenitennison.com> wrote:

> Hi,
> 
> I’ve done some fairly substantial work on the metadata draft [1] to get the structure and content more towards where I think we want it to head, including trying to map the existing data package structures into something that makes (more) sense if we’re viewing the metadata documents as JSON-LD structures with a metadata vocabulary.
> 
> There’s still a lot of work to do (and loads of issues as you’ll see), but I think it’s a little more internally consistent now. Comments appreciated.

Some technical comments on the JSON-LD used within the document.

In ISSUE 1, you raise some questions within the JSON-LD context:

[[something here that maps the publisher onto an appropriate schema.org type?]]

Note that a JSON-LD context doesn't really describe data ranges beyond those for literals. So you can map "publisher" to dc:publisher and say that its value is expected to be an IRI or BNode by default, but you can't say that the resource identified by that IRI should have a type of, say, schema:Person, if that's indeed what you're attempting. For that, you need to fall back to OWL or RDFS (or the schema.org rangeIncludes variation).
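
As a rough sketch of what such a range statement might look like when written as JSON-LD (the prefix IRIs and the choice of schema:Person are assumptions for illustration only, and a real vocabulary would more likely constrain its own property than dc:publisher; swapping rdfs:range for schema:rangeIncludes would give the looser schema.org flavour):

```json
{
  "@context": {
    "dc": "http://purl.org/dc/terms/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "schema": "http://schema.org/"
  },
  "@id": "dc:publisher",
  "rdfs:range": {"@id": "schema:Person"}
}
```

This lives in the vocabulary definition, not in the term definitions of the context, which is the point: the context only controls how keys and values are interpreted.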

[[
don't know how to say the list is of Column objects
don't know how to detail the properties of the columns in context
]]

I modified this to the following: "columns": {"@type": "@id", "@container": "@list"}
Note that using @id within the term definition is not necessary if the term can already be expanded to an IRI (using @vocab, in this case). The @container: @list says that the value is an ordered list; the @type: @id says that the list elements are expected to be IRIs by default, rather than string literals.
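
To make that concrete, here is a minimal sketch of the term definition in use, assuming the vocabulary ends up at http://w3.org/ns/table# and using made-up fragment identifiers for the columns:

```json
{
  "@context": {
    "@vocab": "http://w3.org/ns/table#",
    "columns": {"@type": "@id", "@container": "@list"}
  },
  "@id": "tree-ops.csv",
  "columns": ["#GID", "#familyName", "#givenName"]
}
```

Here "columns" expands to http://w3.org/ns/table#columns through @vocab, and each string in the array is read as an IRI reference in an ordered list rather than as a plain string.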

On ISSUE 2: a JSON-LD doc can always reference the context remotely. In fact, if the RDFS for the vocabulary is published at <http://w3.org/ns/table#>, the JSON-LD version of that vocabulary can also include the context itself, in addition to the RDFS definition. This would make a simple document look like the following:

{
  "@context": "http://w3.org/ns/table#",
  "@id": "tree-ops.csv",
  ...
}

ISSUE 19: This also means that the column values need an @id defined. You could avoid BNodes by simply using a string that can be resolved as an IRI relative to the document location. If "name" had @type: @id, it would not be treated as a string, but as a property referencing an internal node. For example:

{
  "@context": [ "http://w3.org/ns/table#", {"@base": "tree-ops.csv#"}],
  "@id": "tree-ops.csv",
  ...,
  "columns": [
    {"@id": "GID", "name": "GID", ...},
    {"@id": "familyName", "name": "familyName", ...},
    {"@id": "givenName", "name": "givenName", ...}
  ],
  "primaryKey": ["familyName", "givenName"],
  ...
}

If you made "name" a synonym for "@id", you wouldn't need to use @id either, but you'd need to be sure that "name" was used uniquely, as that could cause separate nodes to coalesce when turned into RDF. Alternatively, "name" could be a property of type @id, instead of string, in which case it would refer to a node having that @id definition.
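
A sketch of the synonym approach, assuming the shared context is published at http://w3.org/ns/table# and a local context is layered on top of it to alias "name" to the @id keyword:

```json
{
  "@context": [
    "http://w3.org/ns/table#",
    {"name": "@id"}
  ],
  "@id": "tree-ops.csv",
  "columns": [
    {"name": "GID"},
    {"name": "familyName"},
    {"name": "givenName"}
  ]
}
```

With this alias in place each "name" value identifies the column node itself, so two columns that shared a name would end up as the same node in the resulting RDF.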

ISSUE 37: I think you can use data-include in ReSpec to actually load the context definition into the document; it should be possible to syntax-highlight it too.

Gregg

> Jeni
> 
> [1] http://w3c.github.io/csvw/metadata/
> --  
> Jeni Tennison
> http://www.jenitennison.com/
>

Received on Wednesday, 21 May 2014 20:08:06 UTC