Re: The dcterm/schema.org issue: a proposal to move forward from Rufus Pollock on 2014-10-03 (public-csv-wg@w3.org from October 2014)

From: Rufus Pollock <rufus.pollock@okfn.org>
Date: Fri, 3 Oct 2014 18:22:47 +0100
To: Ivan Herman <ivan@w3.org>
Cc: W3C CSV on the Web Working Group <public-csv-wg@w3.org>, Jeni Tennison <jeni@jenitennison.com>
Message-ID: <CAKssCpPMTCJntXmx_5LS=-YyD-xvxKLZyhH4X49fhgH1vxCU0Q@mail.gmail.com>
On 3 October 2014 10:42, Ivan Herman <ivan@w3.org> wrote:

> Dear all,
>
> I was wondering how to move ahead with the schema.org/dcmi question,
> their normativeness, etc; we seem to be at an impasse at this moment. Here
> is a strategy that may help us move forward. (Some of the issues below
> were also motivated by giving some more thought to the RDF/JSON conversion
> document and its possible implementation.)
>
> 1. We define a small set of core properties that we consider to be
> essential in the metadata. "We define" means that we specify the terms to
> be used in the metadata specification as well as their data types and
> intended meaning.  There is already a set of such terms defined at the end
> of 3.4.2:
>
>         • created
>         • creator
>         • description
>         • language
>         • license
>         • modified
>         • provenance
>         • publisher
>         • rights
>         • rightsHolder
>         • source
>         • spatial
>         • subject
>         • temporal
>         • title
>

I think this seems pretty reasonable. The one item I always find a bit
awkward is creator (vs author) but that's just me ;-) (creator esp for data
always seems a bit odd whereas author is more neutral - cf
https://github.com/dataprotocols/dataprotocols/issues/130)


> we may start there (although I am not sure 'spatial' and 'temporal' should
> be part of such core set of terms). (I actually believe that these terms
> should be used _only_ as top level terms, at least in some cases; I am not
> sure it makes sense to add, say, a license to a specific cell in the table.)
>

I agree there too: I like them, but they are not as regularly used or as
clear in their usage. At the same time I occasionally find them useful ...


> 2. The metadata already refers to @context. We would then say that
> (JSON-LD compatible) context entries MAY be added by the author to assign
> these terms to explicit URI-s in the vocabulary of their choice. The
> metadata document would also include an informative appendix with @context
> examples for a mapping to DCT or to schema.org. That being said, I
> foresee that many authors would not really bother, in fact, and just use
> the terms in JSON.
>

Agreed - seems very sensible.
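
A minimal sketch of point 2, for illustration only and not text from the
metadata spec: the same bare core terms can be bound to either Dublin Core
Terms or schema.org through an optional context. The sample values and the
helper name expand_terms are hypothetical, and the choice of schema.org
properties is an assumption.

```python
# Hypothetical table description that uses only the bare core terms.
metadata = {
    "title": "Example table",
    "creator": "Jane Doe",
    "modified": "2014-10-03",
}

# Optional JSON-LD style contexts: one for Dublin Core Terms, one for schema.org.
DCT = "http://purl.org/dc/terms/"
dct_context = {term: DCT + term for term in ("title", "creator", "modified")}
schema_context = {
    "title": "http://schema.org/name",
    "creator": "http://schema.org/creator",
    "modified": "http://schema.org/dateModified",
}


def expand_terms(doc, context):
    """Replace keys the context maps with their full URIs; leave other keys alone."""
    return {context.get(key, key): value for key, value in doc.items()}


as_dct = expand_terms(metadata, dct_context)
as_schema = expand_terms(metadata, schema_context)
unchanged = expand_terms(metadata, {})  # authors who skip @context lose nothing
```

The metadata itself is identical in all three cases; only the context
decides which vocabulary, if any, the terms resolve to.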


> 3. The current 3.3.1 section in the metadata document (listing the Dublin
> Core terms) should be removed altogether.
>
> 4. The metadata document should also make it possible to use any set of
> properties anywhere in the metadata _in qualified form_. We should also
> refer to a number of predefined prefixes; the best approach is, probably,
> to refer to the RDFa predefined prefix set. Ie, people may add properties
> of the form "dc:spatial" or "schema:author". However, authors may also add
> prefixes they want, besides those that are predefined. For many users, that
> is where it stops; others, who care about a proper RDF-ization of the
> metadata, may want to add the proper mapping of the prefixes to URI-s; we
> should provide a @context for the predefined prefixes as well.
>

Again seems very sensible :-)
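
As a rough sketch of point 4, under assumptions: qualified property names
could be expanded against a table of predefined prefixes, with author-supplied
prefixes layered on top. The two entries below are only an excerpt in the
style of the RDFa predefined prefix set, and expand_qname is a hypothetical
helper, not something the metadata document defines.

```python
# Small excerpt of predefined prefixes, in the style of the RDFa initial context.
PREDEFINED_PREFIXES = {
    "dc": "http://purl.org/dc/terms/",
    "schema": "http://schema.org/",
}


def expand_qname(name, extra_prefixes=None):
    """Expand 'prefix:local' to a full URI.

    Returns None when the name is unqualified or its prefix is unknown,
    so callers can treat it as a core term or leave it untouched.
    """
    prefixes = dict(PREDEFINED_PREFIXES)
    prefixes.update(extra_prefixes or {})  # author-defined prefixes are added on top
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        return None
    return prefixes[prefix] + local


spatial_uri = expand_qname("dc:spatial")
author_uri = expand_qname("schema:author")
custom_uri = expand_qname("ex:region", {"ex": "http://example.org/vocab/"})
```

Unqualified names such as "title" are deliberately not expanded here, which
keeps them separate from the qualified ones.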


> I believe this approach could work, and covers our issues:
>
>         - users (authors, clients) who do not really care about URI-s,
> linkage, RDF formats, or indeed vocabularies, could simply rely on the
> terms that are defined as standard terms in the metadata document. (I
> believe this is what Jeni proposed at our meeting 10 days ago.)
>
>         - users who care about binding the terms to outside vocabularies
> can choose to add a set of URI mappings through a @context; whether they
> choose DC, schema.org, or some application area specific vocabulary is
> not for the standard to define, although we make it easy to use the well
> known vocabularies. That also means that neither the DC reference nor the
> schema.org reference is normative.
>
>         - with the appropriate contexts the metadata is proper JSON-LD,
> ie, can serve as a 'glue' between the core CSV data and the Linked Data
> Cloud. (It is important to note that, afaik, a context can be delivered to
> a client via HTTP links, ie, the publisher of the data may not even care
> about the @context but the data store may ensure the JSON-LD aspects
> nevertheless.)
>
>         - the RDF mapping of the CSV content would rely on @context if
> present, otherwise these terms would be mapped against the "csv:"
> namespace. That means the mapping to RDF becomes clearly defined. (The
> current CSV->RDF document is hand-waving around the top level terms right
> now; it is not really clean yet.)
>
>         - users can use any other types of vocabularies to their heart's
> content, and the proper usage and mapping of these vocabularies can be
> ensured through the usage of qualified names to separate those from the
> terms that are defined by our standard.
>
>
> How does that sound?


Very sensible. I've also opened an issue for tracking this more:
https://github.com/w3c/csvw/issues/29 (you may want to add your proposal
there too).
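
For the RDF mapping bullet in the proposal above (use @context if present,
otherwise fall back to the "csv:" namespace), the rule might be sketched as
follows. The thread does not give a URI for "csv:", so CSV_NS is a
placeholder, and term_to_uri is a hypothetical name.

```python
# Placeholder for the "csv:" namespace; the actual URI is not given in this thread.
CSV_NS = "http://example.org/csv#"


def term_to_uri(term, context=None):
    """Resolve a core term via the @context when it maps the term,
    otherwise fall back to the csv: namespace."""
    if context and term in context:
        return context[term]
    return CSV_NS + term


with_context = term_to_uri("license", {"license": "http://purl.org/dc/terms/license"})
without_context = term_to_uri("license")
```

Either way every core term ends up with exactly one URI, which is what makes
the mapping well defined.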

Rufus


Received on Friday, 3 October 2014 17:23:15 UTC