The dcterm/schema.org issue: a proposal to move forward

Dear all,

I was wondering how to move ahead with the schema.org/dcmi question, their normativeness, etc.; we seem to be at an impasse at the moment. Here is a strategy that may help us move forward. (Some of the issues below were also motivated by giving some more thought to the RDF/JSON conversion document and its possible implementation.)

1. We define a small set of core properties that we consider to be essential in the metadata. "We define" means that we specify the terms to be used in the metadata specification as well as their data types and intended meaning.  There is already a set of such terms defined at the end of 3.4.2:

	• created
	• creator
	• description
	• language
	• license
	• modified
	• provenance
	• publisher
	• rights
	• rightsHolder
	• source
	• spatial
	• subject
	• temporal
	• title

We may start there (although I am not sure 'spatial' and 'temporal' should be part of such a core set of terms). (I actually believe that these terms should be used _only_ as top-level terms, at least in some cases; I am not sure it makes sense to add, say, a license to a specific cell in the table.)

2. The metadata already refers to @context. We would then say that (JSON-LD compatible) context entries MAY be added by the author to assign these terms to explicit URI-s in the vocabulary of their choice. The metadata document would also include an informative appendix with @context examples for a mapping to DCT or to schema.org. That being said, I foresee that many authors would not really bother and would just use the terms in JSON.
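
To make point 2 a bit more concrete, here is a minimal sketch, in Python, of what such an author-supplied @context could look like when the core terms are bound to DCT. The helper name `dct_context` and the metadata values are purely illustrative assumptions; nothing here is taken from the metadata document itself. A schema.org context would have the same shape, with different URI-s on the right-hand side.

```python
import json

# Dublin Core Terms namespace; each core term listed above also exists there.
DCT = "http://purl.org/dc/terms/"

# The core terms listed at the end of 3.4.2.
CORE_TERMS = [
    "created", "creator", "description", "language", "license",
    "modified", "provenance", "publisher", "rights", "rightsHolder",
    "source", "spatial", "subject", "temporal", "title",
]


def dct_context(terms=CORE_TERMS):
    """Build a JSON-LD style context binding each core term to a DCT URI.

    Hypothetical helper, for illustration only.
    """
    return {term: DCT + term for term in terms}


# Illustrative metadata: the author uses plain terms; the @context is optional.
metadata = {
    "@context": dct_context(),
    "title": "Example table",
    "creator": "Jane Doe",
}

print(json.dumps(metadata, indent=2))
```

An author who does not care about URI-s would simply leave out the "@context" entry and keep the rest unchanged.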

3. The current 3.3.1 section in the metadata document (listing the Dublin Core terms) should be removed altogether.

4. The metadata document should also make it possible to use any set of properties anywhere in the metadata _in qualified form_. We should also refer to a number of predefined prefixes; the best approach is, probably, to refer to the RDFa predefined prefix set [2]. I.e., people may add properties of the form "dc:spatial" or "schema:author". However, authors may also add any prefixes they want, besides those that are predefined. For many users, that is where it stops; others, who care about a proper RDF-ization of the metadata, may want to add the proper mapping of the prefixes to URI-s; we should provide a @context for the predefined prefixes as well.
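
As a rough sketch of point 4, in Python: qualified names could be expanded against a prefix table, with author-defined prefixes layered on top of the predefined ones. The function name `expand` and the "ex" prefix are assumptions made for illustration; the three predefined entries are, as far as I recall, part of the RDFa 1.1 initial context, but only a small subset of it.

```python
# A small subset of the RDFa 1.1 predefined prefixes (not the full set).
PREDEFINED_PREFIXES = {
    "dc": "http://purl.org/dc/terms/",
    "schema": "http://schema.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(name, author_prefixes=None):
    """Expand a qualified name such as 'dc:spatial' into a full URI.

    Returns None when the name is unqualified or its prefix is unknown.
    Hypothetical helper, for illustration only.
    """
    # Author-defined prefixes are added to (and may override) the predefined ones.
    prefixes = {**PREDEFINED_PREFIXES, **(author_prefixes or {})}
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in prefixes:
        return None
    return prefixes[prefix] + local


print(expand("dc:spatial"))
print(expand("schema:author"))
print(expand("ex:station", {"ex": "http://example.org/vocab#"}))
```

Users who do not care about RDF can ignore the expansion entirely; the qualified form alone is enough to keep such properties apart from the core terms.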

I believe this approach could work, and covers our issues:

	- users (authors, clients) who do not really care about URI-s, linkage, RDF formats, or indeed vocabularies, could simply rely on the terms that are defined as standard terms in the metadata document. (I believe this is what Jeni proposed at our meeting 10 days ago.)

	- users who care about binding the terms to outside vocabularies can choose to add a set of URI mappings through a @context; whether they choose DC, schema.org, or some vocabulary specific to an application area is not for the standard to define, although we make it easy to use the well-known vocabularies. That also means that neither the DC reference nor the schema.org reference is normative.

	- with the appropriate contexts the metadata is proper JSON-LD, i.e., it can serve as a 'glue' between the core CSV data and the Linked Data Cloud. (It is important to note that, afaik, a context can be delivered to a client via HTTP links, i.e., the publisher of the data may not even care about the @context but the data store may ensure the JSON-LD aspects nevertheless.)

	- the RDF mapping of the CSV content would rely on @context if present, otherwise these terms would be mapped against the "csv:" namespace. That means the mapping to RDF becomes clearly defined. (The current CSV->RDF document is hand-waving around the top-level terms right now; it is not really clean yet.)

	- users can use any other types of vocabularies to their heart's content, and the proper usage and mapping of these vocabularies can be ensured through the usage of qualified names to separate those from the terms that are defined by our standard.
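
The fallback rule for the RDF mapping of the core terms can be sketched roughly as follows, in Python. This is only one possible reading of "rely on @context if present": a term is looked up in the context, and falls back to the "csv:" namespace when there is no context or the context does not mention it. The URI used for "csv:" is a placeholder, since the actual namespace is not defined here, and `term_to_uri` is a hypothetical name.

```python
# Placeholder only: the real URI of the "csv:" namespace is not defined here.
CSV_NS = "http://example.org/csv#"


def term_to_uri(term, context=None):
    """Map an unqualified core term to the URI used in the RDF output.

    Uses the author's @context when it binds the term; otherwise falls
    back to the (placeholder) "csv:" namespace.
    """
    if context and term in context:
        return context[term]
    return CSV_NS + term


# Illustrative context binding only 'title' to DCT.
context = {"title": "http://purl.org/dc/terms/title"}

print(term_to_uri("title", context))    # bound via @context
print(term_to_uri("license", context))  # not in @context: csv: fallback
print(term_to_uri("title"))             # no @context at all: csv: fallback
```

Either way, every core term ends up with exactly one URI, which is what makes the mapping to RDF well defined.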


How does that sound?

Ivan



[1] http://www.w3.org/TR/vocab-dcat/
[2] http://www.w3.org/2011/rdfa-context/rdfa-1.1



----
Ivan Herman, W3C 
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
GPG: 0x343F1A3D
WebID: http://www.ivan-herman.net/foaf#me

Received on Friday, 3 October 2014 09:42:45 UTC