- From: Dan Brickley <danbri@google.com>
- Date: Fri, 3 Oct 2014 18:51:21 +0100
- To: Rufus Pollock <rufus.pollock@okfn.org>
- Cc: Ivan Herman <ivan@w3.org>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>, Jeni Tennison <jeni@jenitennison.com>
On 3 October 2014 18:22, Rufus Pollock <rufus.pollock@okfn.org> wrote:
> On 3 October 2014 10:42, Ivan Herman <ivan@w3.org> wrote:
>>
>> Dear all,
>>
>> I was wondering how to move ahead with the schema.org/dcmi question, their
>> normativeness, etc; we seem to be in an impasse at this moment. Here is a
>> strategy that may help us moving forward. (Some of the issues below were
>> also motivated by giving some more thoughts to the RDF/JSON conversion
>> document and its possible implementation.)
>>
>> 1. We define a small set of core properties that we consider to be
>> essential in the metadata. "We define" means that we specify the terms to be
>> used in the metadata specification as well as their data types and intended
>> meaning. There is already a set of such terms defined at the end of 3.4.2:
>>
>> • created
>> • creator
>> • description
>> • language
>> • license
>> • modified
>> • provenance
>> • publisher
>> • rights
>> • rightsHolder
>> • source
>> • spatial
>> • subject
>> • temporal
>> • title

I made a quick run through mapping these to schema.org. It's a fairly good fit:

• created: http://schema.org/dateCreated
• creator: http://schema.org/creator or http://schema.org/author
• description: http://schema.org/description
• language: http://schema.org/language (definition applies to actions; could be generalized)
• license: http://schema.org/license
• modified: http://schema.org/dateModified
• provenance: no direct mapping; http://schema.org/evidenceOrigin is related.
• publisher: http://schema.org/publisher
• rights: no direct mapping
• rightsHolder: http://schema.org/copyrightHolder
• source: no direct mapping (how does this compare to provenance?); not http://schema.org/source, which is medical.
• spatial: https://schema.org/spatial
• subject: http://schema.org/about
• temporal: https://schema.org/temporal
• title: https://schema.org/name (rather than https://schema.org/title)

(I'm interested in the difference between source vs provenance...)

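The mapping above could be expressed as a JSON-LD @context along the following lines. This is only a sketch: the names CORE_TO_SCHEMA and build_context are made up for illustration, the "http://schema.org/" base is assumed for every term (the list mixes http and https), 'creator' is arbitrarily mapped to schema.org's 'creator' rather than 'author', and the three terms with no direct mapping are simply left out of the context.

```python
import json

SCHEMA_ORG = "http://schema.org/"  # assumed base URI for all terms

# Core term -> schema.org property, following the list above.
# None marks a term for which no direct mapping was found.
CORE_TO_SCHEMA = {
    "created": "dateCreated",
    "creator": "creator",  # "author" is the listed alternative
    "description": "description",
    "language": "language",  # schema.org definition currently targets actions
    "license": "license",
    "modified": "dateModified",
    "provenance": None,
    "publisher": "publisher",
    "rights": None,
    "rightsHolder": "copyrightHolder",
    "source": None,
    "spatial": "spatial",
    "subject": "about",
    "temporal": "temporal",
    "title": "name",
}


def build_context(mapping, base=SCHEMA_ORG):
    """Return a term -> URI dict, skipping terms without a mapping."""
    return {term: base + prop for term, prop in mapping.items() if prop is not None}


context = {"@context": build_context(CORE_TO_SCHEMA)}
print(json.dumps(context, indent=2))
```

Unmapped terms stay usable as plain JSON keys; they just carry no URI in this context.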
> I think this seems pretty reasonable. The one item I always find a bit
> awkward is creator (vs author) but that's just me ;-) (creator esp for data
> always seems a bit odd whereas author is more neutral - cf
> https://github.com/dataprotocols/dataprotocols/issues/130)

Yeah, I believe 'creator' in Dublin Core was an early (1996ish) replacement for 'author', to better fit images, media objects, cultural heritage artifacts etc. Schema.org has both 'creator' and 'author' fwiw.

>> we may start there (although I am not sure 'spatial' and 'temporal' should
>> be part of such core set of terms). (I actually believe that these terms
>> should be used _only_ as top level terms, at least in some cases; I am not
>> sure it makes sense to add, say, a license to a specific cell in the table.)
>
>
> I agree there to: I like them but they are not as regularly used or as clear
> in their usage. At the same time i occasionally find them useful ...

We should make it clear that this is only a "starter kit"; if people have reason to add more detail, that's all for the good.

>> 2. The metadata already refers to @context. We would then say that
>> (JSON-LD compatible) context entries MAY be added by the author to assign
>> these terms to explicit URI-s in the vocabulary of their choice. The
>> metadata document would also include an informative appendix with @context
>> examples for a mapping on DCT or to schema.org. That being said, I foresee
>> that many authors would not really bother, in fact, and just use the terms
>> in JSON.
>
> Agreed - seems very sensible.

From the above it looks like a subset that wrote '@context': 'http://schema.org/' would work, and presumably people could do fancier things with their own context file.

>> 3. The current 3.3.1 section in the metadata document (listing the Dublin
>> Core terms) should be removed altogether.
>>
>> 4. The metadata document should also make it possible to use any set of
>> properties anywhere in the metadata _in qualified form. We should also refer
>> to a number of predefined prefixes; the best approach is, probably, to refer
>> to the RDFa predefined prefix set. Ie, people may add properties of the form
>> "dc:spatial" or "schema:author". However, authors may also add prefixes they
>> want, besides those that are predefined. For many users, that is where it
>> stops; others, who care about a proper RDF-ization of the metadata, may want
>> to add the proper mapping of the prefixes to URI-s; we should provide a
>> @context for the predefined prefixes as well.
>
>
> Again seems very sensible :-)

+1

Are we allowed to make a normative ref to http://www.w3.org/2011/rdfa-context/rdfa-1.1 ? Under what circumstances and in what ways (additive vs edits) does it change?

>>
>> I believe this approach could work, and covers our issues:
>>
>> - users (authors, clients) who do not really care about URI-s,
>> linkage, RDF formats, or indeed vocabularies, could simply rely on the terms
>> that are defined as standard terms in the metadata document. (I believe this
>> is what Jeni proposed on our meeting 10 days ago.)
>>
>> - users who care about binding the terms to outside vocabularies
>> can choose to add a set of URI mappings through a @context; whether they
>> choose DC, schema.org, or some application area specific vocabulary is not
>> for the standard to define, although we make it easy to use the well known
>> vocabularies. That also means that neither the DC reference nor the
>> schema.org reference is normative.
>>
>> - with the appropriate contexts the metadata is proper JSON-LD,
>> ie, can serve as a 'glue' between the core CSV data and the Linked Data
>> Cloud.
>> (It is important to note that, afaik, a context can be delivered to a
>> client via HTTP links, ie, the publisher of the data may not even care about
>> the @context but the data store may ensure the JSON-LD aspects
>> nevertheless.)
>>
>> - the RDF mapping of the CSV content would rely on @context if
>> present, otherwise these terms would be mapped against the "csv:" namespace.
>> That means the mapping to RDF becomes clearly defined. (The current CSV->RDF
>> document is hand-wawing around the top level terms right now, it is not
>> really clean yet)
>>
>> - users can use any other types of vocabularies at their heart's
>> content, and the proper usage and mapping of these vocabularies can be
>> ensured through the usage of qualified names to separate those from the
>> terms that are defined by our standard.
>>
>>
>> How does that sound?
>
>
> Very sensible. I've also booted an issue for tracking this more:
> https://github.com/w3c/csvw/issues/29 (you may want to add your proposal
> there too).

Thanks,

Dan

> Rufus
>
>>
>>
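The term-resolution rule proposed in the thread (use the @context if one is present, expand qualified names such as "dc:spatial" through known prefixes, and otherwise fall back to the "csv:" namespace) might look roughly like this. It is a sketch under several assumptions: the thread gives no URI for the "csv:" namespace, so CSV_NS is a placeholder; PREFIXES holds just two entries believed to match the RDFa 1.1 initial context; the context is treated as a flat term-to-URI dict; and raising an error on an unknown prefix is one possible choice, not something the proposal specifies. The name resolve_term is hypothetical.

```python
CSV_NS = "http://example.org/csv#"  # placeholder: the real "csv:" URI is not given

# Small illustrative subset of predefined prefixes (assumed values).
PREFIXES = {
    "dc": "http://purl.org/dc/terms/",
    "schema": "http://schema.org/",
}


def resolve_term(term, context=None, prefixes=PREFIXES):
    """Map a metadata property name to a URI.

    Order: explicit @context entry, then qualified name, then csv: fallback.
    """
    context = context or {}
    if term in context:
        return context[term]
    prefix, sep, local = term.partition(":")
    if sep:
        if prefix not in prefixes:
            # Assumption: unknown prefixes are treated as an error.
            raise ValueError(f"unknown prefix: {prefix!r}")
        return prefixes[prefix] + local
    return CSV_NS + term


print(resolve_term("title", {"title": "http://schema.org/name"}))
print(resolve_term("title"))
print(resolve_term("schema:author"))
```

Author-defined prefixes could be supported by passing a merged dict as the prefixes argument.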
Received on Friday, 3 October 2014 17:51:48 UTC