- From: Dan Brickley <danbri@danbri.org>
- Date: Thu, 3 Oct 2013 21:10:33 +0100
- To: Guha <guha@google.com>
- Cc: Niklas Lindström <lindstream@gmail.com>, Stéphane Corlosquet <scorlosquet@gmail.com>, Jeremy Tarling <jeremy.tarling@bbc.co.uk>, Andreas Gebhard <Andreas.Gebhard@gettyimages.com>, jean delahousse <delahousse.jean@gmail.com>, "public-vocabs@w3.org" <public-vocabs@w3.org>
On 3 October 2013 20:09, Guha <guha@google.com> wrote:
> Just to be clear ... Schema.org 'assimilating' SKOS (or anything else) does
> not gate anything. You can most certainly go ahead and
> "publish pages about concepts described in a controlled vocabulary and to
> describe the controlled vocabulary itself"
>
> today. Schema.org encourages the use of multiple vocabularies.

Yes, it's quite possible already, and there's a lot of SKOS out there in RDF/XML, RDFa, Turtle, etc. using its own namespace.

I do believe there's value in giving schema.org a SKOS-oriented notion of topic/concept/category, and I'd stick with the name "Concept" for it. A lot of people who publish SKOS authority data will probably also want to use the official W3C SKOS namespace (which is built in to RDFa 1.1 btw, see skos: and skosxl: in http://www.w3.org/2011/rdfa-context/rdfa-1.1 ). Schema.org can add value by making it easier for these concept URLs to get deployed as controlled property values more widely across the Web.

But let's try to walk through a couple of use cases.

1. JobPosting taxonomies

We have http://schema.org/JobPosting with http://schema.org/occupationalCategory currently. (expected type: Text)

"Category or categories describing the job. Use BLS O*NET-SOC taxonomy: http://www.onetcenter.org/taxonomy.html. Ideally includes textual label and formal code, with the property repeated for each applicable value."

Over on the referenced site, there are a few links. http://www.onetcenter.org/taxonomy/2010/list.html seems to be the latest. There are also CSV and XLS downloadable versions. But there is no canonical URL for each concept code.
Looking at the HTML for the 2010 code list, we see:

  <tr>
    <td class="datapubrt" width="30%">13-1031.01</td>
    <td class="datapub" width="70%">Claims Examiners, Property and Casualty Insurance</td>
  </tr>
  <tr>
    <td class="datapubrt" width="30%">13-1031.02</td>
    <td class="datapub" width="70%">Insurance Adjusters, Examiners, and Investigators</td>
  </tr>
  ...etc

The CSV version has two columns (code and title).

In this dataset there does appear to be hierarchy, but hidden in the structure of the names of the codes. It would be good for the Web if we could surface this structure and have the code list site tell us that insurance adjusters and claims examiners are related, and that regulatory affairs managers and compliance managers are both managers.

  <tr>
    <td class="datapubrt" width="30%">11-9199.01</td>
    <td class="datapub" width="70%">Regulatory Affairs Managers</td>
  </tr>
  <tr>
    <td class="datapubrt" width="30%">11-9199.02</td>
    <td class="datapub" width="70%">Compliance Managers</td>
  </tr>

So imagine onetcenter.org starts marking up with SKOS or schema.org SKOS or both. There are some choices to make about what the entity IDs are, whether they are different from the Web page, etc. Sticking with classic SKOS for now,

  <tr typeof="skos:Concept" resource="#concept11-9199.01">
    <td class="datapubrt" width="30%" property="skos:notation">11-9199.01</td>
    <td class="datapub" width="70%" property="skos:prefLabel">Regulatory Affairs Managers</td>
  </tr>
  <tr typeof="skos:Concept" resource="#concept11-9199.02">
    <td class="datapubrt" width="30%" property="skos:notation">11-9199.02</td>
    <td class="datapub" width="70%" property="skos:prefLabel">Compliance Managers</td>
  </tr>

So now what do we do with our original schema.org property, which originally expected just textual values? Should we say it expects an URL, for example http://www.onetcenter.org/taxonomy/2010/list.html#concept11-9199.01 ?
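
To make the two ideas above a bit more concrete (a URL per concept, and the hierarchy hidden in the code names), here is a rough Python sketch. It is only an illustration under assumptions: the "#concept" + code fragment pattern is the one from the RDFa example in this mail, not something onetcenter.org publishes, and treating the part before the "." as the broader group is a guess at how the codes are structured. The function names are made up for this example.

```python
# Sketch only: helper names and the grouping rule are assumptions.
LIST_URL = "http://www.onetcenter.org/taxonomy/2010/list.html"

# The four rows quoted from the 2010 code list.
CODES = {
    "13-1031.01": "Claims Examiners, Property and Casualty Insurance",
    "13-1031.02": "Insurance Adjusters, Examiners, and Investigators",
    "11-9199.01": "Regulatory Affairs Managers",
    "11-9199.02": "Compliance Managers",
}


def concept_url(code):
    """Build a per-concept URL using the '#concept' + code fragment pattern."""
    return LIST_URL + "#concept" + code


def broader_code(code):
    """Assumed rule: the part before the '.' names the broader group."""
    return code.split(".", 1)[0]


def group_by_broader(codes):
    """Group codes that share the same assumed broader group."""
    groups = {}
    for code in codes:
        groups.setdefault(broader_code(code), []).append(code)
    return groups


if __name__ == "__main__":
    for parent, children in group_by_broader(CODES).items():
        for child in children:
            print(parent, "<- broader of", concept_url(child))
```

Grouping this way puts the two insurance codes together and the two manager codes together, which is the kind of relation a publisher could state explicitly with skos:broader rather than leaving it implicit in the notation.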
Alternatively, we could say that it expects a http://schema.org/Concept (which, like any other type, could always be identified by URI/URL/IRI).

Aside: this raises a general oddity in schema.org w.r.t. saying we expect the "URL" type. Some people want to use 'expected type: URL' to distinguish two cases:

a) something like http://schema.org/trailer on Movie, where the trailer is a VideoObject described inline, versus
b) http://schema.org/thumbnail on ImageObject, where we 'expect an URL' in one sense, but we also expect it to be an ImageObject.

So one way or another we can extend our use of http://schema.org/occupationalCategory so that it properly accepts an URL like http://www.onetcenter.org/taxonomy/2010/list.html#concept11-9199.01 and we can encourage authority data publishers to use RDFa, SKOS and/or maybe Schema.org SKOS to describe their vocabulary.

2. Second scenario I'll cover more quickly.

The LRMI initiative came up with vocabulary which we now include in schema.org. It includes the notion that you describe the educational characteristics of information resources through aligning them with standard code lists, e.g. in the US, the Common Core, or in broader terms with the kind of topical codes we see in the SKOS world, such as LCSH.

The key property here is http://schema.org/targetUrl which is documented as expecting an URL. You can see an example of it in http://www.cteonline.org/portal/default/Curriculum/Viewer/Curriculum?action=2&cmobjid=177674&refcmobjid=132904 which uses http://purl.org/ASN/resources/S103AD27 (which has linked machine readable versions, but not RDFa, SKOS or schema.org).

I won't copy all the CTEOnline.org markup into this mail, but just an excerpt,

  <div class="contents"><span itemprop="educationalAlignment" itemscope="" itemtype="http://schema.org/AlignmentObject">
    <meta itemprop="name" content="ANR.C.C13.3 Use the scientific method to conduct agricultural experiments." />
    <meta itemprop="description" content="Use the scientific method to conduct agricultural experiments." />
    <meta itemprop="targetName" content="ANR.C.C13.3 Use the scientific method to conduct agricultural experiments." />
    <meta itemprop="targetDescription" content="Use the scientific method to conduct agricultural experiments." />
    <meta itemprop="alignmentType" content="teaches" />
    <meta itemprop="targetUrl" content="http://purl.org/ASN/resources/S103AE77" />
  ...

If you dig around the educational use case you see that topics for educational content, topics for bibliographic description, and job and skill taxonomies are all quite inter-related. It would be very positive if schema.org could send a clear message for how all these things fit together in terms of markup that is search-engine friendly.

My inclination is to add a basic skos:Concept type and broader/narrower links, but not necessarily to reflect all of SKOS into schema.org. For schema.org we should focus on getting a lot more instance data linked to these SKOS-describable vocabularies. For that it helps if we can explicitly say (within schema.org's self-contained framework) which schema.org properties can be used with SKOS-like controlled lists. Adding a Concept type addresses this need.

Are there any important scenarios missing? rNews++/storyline? events categories? Drupal 8? I'd like to get some agreement on motivations...

Dan
Received on Thursday, 3 October 2013 20:11:01 UTC