- From: Dan Scott <denials@gmail.com>
- Date: Wed, 17 Jul 2013 13:47:34 -0400
- To: Karen Coyle <kcoyle@kcoyle.net>
- Cc: "public-schemabibex@w3.org" <public-schemabibex@w3.org>
On Tue, Jul 16, 2013 at 10:54 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:
> Dan, briefly before meeting: content size sounds great. I missed that. Let's
> not worry about CreativeWork extent just yet.

Karen, thank you for pointing me back to the thread at http://lists.w3.org/Archives/Public/public-schemabibex/2013Feb/0164.html. My apologies to all, again, for not being able to be active in the group during the Spring, and therefore wasting some of your time while I catch up and cover familiar turf, but so it goes.

I'm generally in agreement with the position expressed at various points in the thread: "if you can state True or False for a Boolean property, great; otherwise offer up a Text value and schema.org processors will do the best they can with it". That would lead to a possible recommendation for markup practices like the following (assuming that http://www.w3.org/2011/webschema/track/issues/14 takes the logical step of aligning itself with RDF):

====

For abridged works, if you know the true/false value, you can set the "abridged" Boolean property:

    <div>Edition: <meta property="abridged" content="true" />Abridged</div>

Conversely, if you only have MARC21 data to work with but you do have a 250 $a "edition statement", map that to the "bookEdition" Text property:

    <div>Edition: <span property="bookEdition">Abridged</span></div>

The "abridged" nature of a work may be reflected in its title:

    <div>Title: <span property="name">Theodicy, abridged</span></div>

Or in general notes (such as MARC21 500 fields), which you can surface as "description" properties:

    <div>Description: <span property="description">Revised and abridged</span></div>

====

I _think_ people could work with that. (Add in ONIX and other equivalents, of course!)

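
The field-to-property suggestions above can be sketched as a small lookup table. This is only an illustration: the (tag, subfield, value) tuple representation of a MARC21 record and the names MARC_TO_SCHEMA and to_properties are assumptions made for this sketch, and reading "title" as 245 $a is likewise an assumption, since the message does not name a title field.

```python
# Hypothetical mapping from MARC21 (tag, subfield) to schema.org property
# names, following the suggestions in the message. 245 $a for "name" is an
# assumption; the message only says "title".
MARC_TO_SCHEMA = {
    ("245", "a"): "name",
    ("250", "a"): "bookEdition",
    ("500", "a"): "description",
}


def to_properties(subfields):
    """Map (tag, subfield, value) tuples to (property, value) pairs.

    Subfields without a mapping are silently skipped.
    """
    props = []
    for tag, code, value in subfields:
        prop = MARC_TO_SCHEMA.get((tag, code))
        if prop is not None:
            props.append((prop, value))
    return props
```

Anything that is not in the table is dropped rather than guessed at, which matches the "offer up a Text value and let processors do their best" position.
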
To get some data, I crunched through the 2.5 million MARC records in our university consortium library system to find out where the string "abridged" lives, and it matches your findings back in February that it typically appears in the "general notes" section.

 tag | subfield | count
-----+----------+-------
 500 | a        |  2190
 245 | c        |   580
 245 | b        |   492
 250 | a        |   437
 245 | a        |   306
 510 | a        |   193
 505 | a        |    28
 246 | a        |    27
 520 | a        |    20
 509 | a        |    15
 503 | a        |    15
 250 | b        |    15
 740 | a        |    14

For the 250 $a, arguably the canonical place to record whether an edition is abridged or not, the results in our database were woefully inconsistent:

 abridged ed                  | 129
 abridged version             |  26
 unabridged                   |  20
 [abridged ed                 |  14
 abridged                     |   9
 unabridged ed                |   8
 abridged edition             |   8
 [unaltered and unabridged ed |   6
 complete and unabridged      |   5
 rev. and abridged ed         |   5

(Yes, this is just one data point, from a primarily academic library system, so the data could all be skewed... I accept that!)

For a few minutes, I had some hopes that the ISTC code, if encoded in the 024, would enable you to resolve some additional metadata (the "abridgedness" of a given work is explicitly encoded by the ISTC), but either the ISTC search engine appears to be defunct, or the example ISTC codes in the LoC MARC21 docs are invalid.

So for MARC21-based library systems, I think we can pretty much lower our expectations. Most systems will not be able to definitively set a schema.org property for "abridged". They could use a hack like checking for the existence of the string "abridged" in a handful of fields, but it would very clearly be a hack.

That said, given that ONIX has very clear encoding for the Abridged property, let's roll with it. There's no reason to hobble the usefulness of the schema.org Audiobook class just because one subset of the bibliographic world hasn't figured out how to reliably represent some useful metadata!

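
For illustration, the string-matching hack mentioned above might look roughly like the following. It is a sketch under stated assumptions, not a recommendation: the record is represented as plain (tag, subfield, value) tuples rather than through any real MARC library, and the names guess_abridged and CANDIDATE_FIELDS, as well as the choice and order of fields, are invented for this example.

```python
import re

# Whole-word matches only, so "unabridged" never counts as "abridged".
ABRIDGED = re.compile(r"\babridged\b", re.IGNORECASE)
UNABRIDGED = re.compile(r"\bunabridged\b", re.IGNORECASE)

# Hypothetical priority order: edition statement first, then title, then notes.
CANDIDATE_FIELDS = [
    ("250", "a"),
    ("245", "a"),
    ("245", "b"),
    ("245", "c"),
    ("500", "a"),
]


def guess_abridged(subfields):
    """Guess abridgedness from (tag, subfield, value) tuples.

    Returns True or False when a candidate field mentions exactly one of
    "abridged" / "unabridged", and None when nothing usable is found.
    """
    for wanted in CANDIDATE_FIELDS:
        for tag, code, value in subfields:
            if (tag, code) != wanted:
                continue
            has_abridged = bool(ABRIDGED.search(value))
            has_unabridged = bool(UNABRIDGED.search(value))
            if has_abridged and not has_unabridged:
                return True
            if has_unabridged and not has_abridged:
                return False
            # Both or neither: ambiguous here, keep looking.
    return None
```

A None result is the honest answer for most records, which is the case where falling back to a plain Text value makes sense.
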
I noted with interest that the OCLC mapping of ONIX 3.0 to MARC21 available from http://www.editeur.org/96/ONIX-and-MARC21/ just bails on most of the EditionType mappings, including Abridged / Unabridged; that acts as confirmation for me (and looks like a really valuable resource for future efforts).

I'm somewhat tempted to say, "hey, let's map the ONIX 3.0 edition types in our schemabibex.org extension vocabulary as a more strongly typed Book.bookEdition property; that is, if the value is set to (say) http://schemabibex.org/editionType/ABR then we know authoritatively that it is abridged, otherwise we'll fall back to the base schema.org behaviour of taking whatever Text value we get". (This also makes me think that _somebody_ must have already published a vocabulary based on ONIX?) The "let's map ONIX edition types to a schemabibex.org extension (or an existing ONIX vocab)" approach would also offer a way forward for expressing all of the other values that ONIX has defined.

My other temptation is to suggest, "hey, Abridgement and Festschrift show up as http://www.productontology.org/id/Abridgement and http://www.productontology.org/id/Festschrift, so we _could_ tell people to use additionalType="http://www.productontology.org/id/Abridgement" if they have a way of knowing that they are offering an abridged version, otherwise fall back to the Book.bookEdition field for (in the MARC21 world) the contents of the 250 field, or fall even further back to the general Thing.description fields for (in the MARC21 world) 500 fields". The advantage of the productontology approach is that it is already supported by schema.org, and it hits a subset of the interesting ONIX edition types (however, "Teacher's edition" doesn't show up on a quick search, for example). And given that http://www.productontology.org/id/Audiobook exists, we could use that as an additionalType too.

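
The "typed value if known, Text otherwise" fallback for bookEdition could be sketched like this. Everything here is hypothetical: the http://schemabibex.org/editionType/ namespace is only the "(say)" example floated in the message, the function name book_edition_markup is invented, and ABR / UBR are assumed to be the ONIX edition type codes for abridged and unabridged.

```python
from html import escape

# Hypothetical namespace, taken from the "(say)" example in the message.
EDITION_TYPE_BASE = "http://schemabibex.org/editionType/"

# Illustrative subset only; assumed ONIX codes for abridged / unabridged.
KNOWN_EDITION_TYPES = {"ABR", "UBR"}


def book_edition_markup(onix_code=None, edition_statement=None):
    """Return an RDFa fragment for bookEdition, or "" if nothing is known.

    A recognised edition type code yields a URI value; otherwise any
    free-text edition statement is passed through as a Text value.
    """
    if onix_code in KNOWN_EDITION_TYPES:
        uri = escape(EDITION_TYPE_BASE + onix_code, quote=True)
        return '<link property="bookEdition" href="%s" />' % uri
    if edition_statement:
        return '<span property="bookEdition">%s</span>' % escape(edition_statement)
    return ""
```

A consumer that understands the URI gets an authoritative answer; one that does not still sees a bookEdition value, which is the same degradation path the productontology additionalType option relies on.
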
Yielding to either of these temptations would avoid having to define a special property just for "abridged", and would leave just "readBy" as a new attribute. So I'm tempted by both of these approaches...

(On "readBy": wouldn't it be nice if "contributor" pointed at a new class, "Contributor", that would derive from Person but explicitly capture the nature of the contribution to this particular work? This would seem to be applicable not just to readBy, but to any of the other long list of credits that you can imagine scrolling past you for minutes at the end of a movie as you wait patiently, hoping for a last bonus blast of content for those who stuck it out to the very end.)
Received on Wednesday, 17 July 2013 17:48:03 UTC