- From: Niklas Lindström <lindstream@gmail.com>
- Date: Tue, 2 Jul 2013 23:54:10 +0200
- To: "public-schemabibex@w3.org" <public-schemabibex@w3.org>
- Message-ID: <CADjV5jeTAnRuX60foCohTnO0bjZ6bHNJd2ct4EfBUdrCQ1D2sQ@mail.gmail.com>
Hi all,

I'm sure we can find a way of reconciling our current, varied positions regarding instanceOf and commonEndeavour. The differences seem to relate to various MARC and FRBR experiences, or at least reactions to the different rigors, or indeed lack thereof, that have come out of these. I also think we can avoid inventing too much new thinking on these matters, since there is lots to reuse from the domain of resource descriptions.

Model-wise, I think many of us (perhaps all) agree that the restrictive abstraction principles of WEMI separation, and the simpler but still divided Work/Instance counterpart evolving in BIBFRAME, are too rigid for schema.org (and possibly for cataloguing in general). However, many also see a great benefit in relating to common generalized notions of works from their specific forms, at least when a work is available in a multitude of formats. This notion is very useful both for facilitating cataloguers' workflow and for building services upon the data. As I mentioned a while ago [1], this is not so much about *abstraction* but *generalization* (also elaborated on in [2]).

So let us neither prescribe nor prohibit. We should carefully consider what has been done in the wild, and especially practices stemming from, but not limited to, libraries.

A good example of this is the Dublin Core Terms vocabulary [3], which has been heavily used for many years now, in lots of linked data scenarios. It is used in and recommended by many W3C specs (e.g. SKOS, VoID and PROV) and is the base for many community vocabularies, such as BIBO. Its terms are used in lots, if not most, of the datasets in the Linked Open Data cloud. If there is any stable core in the plethora of bibliographic vocabularies, I'd say DC terms is it. And it gets by with (probably because of) quite a minimal specification (just like schema.org).
I therefore suggest that we consider 'isFormatOf', explicitly based on 'dcterms:isFormatOf' [4], as a replacement for, or indeed a unification of, both the 'instanceOf' and 'commonEndeavour' proposals (and possibly content/carrier).

Dublin Core defines 'isFormatOf' as:

    A related resource that is substantially the same as the described resource, but in another format.

The things being in specific formats are representations, manifestations or instances. This property can be used both to relate between different formats (similar to 'commonEndeavour'), and for linking from a specific format to a generalized notion, such as an expression or work (similar to 'instanceOf').

The latter use of 'dcterms:isFormatOf' is quite common, using the pattern of linking different representations (e.g. in HTML or PDF for digital representations, or hardcover or paperback for physical books) to a general resource which they represent. Examples of this can be seen e.g. in legislation.gov.uk [5]. (For specifying the kind of format, schema:bookFormat is applicable, as is the more general dcterms:format property).

As for the actual name, we could include 'isFormatOf' as is (and possibly its inverse, 'hasFormat'). Or we could relabel it somehow. The name is important, but only instrumentally so. The most important thing is to find a common meaning, and to do so we should base it on existing usage.

(I'd also like to note that the solid proposal we do have on the table, 'hasPart'/'isPartOf', correlates very much to the existing Dublin Core properties of the same name (as has also been discussed). I do think we should mention that in the wiki page. I can address that unless anyone objects, following the pattern of the Datasets proposal [6]. In fact, if we can find a common ground in (at least parts of) the Dublin Core terms, we can also continue to import some other terms, such as 'isVersionOf', 'references' and 'source', if needed.)
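To make the linking pattern above a bit more concrete, here is a minimal sketch in Python. It is not taken from any existing dataset: the example.org identifiers and the helper names (formats_of, sibling_formats) are made up for illustration, and the triples are plain tuples rather than any particular RDF library. Only the dcterms:isFormatOf URI is real.

```python
# Property URI from Dublin Core Terms (see [4]).
IS_FORMAT_OF = "http://purl.org/dc/terms/isFormatOf"

# Hypothetical identifiers, for illustration only.
GENERAL = "http://example.org/book/1"
HARDCOVER = GENERAL + "/hardcover"
PAPERBACK = GENERAL + "/paperback"
PDF = GENERAL + "/pdf"

# Each specific format links to the general resource it represents
# (the 'instanceOf'-like use of isFormatOf).
triples = [
    (HARDCOVER, IS_FORMAT_OF, GENERAL),
    (PAPERBACK, IS_FORMAT_OF, GENERAL),
    (PDF, IS_FORMAT_OF, GENERAL),
]


def formats_of(general, triples):
    """Return the resources that state they are a format of `general`."""
    return sorted(s for s, p, o in triples if p == IS_FORMAT_OF and o == general)


def sibling_formats(resource, triples):
    """Return other formats that share a general resource with `resource`
    (roughly what 'commonEndeavour' was meant to express)."""
    generals = {o for s, p, o in triples if s == resource and p == IS_FORMAT_OF}
    return sorted(
        s
        for s, p, o in triples
        if p == IS_FORMAT_OF and o in generals and s != resource
    )


if __name__ == "__main__":
    print(formats_of(GENERAL, triples))
    print(sibling_formats(PDF, triples))
```

The point of the sketch is only that a single property covers both uses: following the links gives the formats of a general resource, and grouping by the shared target gives the related formats, without requiring a separate abstract class.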
Regarding the necessity of an abstract class, I don't think it is a strict requirement for this pattern. The notion of variable generalization is already present in the fact that we don't describe one single item/copy even at the specific format level. That is, even a "manifestation" has the extent of a group, and thus we can relate that to a broader group representing the union of manifestations (i.e. the "expression" level), without needing to separate the classes.

This notion seems very much present in the Product type as well, where it's up to the user of the vocabulary to determine the level of specificity for the subject described. Granted, there are additional specializations in IndividualProduct and ProductModel, but both derive from the general Product class. Thus, there is no principal divide. (In fact, if a case were made that there is, that would seem to be an argument for the applicability of WEMI in schema.org.)

(The choice of which other properties (e.g. author, illustrator, subjects, publisher) should be specific to a certain format/representation is then up to the data publisher (cataloguer and/or library system) to determine. You might do selective copying of properties, or link to a prototypical uniform work. Consumers/services wishing to index data for each specific format in their entirety can also copy in the general properties from the general form. Others may choose to traverse graphs, or even create unions of related formats altogether into a mixed form described with all properties.)

Cheers,
Niklas

[1]: http://lists.w3.org/Archives/Public/public-schemabibex/2013Mar/0077.html
[2]: http://grammar.ccc.commnet.edu/grammar/composition/abstract.htm
[3]: http://purl.org/dc/terms/
[4]: http://purl.org/dc/terms/isFormatOf
[5]: http://www.legislation.gov.uk/developer/formats/rdf
[6]: http://www.w3.org/wiki/WebSchemas/Datasets

--
Niklas Lindström
National Library of Sweden (KB)
Received on Tuesday, 2 July 2013 21:55:07 UTC