Re: Using dc:isFormatOf to unify instanceOf and commonEndeavor with existing practise from Niklas Lindström on 2013-07-03 (public-schemabibex@w3.org from July 2013)

From: Niklas Lindström <lindstream@gmail.com>
Date: Wed, 3 Jul 2013 23:33:15 +0200
To: Antoine Isaac <aisaac@few.vu.nl>
Cc: "public-schemabibex@w3.org" <public-schemabibex@w3.org>
Message-ID: <CADjV5jdPbHQSd0AWfGKU0VmktZ1SL-4AQV4dAgkB2f6NcOoZeQ@mail.gmail.com>
Hi Antoine,

On Wed, Jul 3, 2013 at 9:44 PM, Antoine Isaac <aisaac@few.vu.nl> wrote:

> Hi Niklas,
>
> On 7/3/13 8:47 PM, Niklas Lindström wrote:
>
>> Finally, I believe that many participants in the "DCMI Schema.org
>> Alignment Task Group" [7] are also members of this group, correct? From my
>> point of view, we should be doing the same thing, more or less. If this
>> other group is still active, I wonder if we could merge our efforts?
>>
> (disclaimer: I have been / am a member of the Task Group)
>
> Please no. Or at least not in this configuration. This task group ran
> quite out of steam because it consists of people that are busy. It is not
> by inviting them in discussions on the alignment between DC and FRBR that
> we will help it!
>
> Two other non-organizational reasons:
> - the existing alignment almost got it all. We've mapped our Europeana
> Data Model (which relies a lot on DC) to schema.org, and we haven't found
> many (if any) DC properties that were left unmapped by [7].
> - introducing properties into schema.org is largely a class-based
> endeavour, because of the class-centric approach of schema.org. Though
> useful, an initiative like [7] is likely to percolate into DC when other
> initiatives propose to include in schema.org classes that 'use' DC
> properties for the description of their instance.
>
> That being said, if in the course of our efforts we find mappings that
> were left out by [7], it is of course relevant to flag them there...
>

Not to worry (I suspected that the group had lost steam). I was thinking
more along the lines of this group having more momentum to carry the
remnants of DC mapping work forward. For one, have these definitions been
included in either the DC terms vocabulary or the schema.org vocabulary (as
RDF/OWL statements, possibly linked to via an rdfs:seeAlso)? That would
make it definitive that the alignment has been approved by either
maintainer. (Compare this with GoodRelations, where there are at least
partial mappings in place.)
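
To make concrete what I mean by shipping the alignment as RDF statements,
here is a minimal sketch in plain Python. Note the assumptions: the
'schema:isFormatOf' term is only the name proposed in this thread, not a
published schema.org property, and the example.org URL is a placeholder for
wherever an alignment document would live. Nothing here is taken from the
actual DC or schema.org definitions.

```python
# Illustrative only: the schema.org term and the alignment document URL
# below are assumptions, not published definitions.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DCTERMS = "http://purl.org/dc/terms/"
SCHEMA = "http://schema.org/"

# Hypothetical alignment statements as (subject, predicate, object) IRIs.
ALIGNMENT = [
    (SCHEMA + "isFormatOf", RDFS + "subPropertyOf", DCTERMS + "isFormatOf"),
    (SCHEMA + "isFormatOf", RDFS + "seeAlso",
     "http://example.org/dc-schema-alignment"),
]


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples, one statement per line."""
    return "\n".join("<%s> <%s> <%s> ." % t for t in triples)


print(to_ntriples(ALIGNMENT))
```

Statements like these, published by either maintainer, are what would make
the alignment definitive rather than a wiki note.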

Furthermore, as I said, I don't expect us to propose anything near a strict
WEMI abstraction. But I claim that there are pressing use cases that can be
addressed with plain DC terms, and I implore us to consider them seriously,
instead of inventing new stuff or, for that matter, waiting for BIBFRAME to
stabilize a new kind of WI abstraction, which is already under dispute.

I believe that we should ensure that any new properties introduced relating
to the field of bibliographic descriptions are, if applicable, derived from
the existing DC terms set. Now, if there are other members in the
DCMI/Schema.org Alignment TG not present here who might find that
valuable, I think we should make it clear that we could do this. That is,
if we ourselves think it is valuable. You know I do. ;)

Cheers,
Niklas



> Best,
>
> Antoine
>
>
>
>  Hi Karen,
>>
>> Thanks! Yes, that is an important point – 'commonEndeavor' is less
>> specific than 'isFormatOf'. However, Dublin Core also defines
>> 'dcterms:isVersionOf' [1], as:
>>
>>      A related resource of which the described resource is a version,
>> edition, or adaptation.
>>
>>      Changes in version imply substantive changes in content rather than
>> differences in format.
>>
>> Combined, I believe they can cover the gamut of use cases. And the good
>> thing with that is that this specificity, which is already well-known
>> thanks to the widespread use of DC, makes the resulting descriptions more
>> actionable, in my opinion. Thinking of queries, and user interfaces
>> thereof, it seems to me that it is rather common to differentiate between
>> either various versions of a work, or various formats (representations)
>> thereof.
>> (Of course, provided that the statement about what constitutes a version or
>> format is made by an informed party with knowledge about the work in
>> question.)
>>
>> So to cover (more of) what we're after, the 'isFormatOf' suggestion
>> should be combined with 'isVersionOf', to enable differentiation on these
>> aspects.
>>
>> As for precedence, 'isVersionOf' does imply that. Not so much with
>> 'isFormatOf', although it might be more apt for relating something specific
>> to something general. It is not required though, as long as the substantial
>> content is the same. I don't know enough about use cases where a generic
>> relation between things that are based on a common endeavor is suitable, so
>> I cannot readily speak for the aptness of these properties there. Relating
>> a book and a movie in general, without stating that one is derived from the
>> other (which I'd gather is the common case), seems to imply that they are
>> about the same topic. For relating to what they are about, we have
>> 'schema:about' (with say a link to a DBPedia resource for the topic at
>> hand). For stating that a movie is an adaptation of a book, I believe
>> 'dcterms:source' [2] is applicable. (More so than 'isVersionOf' since the
>> change is substantive in format as well. In any case, I do not think they
>> represent the same creative work, and therefore do not constitute
>> alternate formats of each other.)
>>
>> As for relating e.g. a movie and a book together without specifying that
>> relation further, 'dcterms:relation' [3] might be fine, since that is very
>> general indeed. Both 'dcterms:isFormatOf' and 'dcterms:isVersionOf' are
>> subproperties of that one. If you want to group a bunch of related works
>> together, that would apply. Another interesting DC term is
>> 'dcterms:references' [4].
>>
>> (Regarding 'dcterms:source', if that were to be incorporated into
>> schema.org, it'd have to be renamed, since that one's already defined in
>> [5] – as "The anatomical or organ system that the artery originates from",
>> of all things.. I'd suggest 'isBasedOn', as a corresponding ObjectProperty
>> to the already defined 'schema:isBasedOnUrl', which is rather poor..)
>>
>>
>> As mentioned, I think we should seriously consider all of these, since
>> Dublin Core has been successfully deployed for many years. And even if we
>> end up with modified versions of them, and/or something in between (such as
>> 'commonEndeavor'), I think it is very valuable to be able to state, as
>> succinctly as possible, how our resulting proposals relate to the DC terms.
>> (And by succinctly, I personally mean using RDFS/OWL.)
>>
>> On that topic, a related, interesting read is the "Dublin Core to PROV
>> Mapping" WG Note [6], which explains in-depth correlations between the W3C
>> PROV (provenance) vocabulary and the above mentioned DC terms (among
>> others).
>>
>> Finally, I believe that many participants in the "DCMI Schema.org
>> Alignment Task Group" [7] are also members of this group, correct? From my
>> point of view, we should be doing the same thing, more or less. If this
>> other group is still active, I wonder if we could merge our efforts?
>>
>> Cheers,
>> Niklas
>>
>> [1]: http://purl.org/dc/terms/isVersionOf
>> [2]: http://purl.org/dc/terms/source
>> [3]: http://purl.org/dc/terms/relation
>> [4]: http://purl.org/dc/terms/references
>> [5]: http://schema.org/Artery
>> [6]: http://www.w3.org/TR/prov-dc/
>> [7]: http://wiki.dublincore.org/index.php/Schema.org_Alignment
>>
>>
>>
>> On Wed, Jul 3, 2013 at 12:36 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>>
>>     Niklas, while your proposal is very well thought-out, let me clarify
>> that "commonEndeavor" is not the same as dc:isFormatOf. commonEndeavor can
>> be between things of the same format, such as the same text published at
>> different times (e.g. the many re-publications of various classics).
>> Another possibility is to use it to link translations to each other (even
>> though none are the original). It can ALSO refer to the same story or theme
>> in different formats. In other words, commonEndeavor is broader than
>> dc:isFormatOf, and can be used for anything where you think the
>> intellectual font was the same.
>>
>>     What I like about commonEndeavor is that it does not assume a
>> precedence. I realize that in fact saying that MovieA dc:isFormatOf BookB
>> does not state that the book came first, but I think it is easy to interpret
>> it that way. commonEndeavor is designed to be a horizontal relationship
>> that says, in effect: the two of these have significant intellectual
>> content in common.
>>
>>     None of this is to say that we should NOT also consider dc:isFormatOf
>> if we consider it useful. It's just that it has a different meaning to
>> commonEndeavor.
>>
>>     I agree with your analysis that it is best to avoid the necessity of
>> an abstract class. I think that it is better to create relationships
>> between things that people generally recognize as "things" than to try to
>> decide what is abstract and what is concrete. Both commonEndeavor and
>> dc:isFormatOf avoid the need for an abstract class.
>>
>>     kc
>>
>>
>>     On 7/2/13 2:54 PM, Niklas Lindström wrote:
>>
>>         Hi all,
>>
>>         I'm sure we can find a way of reconciling our current, varied,
>>         positions regarding instanceOf and commonEndeavour. The
>>         differences seem to relate to various MARC and FRBR experiences,
>>         or at least reactions to the different rigors, or indeed lack
>>         thereof, that have come out of these. I also think we can avoid
>>         inventing too much new thinking on these matters, since there is
>>         lots to reuse from the domain of resource descriptions.
>>
>>         Model-wise, I think many of us (perhaps all) agree that the
>>         restrictive abstraction principles of WEMI separation, and the
>>         simpler but still divided Work/Instance counterpart evolving in
>>         BIBFRAME, are too rigid for schema.org (and possibly for
>>         cataloguing in general). However, many also see a great benefit
>>         in relating to common generalized notions of works from their
>>         specific forms, at least when a work is available in a multitude
>>         of formats. This notion is very useful both for facilitating
>>         cataloguers' workflow and for building services upon the data. As
>>         I mentioned a while ago [1], this is not so much about
>>         *abstraction* but *generalization* (also elaborated on in [2]).
>>         So let us neither prescribe nor prohibit.
>>
>>         We should carefully consider what has been done in the wild, and
>>         especially practices stemming from, but not limited to,
>>         libraries. A good example of this is the Dublin Core Terms
>>         vocabulary [3], which has been heavily used for many years now,
>>         in lots of linked data scenarios. It is used in and recommended
>>         by many W3C specs (e.g. SKOS, VoID and PROV) and is the base for
>>         many community vocabularies, such as BIBO. Its terms are used in
>>         lots, if not most, of the datasets in the Linked Open Data cloud.
>>         If there is any stable core in the plethora of bibliographic
>>         vocabularies, I'd say DC terms is it. And it gets by with
>>         (probably because of) quite a minimal specification (just like
>>         schema.org).
>>
>>
>>         I therefore suggest that we consider 'isFormatOf', explicitly
>>         based on 'dcterms:isFormatOf' [4], as a replacement for, or
>>         indeed a unification of, both the 'instanceOf' and
>>         'commonEndeavour' proposals (and possibly content/carrier).
>>
>>         Dublin Core defines 'isFormatOf' as:
>>
>>               A related resource that is substantially the same as the
>>               described resource, but in another format.
>>
>>         The things being in specific formats are representations,
>>         manifestations or instances. This property can be used both to
>>         relate between different formats (similar to 'commonEndeavour'),
>>         and for linking from a specific format to a generalized notion,
>>         such as an expression or work (similar to 'instanceOf'). The
>>         latter use of 'dcterms:isFormatOf' is quite common, using the
>>         pattern of linking different representations (e.g. in HTML or PDF
>>         for digital representations, or hardcover or paperback for
>>         physical books) to a general resource which they represent.
>>         Examples of this can be seen e.g. in legislation.gov.uk [5]. (For
>>         specifying the kind of format, schema:bookFormat is applicable,
>>         as is the more general dcterms:format property).
>>
>>         As for the actual name, we could include 'isFormatOf' as is (and
>>         possibly its inverse, 'hasFormat'). Or we could relabel it
>>         somehow. The name is important, but only instrumentally so. The
>>         most important thing is to find a common meaning, and to do so we
>>         should base it on existing usage.
>>
>>         (I'd also like to note that the solid proposal we do have on the
>>         table, 'hasPart'/'isPartOf', correlates very much to the existing
>>         Dublin Core properties of the same name (as has also been
>>         discussed). I do think we should mention that in the wiki page. I
>>         can address that unless anyone objects, following the pattern of
>>         the Datasets proposal [6]. In fact, if we can find a common
>>         ground in (at least parts of) the Dublin Core terms, we can also
>>         continue to import some other terms, such as 'isVersionOf',
>>         'references' and 'source', if needed.)
>>
>>         Regarding the necessity of an abstract class, I don't think it is
>>         a strict requirement for this pattern. The notion of variable
>>         generalization is already present in the fact that we don't
>>         describe one single item/copy even at the specific format level.
>>         That is, even a "manifestation" has the extent of a group, and
>>         thus we can relate that to a broader group representing the union
>>         of manifestations (i.e. the "expression" level), without needing
>>         to separate the classes. This notion seems very much present in
>>         the Product type as well, where it's up to the user of the
>>         vocabulary to determine the level of specificity for the subject
>>         described. Granted, there are additional specializations in
>>         IndividualProduct and ProductModel, but both derive from the
>>         general Product class. Thus, there is no principal divide. (In
>>         fact, if a case was made that there is, that would seem to be an
>>         argument for the applicability of WEMI in schema.org..)
>>
>>         (The choice of which other properties (e.g. author, illustrator,
>>         subjects, publisher) should be specific to a certain
>>         format/representation is then up to the data publisher
>>         (cataloguer and/or library system) to determine. You might do
>>         selective copying of properties, or link to a prototypical
>>         uniform work. Consumers/services wishing to index data for each
>>         specific format in their entirety can also copy in the general
>>         properties from the general form. Others may choose to traverse
>>         graphs, or even create unions of related formats altogether into
>>         a mixed form described with all properties.)
>>
>>         Cheers,
>>         Niklas
>>
>>         [1]: http://lists.w3.org/Archives/Public/public-schemabibex/2013Mar/0077.html
>>         [2]: http://grammar.ccc.commnet.edu/grammar/composition/abstract.htm
>>         [3]: http://purl.org/dc/terms/
>>         [4]: http://purl.org/dc/terms/isFormatOf
>>         [5]: http://www.legislation.gov.uk/developer/formats/rdf
>>         [6]: http://www.w3.org/wiki/WebSchemas/Datasets
>>
>>
>>         --
>>         Niklas Lindström
>>         National Library of Sweden (KB)
>>
>>
>>     --
>>     Karen Coyle
>>     kcoyle@kcoyle.net http://kcoyle.net
>>     ph: 1-510-540-7596
>>     m: 1-510-435-8234
>>     skype: kcoylenet
>>
>>
>>
>
>
Received on Wednesday, 3 July 2013 21:34:14 UTC