Re: Works and instances from Richard Wallis on 2013-01-11 (public-schemabibex@w3.org from January 2013)

From: Richard Wallis <richard.wallis@oclc.org>
Date: Fri, 11 Jan 2013 16:24:45 +0000
To: Niklas Lindström <lindstream@gmail.com>, Sean Fraser <sean@theatre-optique.com>
CC: Karen Coyle <kcoyle@kcoyle.net>, "public-schemabibex@w3.org" <public-schemabibex@w3.org>
Message-ID: <CD15ED4D.4C19%richard.wallis@oclc.org>
Hi Niklas,

I see the mapping exercise as a useful one to guide implementers as to where
they should pull data, currently in Dublin Core, MARC, RDA, ONIX, etc., to
populate a schema.org description of their resources.  I welcome your offer
to work on the mapping.

At a high level this will be comparatively simple - you would populate
schema:name from dc:title, for instance.  But other areas may not have such
one-to-one relationships.  For example, schema:about could be populated from
many places in MARC.
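
As a minimal sketch of such a crosswalk in Python: the table contents, the
"marc:NNN" key notation, the sample record and the to_schema helper are all
illustrative assumptions, not an agreed mapping. The MARC tags shown are the
common subject added-entry fields; a real mapping would need to cover more.

```python
# Hypothetical crosswalk: each schema.org property lists the source
# fields it may be populated from. Illustrative only, not an agreed mapping.
CROSSWALK = {
    # Simple one-to-one case.
    "schema:name": ["dc:title"],
    # One target property fed from several MARC subject fields.
    "schema:about": ["marc:600", "marc:610", "marc:650", "marc:651"],
}


def to_schema(record):
    """Build a schema.org description from a dict of source field -> values."""
    description = {}
    for schema_property, source_fields in CROSSWALK.items():
        values = []
        for field in source_fields:
            for value in record.get(field, []):
                if value not in values:  # skip a value already seen in another field
                    values.append(value)
        if values:
            description[schema_property] = values
    return description


# Made-up sample record, keyed by source field.
record = {
    "dc:title": ["The Hunting of the Snark"],
    "marc:650": ["Nonsense verse"],
    "marc:600": ["Carroll, Lewis"],
}
description = to_schema(record)
```

Keeping the mapping in a table rather than in code makes it easy to review
the one-to-one and many-to-one cases side by side.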

You reference the extension capabilities that come from RDFa 1.1 - we must
remember that the objective of this group is to recommend extensions to the
schema.org vocabulary itself - 'a set of types & properties that the search
engines will commit to recognising over time'.  Any extension/expansion that
takes advantage of features in RDFa 1.1, but that is not included in the
Schema.org vocabulary, will be invisible to the search engines and others
that follow their vocabulary definition.

Schema, even when extended by our proposals, will not be detailed enough to
replace the full scope of current bib standards, or emerging ones such as
BIBFRAME.  

Its role is to supplement those in a way that we can share our high-level
detail with the wider world to make it more discoverable.

I would fully expect, and recommend, that bibliographic metadata publishers
publish both schema and their preferred domain specific standard encoded
data mixed together.  In this way they will gain the dual benefits of
providing generic (understood by the search engines) discoverable metadata
from the same place as more detailed metadata for others in our domain to
consume.  It will be up to the individual consumers as to which and how much
they extract and use from what is published.
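
To illustrate the consumer side, here is a rough Python sketch, assuming a
description is held as (subject, property, value) triples with prefixed
names. The "lib:" prefix and its properties, the example.org URI and the
select helper are hypothetical, chosen only to show schema and
domain-specific statements published side by side.

```python
# One published description mixing schema.org terms with a hypothetical
# domain-specific vocabulary under the "lib:" prefix.
TRIPLES = [
    ("http://example.org/book/1", "schema:name", "The Hunting of the Snark"),
    ("http://example.org/book/1", "schema:author", "Lewis Carroll"),
    ("http://example.org/book/1", "lib:edition", "First edition"),
    ("http://example.org/book/1", "lib:printing", "Second printing"),
]


def select(triples, understood_prefixes):
    """Keep only the statements whose property prefix a consumer understands."""
    return [
        triple for triple in triples
        if triple[1].split(":", 1)[0] in understood_prefixes
    ]


# A consumer that follows only the schema.org vocabulary.
search_engine_view = select(TRIPLES, {"schema"})
# A library-domain consumer that understands both vocabularies.
library_view = select(TRIPLES, {"schema", "lib"})
```

Each consumer reads the same published data and simply ignores the
statements it does not understand.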

~Richard.



On 11/01/2013 15:44, "Niklas Lindström" <lindstream@gmail.com> wrote:

> Hi all,
> 
> On Fri, Jan 11, 2013 at 2:38 AM, Sean Fraser <sean@theatre-optique.com> wrote:
>> Richard,
>> 
>> Thank you for the explanation.
>> 
>> How will the proposed metadata map to Dublin Core (and, Facebook) data? or,
>> does it even need to map?
> 
> The mapping of these extensions to Dublin Core, and reasonably one or
> more of BIBO [1], the full set of RDA vocabularies [2] and/or the
> BibFrame work-in-progress [3] is something I see as very valuable;
> even crucial to interoperability. I would very much like to work on
> this, and I hope that this group see such mappings as an important
> auxiliary goal of ours.
> 
> Specifically, I'd like to highlight their common ground, rather than
> trying to make everything mappable. I know you are very aware of the
> conceptual differences in the various models of these initiatives
> (e.g. regarding distinct WEMI entities), but also that there are many
> overlaps. I believe these can and should be captured using RDFS/OWL,
> so that it is possible to view data expressed in either one of them,
> at least in part, from the perspective of the others. (There is also
> definitive potential in incorporating the outcome of this specific
> instance/commonEndeavor discussion not only in schema.org, but also in
> e.g. BIBO.)
> 
> This is about doing the same thing for vocabularies
> (schemas/ontologies) as Richard aptly described regarding things
> acting as hubs for related things. Well-linked vocabularies provide a
> means for understanding common aspects even about very specific data,
> as long as there is a super/sub-property relation and/or conceptual
> super/sub-sets of entities (defined by using one or more rdf:type
> relations to Classes representing such sets).
> 
> (While this meta-feature of RDF is sometimes construed as quite
> academic or limited to production/expert systems in enterprises, it
> doesn't have to be. It also scales down to simple vocabulary
> expansion, available for RDFa 1.1 [4]. This can be used e.g. for
> expanding imported data, and picking out the descriptions understood
> by local indexes/services.)
> 
> These explicit mappings will also define what parts of the existing
> bibliographical concepts and terminology serve as the foundation for,
> and limits of, schema-bibex. It should be straightforward how to add
> flavor and details (such as various library-specific identifiers,
> enumerations of subjects, genres, media formats and so on) by
> combining this limited common ground with the more organically grown
> classes and properties of the mapped vocabularies and taxonomies
> (which I am sure will continue to prosper alongside it). That is, we
> should strive for explicit coherence between the various vocabularies,
> to make their combination pretty much seamless.
> 
> Cheers,
> Niklas
> 
> [1]: http://bibliontology.com/
> [2]: http://rdvocab.info/
> [3]: http://www.loc.gov/marc/transition/news/bibframe-112312.html
> [4]: http://www.w3.org/TR/rdfa-core/#vocabulary-expansion
> 
> 
>> Sean
>> 
>> 
>> 
>> 
>> ----- Original Message -----
>> From:
>> "Richard Wallis" <richard.wallis@oclc.org>
>> 
>> To:
>> "Sean Fraser" <sean@theatre-optique.com>, "Karen Coyle" <kcoyle@kcoyle.net>,
>> <public-schemabibex@w3.org>
>> Cc:
>> 
>> Sent:
>> Thu, 10 Jan 2013 20:41:45 +0000
>> 
>> Subject:
>> Re: Works and instances
>> 
>> 
>> Sean,
>> 
>> On 10/01/2013 18:45, "Sean Fraser" <sean@theatre-optique.com> wrote:
>> 
>> Richard,
>> 
>> "I contend that the majority of people start their journeys in the search
>> engines at that work level before drilling in to find what they are looking
>> for in terms of format, availability, etc. By not linking/identifying our
>> bibliographic resources at that level we are failing to get those resources
>> under the noses of the people looking for them.    To that end, I believe it
>> is worth striving towards providing the metadata capability to describe that
>> relationship, if the [data] publisher is aware of it."
>> 
>> I mostly agree with the above paragraph but search engines may not need
>> "metadata capability": they parse all of the content on a page. For example,
>> most people will search "The Hunting of the Snark" simply to find a copy to
>> purchase or read online. A very few will want to find the First Edition,
>> Second printing. Or, a person may want a limited A-Z edition copy. Search
>> engines will display those particular copies from information presented on
>> the webpage. I don't know how beneficial "metadata" would be for these
>> works.
>> 
>> The purpose of Schema.org is to provide structured metadata so that the
>> search engines can identify the 'things' that the content of the pages are
>> describing - as against having to mine the text on the page to infer what it
>> might be about.  (In reality they will be doing both).
>> 
>> However as the structured data will be associated with an identifier and
>> those data can contain links to other identifiers Google (and the others)
>> can start to build a knowledge graph of connected things.  Something very
>> difficult to do by just trying to infer meaning from interconnected pages.  The
>> background to this can be found in this Google blog post:
>> <http://googleblog.blogspot.co.uk/2012/05/introducing-knowledge-graph-things-not.html>
>> 
>> So where in this [bibliographic] model does the benefit of identifying works
>> come in....
>> 
>> When the metadata for many Books indicate that they are instances of a
>> CreativeWork by providing links to it, the search engine can infer that the
>> CreativeWork is a data-hub providing links to many, possibly differing in
>> format/location/availability, individual instances.  It is highly likely that
>> the search engine will direct a user to that hub.
>> 
>> To see this effect, search (in an English language version of Google) for
>> Jupiter and take a look at the Knowledge Graph panel (top right) where
>> structured data is displayed.  The hub effect is demonstrated in two ways -
>> the Jupiter reference is acting as a hub for the list of moons - the link to
>> the Sun is a link to the hub for Jupiter and its sister planets. Don't be
>> misled by the 'People also searched for' label - this panel is built from
>> structured data about things and relationships which has been created from
>> their Freebase product, supplemented by harvested Schema.org data, and tuned
>> by search log analysis.
>> 
>> OK, there is a difference between Work/Instance and planet/star
>> relationships, but the principle is still the same.
>> 
>> ~Richard.
>> 
>> 
>> 
>> 
>> 
>> Sean
>> 
>> 
>> 
>> ----- Original Message -----
>> From:
>> "Richard Wallis" <richard.wallis@oclc.org>
>> 
>> To:
>> "Karen Coyle" <kcoyle@kcoyle.net>, <public-schemabibex@w3.org>
>> Cc:
>> 
>> Sent:
>> Mon, 07 Jan 2013 11:42:46 +0000
>> Subject:
>> Re: Works and instances
>> 
>> 
>> Karen,
>> 
>> I have much sympathy with your thoughts on the loose hooking together of
>> CreativeWorks with similar content - especially recognising that the
>> similarity is in the mind of those doing the hooking.  versionOf could be a
>> good replacement for instanceOf here.
>> 
>> Being immersed in the graph based world of linked data, I try to avoid (not
>> always with much success ;-) the use of structured hierarchical terms such
>> as vertical & horizontal, as they tend to precondition thinking.  versionOf
>> again may be useful here.
>> 
>> Having said all that, you only have to listen in on general conversation
>> between your colleagues, friends and relatives to realise that we all have
>> an implicit understanding of the concept of a creative work and [what us
>> frbr exposed folks would label] expressions and manifestations of that work
>> (without using those labels).
>> 
>> I contend that the majority of people start their journeys in the search
>> engines at that work level before drilling in to find what they are looking
>> for in terms of format, availability, etc. By not linking/identifying our
>> bibliographic resources at that level we are failing to get those resources
>> under the noses of the people looking for them.    To that end, I believe it
>> is worth striving towards providing the metadata capability to describe that
>> relationship, if the [data] publisher is aware of it.
>> 
>> ~Richard.
>> 
>> 
>> On 06/01/2013 22:53, "Karen Coyle" <kcoyle@kcoyle.net> wrote:
>> 
>> Richard, I would be more comfortable with a relationship property that
>> did not presume a hierarchy - that is, did not presume that one
>> description is subordinate ("instance of") another. My gut feeling is
>> that we will have many descriptions at various levels of detail, but
>> that there is no universal ordering that resembles WEMI.  Still, people
>> will want to say that *this* is similar to/another version of *that*. So
>> we'll have lots of citations of Tom Sawyer, most of which will include
>> publisher information, and people will want to hook them together. And
>> we might have movies and ebooks and audio books and various other things
>> that also have similar content. That "hooking together" constitutes a
>> Work in the minds of the "hookers" (:-)). But there may be no "Work"
>> description in the FRBR sense to point to. So I would prefer a
>> horizontal relationship property to a vertical one. Or, in fact, I would
>> prefer a property that allows people to make the relationship without
>> having to think any more about the relationship than "these are kind of
>> the same content."
>> 
>> And, no, I don't know what to call it. "versionOf" comes to mind, but is
>> not entirely satisfactory.
>> 
>> kc
>> 
>> On 1/6/13 1:35 PM, Richard Wallis wrote:
>>> Hi Karen,
>>> 
>>> The key points I pick out of your well reasoned email are that there is no
>>> accepted definition of "workness", yet [it] would make sense to many
>>> people.
>>> 
>>> Schema already includes a CreativeWork - it is an issue already being
>>> addressed by the wider community.  If we (the community who have probably
>>> spent more time, effort, scholarly article pages, and conference
>>> sessions on the topic than any other) cannot help improve the approach, we
>>> will be missing a massive opportunity.
>>> 
>>> Dare I suggest it would be too easy to over-think this, and put it onto the
>>> 'too difficult' pile.
>>> 
>>> Both Painting & Sculpture are sub-types of CreativeWork.
>>> 
>>> I agree that schema:manuscript is an omission and is something that should
>>> be discussed further (under the heading of content vs carrier ?).
>>> 
>>> Back to 'instanceOf' and 'instance', I am not totally happy that they are
>>> the best property names (too much baggage inherited from other disciplines),
>>> but I have failed to come up with anything better.
>>> 
>>> In my view schema:CreativeWork is aligned with frbr:Work as well as
>>> frbr:Expression, frbr:Manifestation, and probably frbr:Item - they all could
>>> be considered to be CreativeWork descriptions of more or less abstractness.
>>> If my assumption is a working one, an expression could be described (in
>>> Schema terms) as the instanceOf a Work as well as having an instance (the
>>> manifestation).
>>> 
>>> Sorry for my slightly rambling response - it's a bit late in the evening here
>>> ;-)
>>> 
>>> ~Richard.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 06/01/2013 20:08, "Karen Coyle" <kcoyle@kcoyle.net> wrote:
>>> 
>>>> I have been attempting for a while to respond to the definition of
>>>> properties relating works and instances. The problem may be that I have
>>>> been reading (too much?) about the work concept lately, and so I try to
>>>> cover too much ground.
>>>> 
>>>> (Aside: recommended reading on the library concept of Work: Martha Yee's
>>>> four part series "What is a work?" [1] It is a relatively easy read,
>>>> there are examples, and the first part gives excellent historic
>>>> background.)
>>>> 
>>>> I will try to simplify with only a few comments:
>>>> 
>>>> 1) "instanceOf" between two schema:CreativeWork descriptions would only
>>>> be meaningful under certain conditions (e.g. one describes a work in the
>>>> abstract only), conditions which I consider to be (at this point in
>>>> time) unlikely to occur. Point 2 is one of the reasons for this opinion.
>>>> 
>>>> 2) There is no accepted definition of "workness" even within the LAM
>>>> environment. cf. FRBRoo,[2] ISTC,[3] FaBIO, [4], not to mention BIBFRAME
>>>> [5], all of which differ from each other and from the description on
>>>> this group's wiki. (cf the example on the wiki, of 2 books and a movie,
>>>> is not aligned with FRBR:Work, but would make sense to many people).
>>>> 
>>>> 3) It isn't clear to me whether works will be things (with identifiers),
>>>> post-description clusters (with or without IDs, à la VIAF), or
>>>> relationships between bibliographic descriptions (e.g. "sameWork"
>>>> between two schema:Book descriptions).
>>>> 
>>>> 4) The term "instance" for a mass-produced product is not helpful. It
>>>> could be applied to "singularities" like works of art, but not for
>>>> products. schema:creativeWork may describe both products and
>>>> singularities, without distinguishing which it is. Most schema:Book
>>>> descriptions will be manufactured products, but note that there is no
>>>> schema:manuscript. (schema:Painting and schema:Sculpture, which should
>>>> describe singularities, appear to be place-holders since they do not
>>>> extend schema:creativeWork.)
>>>> 
>>>> Beyond this, it gets even more complex, and I do not believe that we can
>>>> resolve this at this time. My recommendation is that it is premature to
>>>> introduce this concept into schema.org. There are other relationships,
>>>> in particular the part/whole relationship that Richard also has included
>>>> on the wiki, that are more useful. We should concentrate on those.
>>>> 
>>>> kc
>>>> 
>>>> 
>>>> [1] Linked from http://myee.bol.ucla.edu/workspub.htm
>>>> [2] http://www.cidoc-crm.org/frbr_inro.html
>>>> [3] http://www.istc-international.org/html/
>>>> [4] http://www.essepuntato.it/lode/http:/purl.org/spar/fabio
>>>> [5] http://www.loc.gov/marc/transition/news/bibframe-112312.html
>>> 
>>> 
>>> 
>> 
>> --
>> Karen Coyle
>> kcoyle@kcoyle.net http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
>> 
>> 
>> 
>
Received on Friday, 11 January 2013 16:25:50 UTC