RE: Question about MARCXML to Models transformation

From: Tillett, Barbara <btil@loc.gov>
Date: Sun, 6 Mar 2011 15:44:01 -0500
To: "Young,Jeff (OR)" <jyoung@oclc.org>, Karen Coyle <kcoyle@kcoyle.net>, Thomas Baker <tbaker@tbaker.de>
CC: "gordon@gordondunsire.com" <gordon@gordondunsire.com>, "public-lld@w3.org" <public-lld@w3.org>
Message-ID: <1D525027B29706438707F336D75A279F168103727E@LCXCLMB03.LCDS.LOC.GOV>

I basically agree, but want to point out that FRBR's WEMI entities are not strictly hierarchical; they form a network graph. Don't forget the many-to-many relationships among the WEMI entities: it's not just one-to-one, one-to-many, or many-to-one.
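
A minimal Python sketch of the many-to-many point, for illustration only. The identifiers and the helper names (index_both_ways, is_strict_hierarchy) are hypothetical and not part of FRBR. It assumes one manifestation can embody two expressions while one expression is embodied in two manifestations, so the structure is a graph rather than a tree.

```python
from collections import defaultdict

# Hypothetical data: each pair is (manifestation, expression it embodies).
embodies = [
    ("manifestation-1", "expression-1"),
    ("manifestation-2", "expression-1"),  # one expression, two manifestations
    ("manifestation-2", "expression-2"),  # one manifestation, two expressions
]


def index_both_ways(pairs):
    """Build forward and reverse lookups for a many-to-many relation."""
    forward, reverse = defaultdict(set), defaultdict(set)
    for left, right in pairs:
        forward[left].add(right)
        reverse[right].add(left)
    return forward, reverse


def is_strict_hierarchy(child_to_parents):
    """True only if every child has exactly one parent (a tree-like shape)."""
    return all(len(parents) == 1 for parents in child_to_parents.values())


expressions_of, manifestations_of = index_both_ways(embodies)
print(is_strict_hierarchy(expressions_of))
```

Because manifestation-2 has two parent expressions, the check fails, which is the sense in which WEMI is a network rather than a strict hierarchy.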

Also, "relational database" does not mean that it has relationships; it means that it is based on relational algebra, with joins, unions, intersections, etc., of tables (sets of data). I'm really looking forward to breaking away from relational database models to get to something that better handles the complex graph structures of the bibliographic universe. It's probably because I'm rather fond of topological spaces and non-Euclidean geometries and see a better fit in that realm, but computer science isn't there yet. I think the Semantic Web has the potential to free us from the relational model while improving how relationships are connected and linked, but I still see current iterations as not really "there" yet. Gordon's work is a brilliant step toward demonstrating and documenting the logical relations (transitive, equivalent, etc.), cardinalities, etc. It really helps us "see" the model and note where adjustments would make it even better.

FRBR has declared certain attributes for the entities, and I completely agree that some of those could better evolve into relationships (like corporate bodies with a relationship/role of "is publisher" to a particular manifestation, rather than leaving them as attributes of a manifestation). We started to do that with RDA, but stopped short because it was too drastic a change from FRBR for this first round. I am sure it will be revisited once we have more registries like VIAF and the RDA registries that make linking and declaring relationships easier and more stable, and schemas and systems that can actually do something with such structures.

- Barbara
________________________________________
From: public-lld-request@w3.org [public-lld-request@w3.org] On Behalf Of Young,Jeff (OR) [jyoung@oclc.org]
Sent: Sunday, March 06, 2011 4:15 AM
To: Karen Coyle; Thomas Baker
Cc: gordon@gordondunsire.com; public-lld@w3.org
Subject: RE: Question about MARCXML to Models transformation

I think Karen brings some nebulous issues into focus. Sorry if my
thoughts are cryptic. I can try to clarify them if needed.

> It's rather clear that FRBR was not designed with the open world model
> in mind -- in fact, it was designed around a late 90's concept of
> relational databases.

The Semantic Web is also "relational", so that aspect doesn't bother me.
I agree that "relational databases" impose closed world assumptions, but
I'm not sure this limitation affects how designers go about their
modeling. For example, reusable OWL can be rationalized from legacy
relational databases using D2RQ:

http://www4.wiwiss.fu-berlin.de/bizer/d2rq/spec/

> It is very top-down in that XML-ish way and most
> commonly it is assumed that each of the FRBR entities will be a
> record.

FRBR in general is relational, but the WEMI classes specifically are
unquestionably hierarchical. I would agree that XML Schema warps our
thinking, but WEMI is starting to make sense to me as a hierarchy. My
complaint now is the lack of meaningful WEMI subclasses that could make
the model much easier to understand and deal with.

> I say that latter because of the fact that the WEMI entities,
> while having inter-dependencies, also have specific relationships to
> other WEMI entities (as well as to the group 2 and 3 entities). So an
> expression will have a relationship to a work and to one or more
> manifestations -- that's what I think of as a *structural*
> relationship --

I agree with this interpretation and provide these RDF examples for
illustration.
(Beware: my "frbr" namespace elements are ad hoc.)

<expression-1> a frbr:Expression ;
        frbr:isARealizationOf <work-1> ;
        frbr:isEmbodiedIn <manifestation-1> ;
        frbr:isEmbodiedIn <manifestation-2> .
<work-1> a frbr:Work .
<manifestation-1> a frbr:Manifestation .
<manifestation-2> a frbr:Manifestation .

> but it can also have bibliographic relationships to
> other expressions (like: one expression is the translation of another
> expression, or is an updated edition).

Here's what the additional triples would look like:

<expression-1>
        frbr:hasATranslation <expression-2> ;
        frbr:hasARevision <expression-3> .
<expression-2> a frbr:Expression .
<expression-3> a frbr:Expression .

> The fact is that it will be very hard to have an expression without a
> work because of the way the properties are spread across the Group 1
> entities: an expression does not have relationship to a primary
> creator (e.g. author), only a work does. Ditto subjects: only Work
> entities have the "has subject" property that links to topical
> entities.

I'm willing to go so far as believing it is *impossible* to have an
Expression without a Work because *all* conceivable Expressions have
creator and subject relationships in theory: even the fictional ones. I
think we need to bear in mind that FRBR doesn't strive to be a metadata
exchange format, it strives to be a model of common sense reality (more
or less).

> A Manifestation doesn't have a language of text; that
> belongs to the Expression. The necessary elements to describe a
> resource

Riddle: When is a resource not a resource?
Answer: When the modeler(s) declare it to be a property or set of
properties instead.

Fortunately, no modeler in history ever had the last word. :-)

> are spread across the 3 (WEM) group 1 entities, making it
> very difficult to treat them separately. To give you an idea of what
> each entity "means", here are some key attributes for each:
>
> Work
>   - work title
>   - key for a musical work
>   - coordinates for a cartographic work
>   - with relationships to
>      -- creator of the work
>      -- topics of the work (subject headings and classifications)

The terms "musical work", "cartographic work", and various other
rationalized "foo work" qualifiers imply subclasses of FRBR Work. I
think it's worth attempting.

>
> Expression
>   - language of the expression (if text)
>   - form of the expression (text, sound, image)

Likewise, "text expression", "sound expression", "image expression", and
other qualifications all imply subclasses of FRBR Expression.

> Manifestation
>   - title of the manifestation (may be different to the work title)
>   - edition
>   - publisher, date of publication
>   - physical format (size, units, other measurements)
>   - ISBN, ISSN, etc.

My feeling is that some of these "attributes" (owl:DatatypeProperty)
SHOULD be modeled as relationships/associations instead
(owl:ObjectProperty). For example, I think "publishers" should be
modeled as a frbr:CorporateBody (or a subclass thereof) and "place of
publication" should be modeled as frbr:Place. Limiting the individuals
in the CorporateBody and Place classes to known subjects of a Work
doesn't make sense in an open world model. Most real world objects can
be dumbed-down to literals when necessary.
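
A rough Python sketch of the attribute-versus-relationship distinction, under stated assumptions: triples are plain tuples, and every name here (publisherName, hasPublisher, corporate-body-1, dumb_down) is hypothetical rather than taken from any FRBR vocabulary. It shows the publisher modeled as a linked entity and then reduced back to a literal when needed.

```python
LITERAL = "literal"
RESOURCE = "resource"

# Attribute style: the publisher is only a string on the manifestation.
as_attribute = [
    ("manifestation-1", "publisherName", (LITERAL, "Random House")),
]

# Relationship style: the publisher is an entity that other data can link to.
as_relationship = [
    ("manifestation-1", "hasPublisher", (RESOURCE, "corporate-body-1")),
    ("corporate-body-1", "type", (RESOURCE, "CorporateBody")),
    ("corporate-body-1", "name", (LITERAL, "Random House")),
]


def dumb_down(triples, subject, predicate, label="name"):
    """Reduce linked entities to their literal labels."""
    labels = []
    for s, p, (kind, value) in triples:
        if s == subject and p == predicate and kind == RESOURCE:
            # Follow the link and collect the target's literal label(s).
            labels += [v for s2, p2, (k2, v) in triples
                       if s2 == value and p2 == label and k2 == LITERAL]
    return labels


print(dumb_down(as_relationship, "manifestation-1", "hasPublisher"))
```

The relationship form loses nothing: the literal can always be recovered, whereas the attribute form gives no entity to link to.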

>
> There are many more attributes, but these are the common ones and the
> ones that I think may help people understand the issue. The data
> record that libraries create today contains data elements from all of
> these entities, mixed together and usually not clearly identified as W
> or E or M. To create library data under FRBR it will be necessary to
> ALWAYS have Work+Expression+Manifestation entities. (I'm skipping Item
> in the interest of brevity, but we should assume that it is part of
> the picture.)

For better or worse it's not that simple. As Tom Baker pointed out in
another thread, ontologies aren't exchange formats, they are models in
which some entities can be inferred.

>
> Now, it would be great to investigate the inferences that one can make
> with FRBR. For example, if you say:
>
> resourceA / frbrer:hasSubject /
> http://id.loc.gov/authorities/sh85148177
>
> then the inference is that resourceA is a Work. (I believe the way to
> say this is that "hasSubject" has the domain "Work". Right, Gordon?)

FRBRer coins separate "has as subject" properties for each range class,
but as you would expect the domain is always Work.
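
As an illustration of the domain inference discussed here, a small Python sketch of the rdfs:domain rule. The property names and the DOMAINS table are simplified assumptions, not the actual FRBRer declarations (which, as noted, coin separate subject properties per range class).

```python
# Property -> domain class. Simplified, illustrative names only.
DOMAINS = {
    "hasSubject": "Work",
    "hasPublisher": "Manifestation",
}


def infer_types(triples, domains=DOMAINS):
    """rdfs:domain rule: if (s, p, o) holds and p has domain C, then s is a C."""
    inferred = {}
    for subject, predicate, _obj in triples:
        if predicate in domains:
            inferred.setdefault(subject, set()).add(domains[predicate])
    return inferred


triples = [
    ("resourceA", "hasSubject", "http://id.loc.gov/authorities/sh85148177"),
]
types = infer_types(triples)
print(types)
```

Note that the rule only adds a type to resourceA; it does not reject anything, which matches the point that a domain declaration is an inference, not a validation constraint.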

> You cannot then say:
>
> resourceA / frbrer:hasPublisher / "Random House"
>
> because *that* statement would mean that resourceA is a Manifestation,
> and Manifestation and Work are disjoint.

The FRBRer OWL doesn't currently declare Work and Expression to be
owl:disjointWith one another, but I think that was Gordon's plan. Here's
some support for your understanding:
http://www.w3.org/TR/owl2-primer/#Class_Disjointness
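
A hedged Python sketch of what a disjointness declaration would let a reasoner detect. The DISJOINT set is assumed purely for illustration (as noted above, FRBRer may not declare it yet), and find_clashes is a hypothetical helper, not an OWL API.

```python
from itertools import combinations

# Assumed for illustration only: Work and Manifestation declared disjoint.
DISJOINT = {frozenset({"Work", "Manifestation"})}


def find_clashes(types_by_resource, disjoint=DISJOINT):
    """Report every resource that carries two classes declared disjoint."""
    clashes = []
    for resource, classes in sorted(types_by_resource.items()):
        for a, b in combinations(sorted(classes), 2):
            if frozenset({a, b}) in disjoint:
                clashes.append((resource, a, b))
    return clashes


# Types as they might be inferred from hasSubject and hasPublisher statements.
inferred = {
    "resourceA": {"Work", "Manifestation"},
    "resourceB": {"Work"},
}
clashes = find_clashes(inferred)
print(clashes)
```

Without the disjointness declaration, resourceA would simply be both a Work and a Manifestation; with it, the data becomes inconsistent.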

> So in a sense you are forced
> (whether OWL forces you or not is another question), but the FRBR
> logic forces you to create a new entity for the Manifestation
> *portion* of your description. In addition, to connect the
> Manifestation to the Work (since you need the creator and subjects to
> complete your description), you may need to create an entity for the
> Expression. (RDA allows Manifestations to "Manifest" Works, but I
> think FRBR in its present state still requires M -> E -> W.)

I believe it's possible to create an inferred shortcut like this in OWL,
but it's just a convenience property.
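
A minimal Python sketch of such an inferred shortcut, composing M -> E -> W into a direct link. In OWL 2 this kind of composition is typically expressed as a property chain; here it is only imitated over tuples. The property names isAnEmbodimentOf and manifests are hypothetical, chosen to mirror the ad hoc "frbr" terms used earlier.

```python
def manifests(triples):
    """Derive (manifestation, "manifests", work) from M -> E -> W chains."""
    realizes = {}
    for s, p, o in triples:
        if p == "isARealizationOf":
            realizes.setdefault(s, set()).add(o)

    shortcuts = set()
    for s, p, o in triples:
        if p == "isAnEmbodimentOf":
            # Follow the expression on to each work it realizes.
            for work in realizes.get(o, ()):
                shortcuts.add((s, "manifests", work))
    return shortcuts


triples = [
    ("manifestation-1", "isAnEmbodimentOf", "expression-1"),
    ("expression-1", "isARealizationOf", "work-1"),
]
print(manifests(triples))
```

The derived triple adds no new information; the Expression still has to exist for the shortcut to be inferred.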

>
> This is, of course, unless I have totally missed something in the
> nature of FRBR, and if so I would love to hear that my worst fears
> about it do not come to bear.

I think you've created a useful and accurate summary. :-)

Jeff

>
> kc
>
> >
> > It relates to Dan's point that schema designers in the new
> > idiom are not actually issuing "shipping orders" for data
> > integrity in the imperative style to which they are accustomed
> > -- even if, as I suspect, they may sometimes _believe_ that
> > this is the effect of declarations such as the above.
> >
> > As Jeff has pointed out, one might conceivably use the OWL to
> > construct syntactic validators to impose such data integrity,
> > but these are necessarily over and above whatever the OWL
> > itself actually says.
> >
> > Tom
> >
> >
> >
>
>
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
>
>
Received on Sunday, 6 March 2011 20:44:46 GMT