Re: Question about MARCXML to Models transformation

Quoting Thomas Baker <tbaker@tbaker.de>:


> This brings us back to a point that arose early in our
> discussions and is perhaps worth capturing, if it is not
> already on our lists:  the difference between the closed-world
> assumptions underlying the creation of traditional library
> formats, which enforce integrity constraints on the data
> as it is expressed syntactically in record formats, versus
> RDF/OWL modeling such as above.


It's rather clear that FRBR was not designed with the open world model
in mind -- in fact, it was designed around a late-1990s concept of
relational databases. It is very top-down in that XML-ish way, and it
is most commonly assumed that each of the FRBR entities will be a
record. I say the latter because the WEMI entities, while
interdependent, also have specific relationships to other WEMI
entities (as well as to the Group 2 and 3 entities). So an
expression will have a relationship to a work and to one or more  
manifestations -- that's what I think of as a *structural*  
relationship -- but it can also have bibliographic relationships to  
other expressions (like: one expression is the translation of another  
expression, or is an updated edition).

The fact is that it will be very hard to have an expression without a
work because of the way the properties are spread across the Group 1
entities: an expression does not have a relationship to a primary
creator (e.g. author), only a work does. Ditto subjects: only Work
entities have the "has subject" property that links to topical
entities. A Manifestation doesn't have a language of text; that
belongs to the Expression. The elements needed to describe a
resource are spread across three of the Group 1 entities (W, E, M),
making it very difficult to treat them separately. To give you an idea
of what each entity "means", here are some key attributes for each:

Work
  - work title
  - key for a musical work
  - coordinates for a cartographic work
  - with relationships to
     -- creator of the work
     -- topics of the work (subject headings and classifications)

Expression
  - language of the expression (if text)
  - form of the expression (text, sound, image)

Manifestation
  - title of the manifestation (may be different to the work title)
  - edition
  - publisher, date of publication
  - physical format (size, units, other measurements)
  - ISBN, ISSN, etc.

There are many more attributes, but these are the common ones and the  
ones that I think may help people understand the issue. The data  
record that libraries create today contains data elements from all of  
these entities, mixed together and usually not clearly identified as W  
or E or M. To create library data under FRBR it will be necessary to  
ALWAYS have Work+Expression+Manifestation entities. (I'm skipping Item  
in the interest of brevity, but we should assume that it is part of  
the picture.)
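
To make the split concrete, here is a minimal sketch in Python of how
the attributes listed above fall across three linked entities. The
class names, field names, and the flat_description helper are
hypothetical and are not taken from FRBR or from any library; the
sample values are invented, apart from the subject URI and publisher
name that appear elsewhere in this message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Work:
    title: str
    # Only the Work carries the links to creators and subjects.
    creators: List[str] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)


@dataclass
class Expression:
    work: Work                      # structural link E -> W
    language: Optional[str] = None  # language of the expression (if text)
    form: Optional[str] = None      # text, sound, image


@dataclass
class Manifestation:
    expression: Expression          # structural link M -> E
    title: str                      # may differ from the work title
    publisher: Optional[str] = None
    date: Optional[str] = None
    isbn: Optional[str] = None


def flat_description(m: Manifestation) -> dict:
    """Gather the elements that today's single record mixes together."""
    e = m.expression
    w = e.work
    return {
        "title": m.title,
        "publisher": m.publisher,
        "date": m.date,
        "isbn": m.isbn,
        "language": e.language,
        "form": e.form,
        "work_title": w.title,
        "creators": list(w.creators),
        "subjects": list(w.subjects),
    }


# Invented sample data (hypothetical titles and author).
work = Work(
    title="Example Work",
    creators=["Example Author"],
    subjects=["http://id.loc.gov/authorities/sh85148177"],
)
expression = Expression(work=work, language="eng", form="text")
manifestation = Manifestation(
    expression=expression,
    title="Example Work: A Novel",
    publisher="Random House",
)
record = flat_description(manifestation)
```

In this sketch a Manifestation cannot even be constructed without an
Expression, nor an Expression without a Work, which is the dependency
described above: the creator and subjects are only reachable by
walking M -> E -> W.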

Now, it would be great to investigate the inferences that one can make  
with FRBR. For example, if you say:

resourceA / frbrer:hasSubject / http://id.loc.gov/authorities/sh85148177

then the inference is that resourceA is a Work. (I believe the way to  
say this is that "hasSubject" has the domain "Work". Right, Gordon?)  
You cannot then say:

resourceA / frbrer:hasPublisher / "Random House"

because *that* statement would mean that resourceA is a Manifestation,
and Manifestation and Work are disjoint. So in a sense you are forced
(whether OWL itself forces you is another question; the FRBR logic
certainly does) to create a new entity for the Manifestation
*portion* of your description. In addition, to connect the
Manifestation to the Work (since you need the creator and subjects to
complete your description), you may need to create an entity for the
Expression. (RDA allows Manifestations to "Manifest" Works, but I
think FRBR in its present state still requires M -> E -> W.)
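
The inference described above can be sketched in a few lines of plain
Python. This is only an illustration under stated assumptions: that
"hasSubject" has the domain Work, that "hasPublisher" has the domain
Manifestation, and that the two classes are declared disjoint, as
supposed in this message. The DOMAINS and DISJOINT tables and both
function names are hypothetical; a real OWL reasoner does far more
than this.

```python
# Assumed schema fragment: property -> domain class. Only the two
# properties used in this message are listed.
DOMAINS = {
    "frbrer:hasSubject": "Work",
    "frbrer:hasPublisher": "Manifestation",
}

# Assumed disjointness axioms, stored as unordered pairs of classes.
DISJOINT = {frozenset({"Work", "Manifestation"})}


def infer_types(triples):
    """RDFS domain rule: (s, p, o) and p has domain C implies s is a C."""
    types = {}
    for s, p, _o in triples:
        if p in DOMAINS:
            types.setdefault(s, set()).add(DOMAINS[p])
    return types


def find_clashes(types):
    """List every resource inferred to belong to two disjoint classes."""
    clashes = []
    for resource, classes in types.items():
        for pair in DISJOINT:
            if pair <= classes:
                clashes.append((resource, tuple(sorted(pair))))
    return clashes


# Both statements made about the same resource, as in the example above.
one_node = [
    ("resourceA", "frbrer:hasSubject",
     "http://id.loc.gov/authorities/sh85148177"),
    ("resourceA", "frbrer:hasPublisher", "Random House"),
]

# The same two statements, split across two entities.
two_nodes = [
    ("workA", "frbrer:hasSubject",
     "http://id.loc.gov/authorities/sh85148177"),
    ("manifestationA", "frbrer:hasPublisher", "Random House"),
]

one_node_clashes = find_clashes(infer_types(one_node))
two_node_clashes = find_clashes(infer_types(two_nodes))
```

Note that nothing here rejects the one-node data on input. The
statements are accepted, both types are inferred, and the clash only
surfaces because disjointness was declared; without that axiom
resourceA would simply be both a Work and a Manifestation.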

This is, of course, unless I have totally missed something in the
nature of FRBR, and if so I would love to hear that my worst fears
about it will not come to pass.

kc

>
> It relates to Dan's point that schema designers in the new
> idiom are not actually issuing "shipping orders" for data
> integrity in the imperative style to which they are accustomed
> -- even if, as I suspect, they may sometimes _believe_ that
> this is the effect of declarations such as the above.
>
> As Jeff has pointed out, one might conceivably use the OWL to
> construct syntactic validators to impose such data integrity,
> but these are necessarily over and above whatever the OWL
> itself actually says.
>
> Tom
>
>
>



-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet

Received on Saturday, 5 March 2011 20:28:24 UTC