Re: Non- and Partial-FRBR Metadata from William Waites on 2010-09-21 (public-lld@w3.org from September 2010)

From: William Waites <william.waites@okfn.org>
Date: Tue, 21 Sep 2010 21:47:04 +0100
To: Karen Coyle <kcoyle@kcoyle.net>
CC: Asaf Bartov <asaf.bartov@gmail.com>, public-lld@w3.org
Message-ID: <4C9919C8.6070203@okfn.org>
On 10-09-21 16:36, Karen Coyle wrote:
>
> I confess here that I am a FRBR skeptic, at least as far as the Group 1
> WEMI structure. I do believe that WEMI can be useful as guidance for
> catalogers in the decisions that they must make.

For my part, after having a few discussions and thinking it through
a bit, I'm not against WEMI. Where my problem lay (still lies) is
not in the structure but in the rules about when to create a new
Work. To me, for example, a translation is a new Work, *not* just a
new Expression.

> This goes way back in our discussion to the remarks by Jon [1] and by
> Dan [2] suggesting that Class affiliation of properties, rather than
> record structure, would be a better way to treat WEMI.

So suppose you had,

:foo a :BiblioThing ;
    :author :somebody ;
    :title "lalala" ;
    :language "sw" ;
    :isbn "1234567890123" ;
    :shelf 3 .

In order to later create a more structured view, wouldn't you have
inference rules that said something like,

{ ?x a :BiblioThing .
  ?x ?p ?o .
  ?p rdfs:subPropertyOf :WorkishProperty } =>
{ _:work a :Work . _:work ?p ?o } .

except that the inference rules would be hard to write properly,
let alone to have them fall out of a description logic, because nailing
down the scope of the existential variables is tricky, especially
when they might span different rules - i.e. how would you infer
links between Work, Expression, etc? It might be that to do this
we would have to stray beyond things that are expressible in DL
or N3 to explain how this operation is to be accomplished.
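
To make the scoping problem concrete, here is a minimal procedural
sketch in Python of what such a rule set is trying to do. It is not
a description of any existing tool. The property-to-level table, the
link names (realizedThrough, embodiedIn, exemplifiedBy) and the
function split_record are all assumptions made up for illustration.
The point is only that procedural code can say "one node per level
per record" and then link those nodes, which is exactly the part
that is awkward to state as independent rules.

```python
# Hypothetical assignment of flat properties to WEMI levels (an assumption,
# standing in for "?p rdfs:subPropertyOf :WorkishProperty" and its siblings).
LEVEL_OF = {
    "author": "Work",
    "title": "Work",
    "language": "Expression",
    "isbn": "Manifestation",
    "shelf": "Item",
}

# Hypothetical link names between adjacent levels.
LINKS = [
    ("Work", "realizedThrough", "Expression"),
    ("Expression", "embodiedIn", "Manifestation"),
    ("Manifestation", "exemplifiedBy", "Item"),
]


def split_record(subject, triples):
    """Split one flat record into WEMI triples.

    Exactly one blank node is minted per level for the given subject,
    so the scope of each "existential" is fixed by construction.
    """
    nodes = {}
    out = []
    for s, p, o in triples:
        if s != subject or p not in LEVEL_OF:
            continue  # skip other subjects and unclassified properties
        level = LEVEL_OF[p]
        if level not in nodes:
            nodes[level] = f"_:{subject}_{level.lower()}"
            out.append((nodes[level], "a", level))
        out.append((nodes[level], p, o))
    # Link adjacent levels, but only when both ends were actually created.
    for upper, link, lower in LINKS:
        if upper in nodes and lower in nodes:
            out.append((nodes[upper], link, nodes[lower]))
    return out


flat = [
    ("foo", "a", "BiblioThing"),
    ("foo", "author", "somebody"),
    ("foo", "title", "lalala"),
    ("foo", "language", "sw"),
    ("foo", "isbn", "1234567890123"),
    ("foo", "shelf", 3),
]
wemi = split_record("foo", flat)
```

Note that the linking step only joins adjacent levels, so a record
with no Expression-level property would leave Work and Manifestation
unconnected in this sketch. Deciding what to do in that case is one
of the choices a pure rule formulation would also have to make.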

> The fact that we
> are trying to find work-arounds to WEMI is evidence that the creation of
> four separate entities may not be viable in practice, at least not today
> when most of our bibliographic data has been created in a pre-FRBR world.

I like the proposal earlier in this thread to represent the data
in WEMI structure but to use blank nodes to avoid having to worry
too much in advance about URIs (that will almost always have to be
deduped anyways, so why not dedup the blank nodes?) and let a
standard entailment supply the missing bits.
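
As a rough illustration of deduplicating blank nodes, the Python
sketch below merges blank nodes whose non-blank-node property values
are identical. That exact-match criterion is an assumption chosen to
keep the example short; real deduplication would need fuzzier
matching. The function name dedup_bnodes and the sample data are
hypothetical.

```python
def is_bnode(term):
    """Blank nodes are represented here as strings starting with '_:'."""
    return isinstance(term, str) and term.startswith("_:")


def dedup_bnodes(triples):
    """Merge blank nodes that carry exactly the same (property, value) pairs.

    Only values that are not themselves blank nodes count towards the
    signature. The first node seen with a given signature is kept.
    """
    signature = {}
    for s, p, o in triples:
        if is_bnode(s) and not is_bnode(o):
            signature.setdefault(s, set()).add((p, o))

    canonical = {}  # signature -> surviving node
    rename = {}     # every described node -> surviving node
    for node, props in signature.items():
        rename[node] = canonical.setdefault(frozenset(props), node)

    rewritten = [(rename.get(s, s), p, rename.get(o, o)) for s, p, o in triples]
    # Drop triples that became duplicates, keeping the original order.
    return list(dict.fromkeys(rewritten))


triples = [
    ("_:w1", "a", "Work"),
    ("_:w1", "title", "lalala"),
    ("_:w1", "embodiedIn", "_:m1"),
    ("_:m1", "isbn", "111"),
    ("_:w2", "a", "Work"),
    ("_:w2", "title", "lalala"),
    ("_:w2", "embodiedIn", "_:m2"),
    ("_:m2", "isbn", "222"),
]
merged = dedup_bnodes(triples)
```

Here the two Work nodes collapse into one that points at both
Manifestations, while the Manifestations stay distinct because their
ISBNs differ.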

> My situation would require a broadly defined "bibliographic description"
> entity. DC has "citation" but I think that has a different meaning. A
> bibliographic description of the type done by libraries and even
> bookstores has many properties not included in citations. I would like
> to have an entity that could be applied to MARC records, ONIX records,
> Amazon entries, etc.

What I did with bibliographica was make an entity called MarcRecord
that had all the fields that a MARC record might have, and then run a
process on it (described as an opmv:Process) that generated a WEMI
structure (actually I left out the E). The process is a bit
idiosyncratic but I can imagine an analogous one for ONIX records,
Amazon entries, OpenLibrary entries, etc. The process is describable
in Python but not in DL or N3 (at least I don't think it is). An
important aspect of this is that it keeps the provenance information
clear and you can use any intermediate stage for further processing.

So building blocks,

MARC21 Record -> MarcRecord as RDF (transliteration) -> W(E)MI Thing

The first "->" is easy to specify (relatively) and could be the
subject of some LLD vocabulary and guidance, and likewise for
other source formats.

The second "->" is much harder, more controversial, subject to
choices and cataloguing rules.

If you were to have an intermediate step,

MarcRecord as RDF -> GenericFlatBiblioRecord -> W(E)MI Thing

I think this is more or less what you are suggesting. In this case,
the first "->" is probably pretty easy. The second "->" is still
hard.

I don't think we can skip the "MarcRecord as RDF" step without
destroying provenance information.
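
A minimal Python sketch of why the intermediate step matters: each
stage keeps a pointer to the stage it was derived from, so the final
W(E)MI thing can always be traced back through MarcRecord to the
source record. This is not the bibliographica code. The Stage class,
the function names and the "marc:" key prefix are assumptions for
illustration, and the mapping uses only three well-known MARC
subfields (100$a main entry personal name, 245$a title, 020$a ISBN).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stage:
    kind: str                               # e.g. "MarcRecord"
    data: dict                              # the content at this stage
    process: Optional[str] = None           # name of the step that made it
    derived_from: Optional["Stage"] = None  # the previous stage, if any


def transliterate(marc_fields):
    """First '->': a mechanical, lossless renaming of MARC fields."""
    source = Stage("MARC21 Record", dict(marc_fields))
    data = {f"marc:{tag}": value for tag, value in marc_fields.items()}
    return Stage("MarcRecord", data, "transliterate", source)


def to_wmi(record):
    """Second '->': the opinionated step (no Expression level here)."""
    d = record.data
    wmi = {
        "Work": {"author": d.get("marc:100a"), "title": d.get("marc:245a")},
        "Manifestation": {"isbn": d.get("marc:020a")},
        "Item": {},
    }
    return Stage("W(E)MI Thing", wmi, "to_wmi", record)


def provenance(stage):
    """Walk back through derived_from and list the stages, oldest first."""
    chain = []
    while stage is not None:
        chain.append(stage.kind)
        stage = stage.derived_from
    return list(reversed(chain))


marc = {"100a": "somebody", "245a": "lalala", "020a": "1234567890123"}
thing = to_wmi(transliterate(marc))
chain = provenance(thing)
```

Skipping transliterate and building the W(E)MI thing straight from
the raw record would still work, but the chain would then have no
stage that says which fields the source actually contained.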

> class affiliation rather than actual structure does not mean that
> applications could not take advantage of efficiencies such as allowing
> catalogers to copy Work or Expression information from other
> bibliographic descriptions to a new bibliographic entry. The proof of
> this is that systems (WorldCat; Open Library) have been able to create a
> Work "view" while maintaining the traditional bibliographic records in
> their databases. I can imagine WEMI being abstracted from complete or
> incomplete bibliographic descriptions and used as linked data. I am less
> able to imagine WEMI as our data structure for library and other
> bibliographic systems, at least at this moment in time.

I think the "class affiliation" is the second "->". I fear
it will be very hard to pick apart exactly what this operation
does without going into things that RDF (i.e. FOPL, DL, N3,
stratified datalog) cannot express. WorldCat and OpenLibrary
have custom code (not written in RDF!) that does this. Maybe
that is ok, but if that is our conclusion we should be clear
about it.

Cheers,
-w

-- 
William Waites           <william.waites@okfn.org>
Mob: +44 789 798 9965    Open Knowledge Foundation
Fax: +44 131 464 4948                Edinburgh, UK

RDF Indexing, Clustering and Inferencing in Python
		http://ordf.org/
Received on Tuesday, 21 September 2010 20:48:46 UTC