- From: Karen Coyle <kcoyle@kcoyle.net>
- Date: Tue, 21 Sep 2010 17:15:50 -0700
- To: William Waites <ww-keyword-okfn.193365@styx.org>, William Waites <william.waites@okfn.org>
- Cc: Asaf Bartov <asaf.bartov@gmail.com>, public-lld@w3.org
Quoting William Waites <william.waites@okfn.org>:
>
> For my part, after having a few discussions and thinking it through
> a bit, I'm not against WEMI. Where my problem lay (still lies) is
> not in the structure but in the rules about when to create a new
> Work. To me, for example, translation -> new work *not* just new
> expression.
You would need to explain WHY that is the case; in other words, how
does your definition of "work" result in translations being a new
work? It doesn't matter if you call it "work" or "x" -- you need a
coherent definition. FRBR's Work is pretty clearly defined for some
formats (monographic texts, for example), and what you have here
("William's work") is different in definition from the FRBR Work. So
define your rules for William's work, but I'm afraid you cannot
redefine FRBR Work -- that's someone else's domain.
> So suppose you had,
>
> :foo a :BiblioThing ;
>     :author :somebody ;
>     :title "lalala" ;
>     :language "sw" ;
>     :isbn "1234567890123" ;
>     :shelf 3 .
>
> In order to later create a more structured view, wouldn't you have
> inference rules that said something like,
>
> { ?x a :BiblioThing .
>   ?x ?p ?o .
>   ?p rdfs:subPropertyOf :WorkishProperty } =>
> { _:work a :Work . _:work ?p ?o } .
If you look at where the RDA elements have been defined
(http://metadataregistry.org/rdabrowse.htm), each Group 1 property is
associated with Class W, E, M, or I. So the WEMI definition is in the
Class/Property definitions. Whether or not you can easily turn that
into a set of inference rules, I don't know. I think that for metadata
sets as large and complex as RDA and other library metadata, we may
be relying on logic provided by programs, not solely the RDF
structuring, but I leave that to the code writers to figure out.
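
As a rough illustration of what that program logic might look like,
here is a minimal sketch in Python. It assumes a hypothetical
property-to-class table (PROPERTY_CLASS) and reuses the property names
from the :foo example above; the class assignments are invented for
illustration and are not taken from the RDA registry. Properties with
no declared class are kept apart rather than guessed at.

```python
from collections import defaultdict

# Hypothetical property-to-class table. These assignments are illustrative
# assumptions only; they are not copied from the RDA registry.
PROPERTY_CLASS = {
    "author": "Work",
    "title": "Work",
    "language": "Expression",
    "isbn": "Manifestation",
    "shelf": "Item",
}


def split_by_class(record, property_class=PROPERTY_CLASS):
    """Group a flat record's property/value pairs by their declared class.

    Properties with no declared class are collected under "Unassigned"
    instead of being forced into one of the WEMI classes.
    """
    groups = defaultdict(dict)
    for prop, value in record.items():
        groups[property_class.get(prop, "Unassigned")][prop] = value
    return dict(groups)


# The flat :foo record from the example, plus one made-up property
# ("note") that has no class assignment.
foo = {
    "author": "somebody",
    "title": "lalala",
    "language": "sw",
    "isbn": "1234567890123",
    "shelf": 3,
    "note": "signed copy",
}

for cls, props in split_by_class(foo).items():
    print(cls, props)
```

A lookup like this only works where every property has exactly one
class; it says nothing about values that mix Work and Expression
information, which is where human judgment would still be needed.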
> I like the proposal earlier in this thread to represent the data
> in WEMI structure but to use blank nodes to avoid having to worry
> too much in advance about URIs (that will almost always have to be
> deduped anyways, so why not dedup the blank nodes?) and let a
> standard entailment supply the missing bits.
OK, let me try this again :-). It's not just a question of "missing
bits." The bits are all there, but they are not separated out into
WEMI. And it's possible that they cannot be separated out into WEMI
without human intervention (esp. for Expressions). So any use of
frbr:Expression or frbr:Manifestation will be erroneous in terms of
the properties that you associate with that frbr entity. In other
words, they will be wrong, but in unpredictable ways. I am arguing
that if you call something a frbr:Manifestation it should really *be*
a frbr:Manifestation, not just something sort of like a
frbr:Manifestation but also kind of like a frbr:Expression and maybe
with some bits of frbr:Work. I think that mis-coding data is going to
lead to problems in the future, not unlike the mis-use of owl:sameAs
that is being discussed.
>
> What I did with bibliographica was make an entity, called MarcRecord
> that had all the fields that a MARC record might have. Then run a
> process on it (described as an opmv:Process) that generated a WEMI
> structure (actually I left out the E).
Which unfortunately means that probably either the W or the M is
mis-coded. BTW, leaving out the E seems to be a fairly common decision
because no one can really figure out what properties should go there.
Until that gets clarified, we can't expect any two parties to create
inter-relating frbr-ized descriptions.
> The process is a bit
> idiosyncratic
Idiosyncrasy is exactly what I fear we will end up with.
> So building blocks,
>
> MARC21 Record -> MarcRecord as RDF (transliteration) -> W(E)MI Thing
>
> The first "->" is easy to specify (relatively) and could be the
> subject of some LLD vocabulary and guidance, and likewise for
> other source formats.
I have started looking into that first "->" and it isn't turning out
to be as easy as I would like. I'm right now working on the fixed
fields in MARC because those are relatively easy. The variable fields
and the indicators are going to require some decision-making, things
like: where do you divide author and title in an author/title field?
and how do you keep them together as a unit? does an indicator "trace"
result in a new property compared to "do not trace"? What to do with
linkage subfields or materials specified subfields? etc. etc.
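
To make the first of those questions concrete, here is a minimal
sketch in Python of one possible answer: divide a name/title field at
its first $t subfield and keep both halves in a single structure so
they stay together as a unit. The function name, the (code, value)
pair representation, and the sample 700 field are all assumptions made
for illustration; dividing at $t is one candidate decision, not a
settled rule.

```python
def split_name_title(subfields):
    """Split a list of (code, value) subfield pairs at the first $t.

    Returns (name_part, title_part). If the field has no $t subfield,
    everything is treated as the name part and title_part is empty.
    """
    for i, (code, _value) in enumerate(subfields):
        if code == "t":
            return subfields[:i], subfields[i:]
    return subfields, []


# Made-up 700 field, written as (subfield code, value) pairs.
field_700 = [
    ("a", "Shakespeare, William,"),
    ("d", "1564-1616."),
    ("t", "Hamlet."),
    ("l", "French."),
]

name, title = split_name_title(field_700)

# Keep both halves in one structure so the pairing is not lost.
entry = {"name": name, "title": title}
print(entry)
```

Indicators, linkage subfields and materials-specified subfields are
not handled here; each of those would need its own decision.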
I have a database with all of the MARC21 fields and subfields and
indicators, all of the fixed field values (both as codes and terms).
[Actually, I may be one update behind, but will fix that.] I don't
think I have all of the indicator values, but I need to look at that.
I am currently filling in the fixed fields with names and display
forms (tentative, of course). I've got an idea for linking them back
to MARC21 from RDF, and a few options for URIs. I'll see if I can't
find time to get it into good enough shape to make it available for
comment. Oh, and I grabbed the "marc21.info" domain for the next five
years -- I hope it doesn't take me that long to make something out of
it! :-)
kc
>
> The second "->" is much harder, more controversial, subject to
> choices and cataloguing rules.
>
> If you were to have an intermediate step,
>
> MarcRecord as RDF -> GenericFlatBiblioRecord -> W(E)MI Thing
>
> I think this is more or less what you are suggesting. In this case,
> the first "->" is probably pretty easy. The second "->" is still
> hard.
>
> I don't think we can skip the "MarcRecord as RDF" step without
> destroying provenance information.
>
>> class affiliation rather than actual structure does not mean that
>> applications could not take advantage of efficiencies such as allowing
>> catalogers to copy Work or Expression information from other
>> bibliographic descriptions to a new bibliographic entry. The proof of
>> this is that systems (WorldCat; Open Library) have been able to create a
>> Work "view" while maintaining the traditional bibliographic records in
>> their databases. I can imagine WEMI being abstracted from complete or
>> incomplete bibliographic descriptions and used as linked data. I am less
>> able to imagine WEMI as our data structure for library and other
>> bibliographic systems, at least at this moment in time.
>
> I think the "class affiliation" is the second "->". I fear
> it will be very hard to pick apart exactly what this operation
> does without going into things that RDF (i.e. FOPL, DL, N3,
> stratified datalog) cannot express. WorldCat, OpenLibrary,
> have custom code (not written in RDF!) that does this. Maybe
> that is ok, but if that is our conclusion we should be clear
> about it.
>
> Cheers,
> -w
>
> --
> William Waites <william.waites@okfn.org>
> Mob: +44 789 798 9965 Open Knowledge Foundation
> Fax: +44 131 464 4948 Edinburgh, UK
>
> RDF Indexing, Clustering and Inferencing in Python
> http://ordf.org/
>
>
--
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Received on Wednesday, 22 September 2010 00:16:34 UTC