reconciliation of disparate models [was: Question about MARCXML to Models transformation] from Corey Harper on 2011-03-13 (public-lld@w3.org from March 2011)

From: Corey Harper <corey.harper@gmail.com>
Date: Sun, 13 Mar 2011 13:02:15 -0400
To: public-lld@w3.org
Message-ID: <AANLkTi=eooQthSSQBNM+xCh0pMb2ve_jQfBXOn1Q9AMi@mail.gmail.com>
Dear lld-xg,

On Friday, I started a conversation off-list following on from this
discussion of the FRBR model(s) and how to reconcile them with the
flatter versions of linked bib data that use specs like BIBO.

After a lot of discussion, there seemed to be value in bringing the
conversation back onto the list.

In an attempt to seed a re-run of this discussion, with broader input,
I'm re-sending my original message here.

Thanks,
-Corey

---------- Forwarded message ----------
From: Corey Harper <corey.harper@gmail.com>
Date: Fri, Mar 11, 2011 at 4:49 PM
Subject: Re: Question about MARCXML to Models transformation
To: Ross Singer <rossfsinger@gmail.com>, "Young,Jeff (OR)"
<jyoung@oclc.org>, Karen Coyle <kcoyle@kcoyle.net>, Thomas Baker
<tbaker@tbaker.de>, Ed Summers <ehs@pobox.com>, "jonphipps@gmail.com"
<jonphipps@gmail.com>


Dear Tom, Karen, Ross, Ed, Jon, Jeff,

I'm emailing this off-list to a few of you, because I think the group
wants to move past this issue and focus on a report. I'm also
concerned that I must be missing something quite obvious. Let me know
if you think this merits broader discussion, and I can resend to
public-lld.

I've been thinking a lot about this question of the pros and cons of
unconstrained, generalized properties, and am increasingly convinced
that hard-coding domains and ranges into things is a significant
barrier to reuse. I very much like the superclass / generalized
superproperty approach used in the rda vocabs and suggested by Jeff &
others on this list.
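
To make the domain concern concrete, here is a minimal sketch in plain
Python (no RDF library) of the two RDFS entailment rules involved:
rdfs:subPropertyOf and rdfs:domain. The property names ex:authorOfWork
and ex:author are hypothetical stand-ins for a constrained property and
its unconstrained superproperty; they are not taken from the RDA vocabs.

```python
# Hypothetical schema: a constrained property with a hard-coded domain,
# sitting under a general superproperty that has no domain at all.
SUBPROPERTY_OF = {"ex:authorOfWork": "ex:author"}
DOMAIN = {"ex:authorOfWork": "FRBRer:Work"}


def rdfs_closure(triples):
    """Apply rdfs:subPropertyOf and rdfs:domain entailment to the triples."""
    inferred = set(triples)
    for s, p, o in triples:
        prop = p
        while prop is not None:
            # The statement also holds for every superproperty.
            inferred.add((s, prop, o))
            # A declared domain forces a type onto the subject.
            if prop in DOMAIN:
                inferred.add((s, "rdf:type", DOMAIN[prop]))
            prop = SUBPROPERTY_OF.get(prop)
    return inferred


constrained = rdfs_closure(
    [("ex:some-book", "ex:authorOfWork", "ex:some_author")]
)
general = rdfs_closure([("ex:some-book", "ex:author", "ex:some_author")])
```

Under these assumptions, using the constrained property makes a reasoner
conclude that ex:some-book is a FRBRer:Work, whether or not the data
publisher meant that. Using the superproperty adds nothing beyond the
statement itself, which is what leaves room for reuse.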

One of the things I like about this approach is that it *could* have
the potential to allow multiple views of the same bibliographic data
to co-exist without any of the underlying assertions contradicting
each other. I've been thinking about this a lot lately, and I'm
convinced that we can have a fully realized FRBR view of a collection
of data, and an alternate view of that same (or similar) data made for
interoperating with non-FRBR-aware sources like BIBO, or with data
modeled with an awareness of a completely different view of FRBR. For
an example of the latter, just look to FaBiO, which does put a
superclass above WEMI. Interesting in its own right: they call that
parent class "Endeavor".

IRT the minting of URIs that may prove unnecessary, I'm curious what
the harm would be in representing these mysterious unidentified
expressions as blank nodes.

I really feel like these differing views can be reconciled as flavors
of metadata output, and I don't see any reason these triples can't
coexist. What would be the problem in saying--in grossly incorrect,
borderline incoherent, n-triples that abuse the RDA Vocab:

Representation 1 - Single Description:

<http://example.org/some-book#bibo> a bibo:Book ;
   dc:creator <http://example.org/some_author> ;
   dc:title "My random title"@en ;
   dc:date "2000" .

Representation 2 - WEM Description Set:

<http://example.org/some-book#rda> a FRBRer:Manifestation ;
  RDAVocab:dateOfDistribution "2000" ;
  FRBRer:manifestationOf _:blank1 .

_:blank1 a FRBRer:Expression ;
  RDAVocab:dateOfExpression "2000" ;
  RDAVocab:languageOfExpression "English" ;
  FRBRer:expressionOf _:blank2 .

_:blank2 a FRBRer:Work ;
  RDAVocab:titleOfTheWork "My random title" ;
  RDAVocab:authorWork <http://example.org/some_author> .


[The hash-URIs are almost certainly the wrong way to go about this,
but the above is really just an illustration of a point.]

I can see the establishment of consistent rules that could translate
back and forth between these, or even just let them sit at the same
URI. This is obviously the most basic of examples. There's no overhead
from minting URIs that may eventually prove useless in example two.
My assumption, though, is that I'm missing something significant
(hence my not sending to the list....)
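
As a rough illustration of what one such rule set might look like, here
is a minimal sketch in plain Python that collapses the WEM description
set into the single flat description. Triples are simple tuples, the
ex: prefix stands in for http://example.org/, and the property mapping
in FLAT_PROPERTY is an assumption made for illustration, not an
established crosswalk. Typing the result as bibo:Book is likewise
assumed rather than derived.

```python
# Representation 2 from above, as (subject, predicate, object) tuples.
WEM = [
    ("ex:some-book#rda", "rdf:type", "FRBRer:Manifestation"),
    ("ex:some-book#rda", "RDAVocab:dateOfDistribution", "2000"),
    ("ex:some-book#rda", "FRBRer:manifestationOf", "_:blank1"),
    ("_:blank1", "rdf:type", "FRBRer:Expression"),
    ("_:blank1", "RDAVocab:dateOfExpression", "2000"),
    ("_:blank1", "RDAVocab:languageOfExpression", "English"),
    ("_:blank1", "FRBRer:expressionOf", "_:blank2"),
    ("_:blank2", "rdf:type", "FRBRer:Work"),
    ("_:blank2", "RDAVocab:titleOfTheWork", "My random title"),
    ("_:blank2", "RDAVocab:authorWork", "ex:some_author"),
]

# Hypothetical mapping from WEM-level properties to flat ones.
FLAT_PROPERTY = {
    "RDAVocab:dateOfDistribution": "dc:date",
    "RDAVocab:titleOfTheWork": "dc:title",
    "RDAVocab:authorWork": "dc:creator",
}

# Links followed from the Manifestation up to the Work.
CHAIN = ("FRBRer:manifestationOf", "FRBRer:expressionOf")


def flatten(triples, manifestation, flat_subject):
    """Collapse a Manifestation/Expression/Work chain onto one subject."""
    nodes = [manifestation]
    for link in CHAIN:
        targets = [o for s, p, o in triples if s == nodes[-1] and p == link]
        if not targets:
            break  # chain is incomplete; flatten what we have
        nodes.append(targets[0])

    flat = [(flat_subject, "rdf:type", "bibo:Book")]  # assumed type
    for s, p, o in triples:
        if s in nodes and p in FLAT_PROPERTY:
            flat.append((flat_subject, FLAT_PROPERTY[p], o))
    return flat


flat = flatten(WEM, "ex:some-book#rda", "ex:some-book#bibo")
```

Note that the blank nodes never need names of their own here: they are
only walked through. The sketch is lossy in one direction (the
expression-level date and language are dropped, as is the language tag
on the title), so going from the flat view back to WEM would need extra
rules or defaults.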

Best,
-Corey


> On Wed, Mar 9, 2011 at 11:20 AM, Young,Jeff (OR) <jyoung@oclc.org> wrote:
>
>> One way to punt on this problem would be to treat the relationship between
>> W&M as 1-to-1 for now (80/20 rule). This would create some alias URIs for
>> Expressions and possibly conflate a few, but we could always come in later
>> and use owl:sameAs to reconcile the aliases and improve the data mining to
>> split those we conflate.
>>
>
> I'll probably be outnumbered on this, but I begin to feel somewhat
> uncomfortable assigning massive amounts of URIs for things in the absence
> of knowing what they are.  This is further compounded by the fact that
> they're being created because we have so little data to work with.
>
> I can't help but feel there are lots of hidden costs here (persistence of
> the deprecated "stub" URIs, being one, but even just the general fact that
> you need to dereference -- and store -- an extra, not-terribly-valuable, URI
> simply to get a CBD of the Manifestation), but I also, personally, feel it's
> significantly easier to add data later, when we know with some more
> confidence what it is we're describing, than it is to edit.  Especially at
> scale.
>
> Do others perceive this to be an issue?
>
> -Ross.
Received on Sunday, 13 March 2011 21:12:06 UTC