Re: Open Library and RDF from Thomas Baker on 2010-08-16 (public-lld@w3.org from August 2010)

From: Thomas Baker <tbaker@tbaker.de>
Date: Mon, 16 Aug 2010 13:16:19 -0400
To: Karen Coyle <kcoyle@kcoyle.net>
Cc: "gordon@gordondunsire.com" <gordon@gordondunsire.com>, "Young,Jeff (OR)" <jyoung@oclc.org>, public-lld@w3.org
Message-ID: <20100816171619.GA5168@octavius>

On Mon, Aug 16, 2010 at 07:36:59AM -0700, Karen Coyle wrote:
> >I agree that this is the challenge, and a layered approach
> >sounds reasonable.  Is this the approach currently being followed
> >by the FR and RDA committees?
> 
> No. In part, it is because their task is to create models and rules  
> for the library community, a big job on its own. But I think another  
> factor is that there is no one for them to talk to outside of the  
> library community -- no one who understands their data well enough to  
> speak to them. I really encourage anyone interested in interfacing  
> with libraries to put forth the effort to learn as much as possible  
> about library data. There is a good reason why the cataloging rules  
> take up a 600 page book -- there is a huge wealth of knowledge there,  
> and about two centuries of experience with bibliographic data and with  
> naming. There is undoubtedly no other community that has a full page  
> of instructions for the recording of the names and titles of "Buddhist  
> monarchs, ecclesiastics and patriarchs" (rule 22.28.D1). Libraries  
> need to find people with a deep knowledge of bibliographic data to  
> work with.

I take the point, yet I would hope the problem could be
partitioned to put a lot of that detail out of scope for the
conversation with the broader community outside libraries.
Strings like:

    3 transparencies (15 overlays) : b&w ; 26 x 22 cm

could surely be modeled in RDF, in principle, but would
it be worth the cost in terms of model complexity, not to
mention the effort?  And would a triple expression of the
above be significantly more usable than the original string?
(And might it even be _less_ usable?)
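
To make the question concrete, here is a rough sketch in Python of
what a triple expression might involve.  It is only an illustration
under stated assumptions: the property names under example.org are
invented for this message and come from no published vocabulary, the
subject URI is a placeholder, and reading "26 x 22" as height x width
is my assumption.

```python
import re

# Hypothetical namespace and property names, invented for illustration.
EX = "http://example.org/vocab/"


def physical_description_to_triples(subject, text):
    """Split a physical description into (subject, predicate, object) tuples."""
    # Assumed punctuation pattern: "extent : other details ; dimensions"
    extent, _, rest = text.partition(" : ")
    details, _, dimensions = rest.partition(" ; ")

    # Always keep the original string as a single literal.
    triples = [(subject, EX + "physicalDescription", text)]

    # Extent, e.g. "3 transparencies (15 overlays)"
    m = re.fullmatch(r"(\d+) (\w+)(?: \((\d+) (\w+)\))?", extent.strip())
    if m:
        triples.append((subject, EX + "unitCount", int(m.group(1))))
        triples.append((subject, EX + "unitType", m.group(2)))
        if m.group(3):
            triples.append((subject, EX + "componentCount", int(m.group(3))))
            triples.append((subject, EX + "componentType", m.group(4)))

    # Other physical details, e.g. "b&w"
    if details.strip():
        triples.append((subject, EX + "colourContent", details.strip()))

    # Dimensions, e.g. "26 x 22 cm" (height x width is an assumption)
    d = re.fullmatch(r"(\d+) x (\d+) (\w+)", dimensions.strip())
    if d:
        triples.append((subject, EX + "height", int(d.group(1))))
        triples.append((subject, EX + "width", int(d.group(2))))
        triples.append((subject, EX + "dimensionUnit", d.group(3)))

    return triples


triples = physical_description_to_triples(
    "http://example.org/item/1",  # placeholder subject URI
    "3 transparencies (15 overlays) : b&w ; 26 x 22 cm",
)
for t in triples:
    print(t)
```

Even in this toy form, one short string turns into nine statements
and seven invented properties, and the sketch silently falls back to
the bare string for any description that does not fit the pattern.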

As for the sort of collaboration needed to "do it right", I
like to think of the team effort by Richard Pevear (American)
and Larissa Volokhonsky (Russian) to translate War and Peace -
NPR has a nice short piece about this [1].

Volokhonsky wrote first drafts and Pevear "addressed their
English-ness" with probing questions about the original.
In the end, the translation was the result of a meeting of 
two linguistic minds.

As I see it, the task of creating RDF vocabularies for FR
and RDA is, by perfect analogy, one of translation.  And to
get the translation right, I suspect that a similar sort of
dialog would be needed, with someone to translate back
to the cataloging experts what the draft formal model is
actually saying ("The formal model says X - is that really
the intended meaning?").

I'm not convinced that the missing counterpart from the
modeling world need come to the task with deep knowledge about
minutiae.  That would set the bar so high that the task might
never get done.  Rather, the requirement is more for someone
who can listen very well, communicate a modeling perspective
back to the bibliographic experts, and engage in a dialog,
the purpose of which would be to jointly produce an accurate,
readable, and elegant translation.

[1] http://www.npr.org/templates/story/story.php?storyId=15528712

> One of the difficulties we face in the library community is a deep  
> chasm between the cataloging community and the systems community. RDA  
> and the FRs are being developed by the cataloging community, and no  
> data modelers were involved in the process. 

My impression is that the situation is even more difficult.
I picture the situation as follows (and look forward to seeing 
this view corrected or nuanced):

In one corner, catalogers, with a deep understanding of their
conceptual models, but little or no training in Semantic Web
modeling per se, little understanding of software development,
and little budget.  In the other corner, systems people,
considerably younger in average age and experience, often
oriented heavily to APIs and to ad-hoc data models for solving
a problem at hand, likewise with little training in Semantic
Web modeling, and with little motivation to make extra work
for themselves by pushing the issue of data interoperability,
beyond the task at hand, on their own initiative.  I have
the impression that there are precious few "data modelers"
(in a Semantic Web sense) involved in the process at all.
And I'm not getting the sense that the catalogers are
articulating a strong requirement for interoperability on a
Semantic Web basis.  Result: requirements are defined for a
data silo, and programmers deliver a silo.

Discuss... :-)

Tom

-- 
Thomas Baker <tbaker@tbaker.de>
Received on Monday, 16 August 2010 17:17:00 UTC