Re: Open Library and RDF from Karen Coyle on 2010-08-16 (public-lld@w3.org from August 2010)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Mon, 16 Aug 2010 07:36:59 -0700
To: Thomas Baker <tbaker@tbaker.de>
Cc: "gordon@gordondunsire.com" <gordon@gordondunsire.com>, "Young,Jeff (OR)" <jyoung@oclc.org>, public-lld@w3.org
Message-ID: <20100816073659.yz56c8qkcg8k4sg8@kcoyle.net>

Quoting Thomas Baker <tbaker@tbaker.de>:


>
> (I sometimes wonder if it is still optimally efficient, in
> 2010, to create lots of redundant copies of catalog records
> in lots of local databases instead of just linking to a
> central record, but that would be a different discussion...)

That is a discussion that is taking place in the library community. It  
has both conceptual and practical difficulties, as you might imagine.

> Another way is by strongly controlling the consistency of
> data when it is created -- e.g., with application profiles,
> using criteria that can form the basis of syntactic validation,
> quality control, and consistency checks (and of course with
> training of the catalogers in the proper application of the
> conceptual system).  However, for the data to be good and
> consistent, it does not follow that the underlying vocabularies
> themselves must necessarily carry heavy ontological baggage.

There is undoubtedly a sweet spot between vocabulary precision and  
metadata interoperability. The thing is that we will NOT solve this  
problem in the LLDWG, and therefore it is probably best to assume that  
the use of LD will provide situations that permit the library  
community to re-think some of its practices. Because of the  
inter-dependency of libraries around metadata, change will probably be  
slow because it will affect the actual functioning of the institutions  
themselves. Meanwhile, we must work with the library data that exists  
(and there is a huge amount of it).

Application profiles is a topic that Diane Hillmann and I cover when  
we speak to library groups, and that she and Jon Phipps have tried to  
explain to the developers of the JSC ad nauseam. It will eventually  
catch on, and we plan to do some demonstrations using the RDA data  
that is there.



>
> I agree that this is the challenge, and a layered approach
> sounds reasonable.  Is this the approach currently being followed
> by the FR and RDA committees?

No. In part, it is because their task is to create models and rules  
for the library community, a big job on its own. But I think another  
factor is that there is no one for them to talk to outside of the  
library community -- no one who understands their data well enough to  
speak to them. I really encourage anyone interested in interfacing  
with libraries to put forth the effort to learn as much as possible  
about library data. There is a good reason why the cataloging rules  
take up a 600 page book -- there is a huge wealth of knowledge there,  
and about two centuries of experience with bibliographic data and with  
naming. There is undoubtedly no other community that has a full page  
of instructions for the recording of the names and titles of "Buddhist  
monarchs, ecclesiastics and patriarchs" (rule 22.28.D1). Libraries  
need to find people with a deep knowledge of bibliographic data to  
work with.



> My question is whether the FR and RDA process is considering
> that some of the desired precision might be defined not in
> the underlying vocabularies, but in application profiles that
> use those vocabularies.  An approach which pushes some of the
> precision into application profiles could provide flexibility
> without sacrificing rigor.  Are application profiles (possibly
> under a different name) an important part of the discussion?


One of the difficulties we face in the library community is a deep  
chasm between the cataloging community and the systems community. RDA  
and the FRs are being developed by the cataloging community, and no  
data modelers were involved in the process. (You can ask Diane how  
frustrating this is.) The catalogers claim that the systems folk don't  
understand cataloging, and the systems folk claim that the catalogers  
do not understand systems. This is a huge problem, and one that some  
of us have been struggling with for nearly all of our careers (in my  
case, 30 years now). It's not at all the case that we haven't noticed  
the issue -- for some of us, it occupies our every professional  
moment. The meeting that (I believe) you attended in London between DC  
and RDA was an important attempt to bridge that gap, but in the end it  
did not go far enough. Gordon and Diane can give more detail on what  
fell through, but I have been in some of the conversations and the  
distance between catalogers and data modelers is still very large.

I give something like 6-10 talks a year on the topic of "the future of  
library data" (aka SemWeb, aka LD). Each time, I think I convert  
another handful of librarians. It's slow going, and it's only because  
some of us are doggedly determined that we've made the progress we  
have. There are people in our own community who deride us publicly for  
being duped by the latest fad.

It's not easy going, as you can see, which is by way of an apology for  
my sometimes getting testy at the suggestions that come along. Yes,  
we've thought about it, and thought about it all very hard. For a long  
time.

Maybe what this group needs is a kind of "state of the community"  
report -- with contributions from different countries and regions. (I  
do not know enough myself about how other library communities are  
viewing RDA and LD.)

kc


-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Received on Monday, 16 August 2010 14:37:39 UTC