Re: Brainstorming: Key Issues from Ross Singer on 2011-02-22 (public-xg-lld@w3.org from February 2011)

From: Ross Singer <ross.singer@talis.com>
Date: Tue, 22 Feb 2011 06:19:17 -0500
To: Karen Coyle <kcoyle@kcoyle.net>
Cc: "public-xg-lld@w3.org" <public-xg-lld@w3.org>
Message-ID: <AANLkTin6B1yWQbD53VnAer_YVeoKZBa-e82_ycmp2XOf@mail.gmail.com>
Here are some other ideas, some related to Karen's:

1) Where to start?  To convert a dataset of any significant size, we'll need
name authorities, subject thesauri, controlled vocabulary terms, etc.  If
everyone does this in isolation, minting their own URIs, etc., how is this
any better than silos of MARC records?  How do institutions the size of
University of Michigan or Stanford get access to datasets such as VIAF so
they don't have to do millions of requests every time they remodel their
data?  How do they know which dataset to look in for a particular value?
What about all of the data that won't be found in centralized datasets
(local subject headings, subject headings based on authorities with floating
terms, names not in the NAF, etc.)?

2) How do we keep the original data and linked data in sync?  If changes
happen to the linked data representation, how do we funnel that back into
the original representation?  Do we even want to?

3) The richer the data, the more complicated the dependencies: how do we
prevent rats' nests of possible licensing issues (Karen raised this, as
well)?  Similarly, this web also creates an n+1 problem: there's always the
potential of new URIs being introduced with each graph; how much is
enough?  How will a library know?

4) How do we deal with incorrect data that we don't own/manage?

5) As the graph around a particular resource improves in quality, how do
these changes propagate around to the various copies of the data?  How do
libraries deal with the changes (not only regarding conflicts, but how to
keep up with changes in the data model, with regard to indexing, etc.)?

6) Piggybacking on Karen's "chicken or the egg" problem, who will be first
to take the plunge?  What is the benefit for them to do so?  In the absence
of standards, will their experience have any influence on how standards are
created (that is, will they go through the work only to have to later retool
everything)?

-Ross.

On Thu, Feb 17, 2011 at 12:26 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:

> This is my kick-off for brainstorming and key issues. I'd suggest that
> for the first go-round we not worry about structure or levels of
> granularity but just throw out ideas. I'll do my best to keep track
> and we can then come back and have a more coordinated discussion.
>
> Karen's list:
>
> 1) Community agreement and leadership
>  There are many in the community who are either not interested in
> LLD, don't know about LLD, or who are actually opposed to LLD. At the
> moment, there are no centers of leadership to facilitate such a major
> change to library thinking about its data (although IFLA is probably
> the most active).
>
> 2) Funding
>  It is still quite difficult to convince potential funders that this
> is an important area to be working in. This is the "chicken/egg"
> problem, that without something to show funders, you can't get funding.
>
> 3) Legacy data
>  The library world has an enormous cache of data that is somewhat
> standardized but uses an antiquated concept of data and data modeling.
> Transformation of this data will take coordination (since libraries
> share data and systems for data creation). But before it can be
> transformed it needs to be analyzed and there must be a plan for
> converting it to linked data. (There is a need for library systems to
> be part of this change, and that is very complex.)
>
> 4) Openness and rights issues
>  While linked data can be used in an enterprise system, the value
> for libraries is to encourage open use of bibliographic data.
> Institutions that "own" bibliographic data may be under constraints,
> legal or otherwise, that do not allow them to let their data be used
> openly. We need to overcome this outdated concept of data ownership.
>
> 5) Standards
>  Libraries need to take advantage of the economies of scale that
> data sharing affords. This means that libraries will need to apply
> standards to their data for use within libraries and library systems.
>
> You can comment on these and/or post your own. Don't think about it
> too hard -- let's get as many issues on the table as we can! (I did 5
> - you can do any number you wish.)
>
> kc
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
>
Received on Tuesday, 22 February 2011 11:19:51 UTC