Re: Brainstorming: Key Issues from Karen Coyle on 2011-02-22 (public-xg-lld@w3.org from February 2011)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Tue, 22 Feb 2011 07:11:27 -0800
To: Ross Singer <ross.singer@talis.com>
Cc: "public-xg-lld@w3.org" <public-xg-lld@w3.org>
Message-ID: <20110222071127.51755fnx8a6dt1jz@kcoyle.net>
Great, Ross, thanks. I think we are getting somewhere.

Looking at the issues section of the report (based on the use cases  
and what we have so far in the discussion) I am ready to begin to  
create some categories. The major themes I see so far (with a few  
sample particulars) are:

Lack of library standards and guidance in this area
  - "When ontologies/metadata schemas/vocabularies overlap, which  
should be used?"
  - Need for study of SKOS for authorities
  - "no community guidance on which technologies and vocabularies to use"

Legacy data
  - Incomplete vocabularies for legacy data
  - Legacy data itself is often text, not data, and does not include  
relationships
  - How will the mass of legacy data be transformed?

Immaturity of Semantic Web at this Time
  - "There is a general sparseness of linkage in the LOD cloud."
  - Overuse or misuse of properties like owl:sameAs
  - Lack of generalized tools for creation and use of LD

Readiness of Library Community (Education)
  - "Publishing Linked Data requires expertise which is often not  
available at institutions..."
  - Changing mental model from "records" to "graphs"

Applications and Management
  - (what Ross said)
  - Managing a heterogeneous metadata environment (libraries now are  
more homogeneous)

Obviously more themes can be added as we go along.

I'm not sure where to put this, but I think that yet another wiki page  
will be needed for the drafting, *OR* I could do it within the draft  
report. My usual approach to this is to throw in all of the comments  
and statements in a section called "raw" and above that copy them into  
an area called "cooked". So, separate page or within report?

kc


Quoting Ross Singer <ross.singer@talis.com>:

> Here are some other ideas, some related to Karen's:
>
> 1) Where to start?  To convert a dataset of any significant size, we'll need
> name authorities, subject thesauri, controlled vocabulary terms, etc.  If
> everyone does this in isolation, minting their own URIs, etc., how is this
> any better than silos of MARC records?  How do institutions the size of
> University of Michigan or Stanford get access to datasets such as VIAF so
> they don't have to do millions of requests every time they remodel their
> data?  How do they know which dataset to look in for a particular value?
> What about all of the data that won't be found in centralized datasets
> (local subject headings, subject headings based on authorities with floating
> terms, names not in the NAF, etc.)?
>
> 2) How do we keep the original data and linked data in sync?  If changes
> happen to the linked data representation, how do we funnel that back into
> the original representation?  Do we even want to?
>
> 3) The richer the data, the more complicated the dependencies: how do we
> prevent rats' nests of possible licensing issues (Karen raised this, as
> well)?  Similarly, this web also creates an n+1 problem: there's always the
> potential of a new URI being introduced with each graph; how much is
> enough?  How will a library know?
>
> 4) How do we deal with incorrect data that we don't own/manage?
>
> 5) As the graph around a particular resource improves in quality, how do
> these changes propagate around to the various copies of the data?  How do
> libraries deal with the changes (not only regarding conflicts, but how to
> keep up with changes in the data model, with regard to indexing, etc.)?
>
> 6) Piggybacking on Karen's "chicken or the egg" problem, who will be first
> to take the plunge?  What is the benefit for them to do so?  In the absence
> of standards, will their experience have any influence on how standards are
> created (that is, will they go through the work only to have to later retool
> everything)?
>
> -Ross.
>
> On Thu, Feb 17, 2011 at 12:26 PM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>
>> This is my kick-off for brainstorming and key issues. I'd suggest that
>> for the first go-round we not worry about structure or levels of
>> granularity but just throw out ideas. I'll do my best to keep track
>> and we can then come back and have a more coordinated discussion.
>>
>> Karen's list:
>>
>> 1) Community agreement and leadership
>>  There are many in the community who are not interested in LLD,
>> don't know about LLD, or are actually opposed to LLD. At the
>> moment, there are no centers of leadership to facilitate such a major
>> change to library thinking about its data (although IFLA is probably
>> the most active).
>>
>> 2) Funding
>>  It is still quite difficult to convince potential funders that this
>> is an important area to be working in. This is the "chicken/egg"
>> problem: without something to show funders, you can't get funding.
>>
>> 3) Legacy data
>>  The library world has an enormous cache of data that is somewhat
>> standardized but uses an antiquated concept of data and data modeling.
>> Transformation of this data will take coordination (since libraries
>> share data and systems for data creation). But before it can be
>> transformed it needs to be analyzed and there must be a plan for
>> converting it to linked data. (There is a need for library systems to
>> be part of this change, and that is very complex.)
>>
>> 4) Openness and rights issues
>>  While linked data can be used in an enterprise system, the value
>> for libraries is to encourage open use of bibliographic data.
>> Institutions that "own" bibliographic data may be under constraints,
>> legal or otherwise, that do not allow them to let their data be used
>> openly. We need to overcome this outdated concept of data ownership.
>>
>> 5) Standards
>>  Libraries need to take advantage of the economies of scale that
>> data sharing affords. This means that libraries will need to apply
>> standards to their data for use within libraries and library systems.
>>
>> You can comment on these and/or post your own. Don't think about it
>> too hard -- let's get as many issues on the table as we can! (I did 5
>> - you can do any number you wish.)
>>
>> kc
>>
>> --
>> Karen Coyle
>> kcoyle@kcoyle.net http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
>>
>



-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
Received on Tuesday, 22 February 2011 15:12:01 UTC