Re: LD and Redundancy

I also agree with the point that we unfortunately have to deal with redundancies.

But for the report, IMO, we should strictly differentiate between authority data and bibliographic data. I reckon we can suggest centralizing authority data, at least at the national level. I assume that reducing bibliographic redundancy requires consolidated authority data and, of course, the consistent alignment of authorities in bibliographic entries.
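
To make the alignment idea concrete, here is a minimal Python sketch, not a proposal for the report. It assumes equivalence statements between authority URIs are already available (for example as sameAs-style pairs), clusters them with a simple union-find, and then groups bibliographic entries by aligned authority plus a crudely normalized title. All URIs, record ids, and function names are hypothetical, and the title normalization only stands in for real record matching.

```python
from collections import defaultdict


def find(parent, uri):
    """Return the representative URI of the cluster containing `uri`."""
    parent.setdefault(uri, uri)
    while parent[uri] != uri:
        parent[uri] = parent[parent[uri]]  # path halving keeps lookups short
        uri = parent[uri]
    return uri


def build_clusters(same_as_pairs):
    """Union-find over sameAs-style equivalence statements."""
    parent = {}
    for a, b in same_as_pairs:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            # Keep the lexicographically smaller URI so the result is deterministic.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep
    return parent


def group_records(records, parent):
    """Group record ids by (aligned authority URI, normalized title)."""
    groups = defaultdict(list)
    for record_id, author_uri, title in records:
        key = (find(parent, author_uri), title.strip().lower())
        groups[key].append(record_id)
    return dict(groups)


# Hypothetical authority URIs: one national file and two local files.
NATIONAL = "http://example.org/national/person/1"
LIB_A = "http://example.org/lib-a/author/42"
LIB_B = "http://example.org/lib-b/name/7"

parent = build_clusters([(NATIONAL, LIB_A), (LIB_B, LIB_A)])

# Hypothetical bibliographic entries: (record id, authority URI, title).
records = [
    ("a1", LIB_A, "Moby Dick"),
    ("b1", LIB_B, "moby dick "),
    ("b2", LIB_B, "Typee"),
]

groups = group_records(records, parent)
duplicates = [ids for ids in groups.values() if len(ids) > 1]
print(duplicates)
```

Without the equivalence statements, records a1 and b1 would carry different authority URIs and would not be recognized as candidates for the same work, which is why I think the authority side has to be consolidated first.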

Without raising the FRBR discussion again, I think one cause of redundancy in bibliographic data is the lack of ways to link items to trustworthy bibliographic records. Everyone creates their own new bibliographic records - but this is also caused by the harvesting approach in current library environments and will probably not change soon.

And a last point: to identify and reduce redundancies, it helps to have standardized ontologies for the library community - maybe not only RDA, but definitely no more than a handful...

alex


> -----Original Message-----
> From: public-lld-request@w3.org [mailto:public-lld-request@w3.org] On Behalf Of Karen Coyle
> Sent: Thursday, 24 March 2011 01:11
> To: Ross Singer
> Cc: public-lld
> Subject: Re: LD and Redundancy
> 
> Following up to this, we seem to agree that there will be redundancy
> of data and of identifiers. Is this a particular LLD issue that should
> be included in the group's report, or is this a general SemWeb issue
> that we can assume will be addressed in the normal course of things?
> At the moment there is a brief mention of this in the issues area of
> the report, but we're unsure what to say about it.
> 
> Perhaps we can resolve this on tomorrow's call.
> 
> Thanks, all,
> kc
> 
> Quoting Ross Singer <ross.singer@talis.com>:
> 
> > I think we're going to have to assume there will be lots of duplication of
> > resources describing the same thing with different identifiers (although,
> > hopefully interrelated) for a couple of reasons:
> >
> > 1) A centralized repository will never be able to keep up with everything -
> > there will always be nodes with resources described prior to being added to
> > the repository; possibly never added.  These could also spring up in
> > multiple places independently
> > 2) We should not expect universal, 100% agreement on how things are
> > defined/described.  We don't have this now, we certainly can't expect this
> > to change.
> > 3) There are lots of non-authoritative resources (subject headings, people,
> > class numbers, etc.)
> > 4) A centralized repository would have to rely quite heavily on discovery
> >     - there's a huge danger of GIGO here (there are plenty of typos in the
> > historical record)
> >     - plenty of chances of failed searches
> >
> > Couple this with the fact that (most) everybody is going to have to
> > duplicate all of the data for local indexing purposes, anyway...
> >
> > -Ross.
> >
> > On Wed, Mar 23, 2011 at 3:37 PM, Owen Stephens <owen@ostephens.com> wrote:
> >
> >> I tend to agree with Joachim - we will see more data publication and at
> >> least in this phase will see plenty of institutions coining their own URIs.
> >> However, I also believe that the web tends towards less duplication (this
> >> isn't anything close to no duplication, just less duplication than we would
> >> have otherwise).
> >>
> >> We are already seeing that established URIs will be used where they exist
> >> (e.g. for LCSH) - and I guess we can expect to see more of these.
> >>
> >> That said, I think aggregations are a good thing (and inevitable) - and the
> >> more identifiers are shared, and the more people make sameas and similar
> >> statements, the easier aggregation will become.
> >>
> >> In terms of what we should be doing now? I'd say:
> >>
> >> Encourage re-use of URIs (ideally this would be baked into record creation
> >> in libraries, but that's a whole other ball game)
> >> Encourage sameas statements where new URIs have been coined (and
> >> appropriate)
> >> Start looking at how existing linked data representations of bibliographic
> >> data can be crawled and aggregated and see what works and what doesn't
> >>
> >> I'm sure there is other stuff, but those are the ones that spring to mind
> >> first
> >>
> >> The work of the JISC 'RDTF' (Resource Discovery Task Force) in the UK is
> >> looking at the strategy of 'publish' and 'aggregate' - although this doesn't
> >> dictate the use of Linked Data or RDF, many of the projects falling into this
> >> area are adopting that approach, so hopefully we will see a good exploration
> >> of some of the issues from this area soon. See http://rdtf.mimas.ac.uk/ for
> >> more information on this.
> >>
> >> Owen
> >>
> >>
> >> Owen Stephens
> >> Owen Stephens Consulting
> >> Web: http://www.ostephens.com
> >> Email: owen@ostephens.com
> >> Telephone: 0121 288 6936
> >>
> >> On 23 Mar 2011, at 17:16, stu wrote:
> >>
> >> *On Thu, Mar 24, 2011 at 1:18 AM, Neubert Joachim <J.Neubert@zbw.eu> wrote:
> >>
> >> I'm not sure that a centralized model for building clusters (like VIAF) or
> >> a pre-declared central hub ("everybody maps to
> >> WorldCat/OpenLibrary/whatever") could work.*
> >>
> >> A centralized model is essential if global bibliography is to be an
> >> important part of the Web.  Sure, there are work-arounds involving declared
> >> or inferred equivalence.  These all require additional work on the part of
> >> systems and people, which will rarely be expended, with the result that link
> >> potency will (continue to) be diluted to insignificance.
> >>
> >> Is it important enough for the global library community to expend the
> >> resources to consolidate meaningful global bibliography?  Can the political
> >> impediments be overcome?
> >>
> >> I continue to believe that OCLC is the only likely candidate with a chance
> >> to make this happen, and it appears that the business cases are too weak,
> >> and constituent demand too feeble for that to happen in the current
> >> environment.
> >>
> >> I just Googled the book closest to hand, and on the first page, Wikipedia
> >> was number one, and there were two Amazon links in the top ten.  No library
> >> link of any sort appeared on the page.
> >>
> >> Linked data isn't going to change this without a centralized identifier
> >> infrastructure.
> >>
> >> stu
> >>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
> 

Received on Thursday, 24 March 2011 07:18:22 UTC