- From: Stuart Weibel <stuart.weibel@gmail.com>
- Date: Thu, 24 Mar 2011 09:24:34 +0900
- To: Karen Coyle <kcoyle@kcoyle.net>
- Cc: Ross Singer <ross.singer@talis.com>, public-lld <public-lld@w3.org>
It's a fundamental issue of primary importance to libraries. It should be discussed in the report.

Sent from my iPhone

On Mar 24, 2011, at 9:10 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:

> Following up to this, we seem to agree that there will be redundancy of data and of identifiers. Is this a particular LLD issue that should be included in the group's report, or is this a general SemWeb issue that we can assume will be addressed in the normal course of things? At the moment there is a brief mention of this in the issues area of the report, but we're unsure what to say about it.
>
> Perhaps we can resolve this on tomorrow's call.
>
> Thanks, all,
> kc
>
> Quoting Ross Singer <ross.singer@talis.com>:
>
>> I think we're going to have to assume there will be lots of duplication of
>> resources describing the same thing with different identifiers (although,
>> hopefully interrelated) for a couple of reasons:
>>
>> 1) A centralized repository will never be able to keep up with everything -
>> there will always be nodes with resources described prior to being added to
>> the repository; possibly never added. These could also spring up in
>> multiple places independently
>> 2) We should not expect universal, 100% agreement on how things are
>> defined/described. We don't have this now, we certainly can't expect this
>> to change.
>> 3) There are lots of non-authoritative resources (subject headings, people,
>> class numbers, etc.)
>> 4) A centralized repository would have to rely quite heavily on discovery
>> - there's a huge danger of GIGO here (there are plenty of typos in the
>> historical record)
>> - plenty of chances of failed searches
>>
>> Couple this to the fact that (most) everybody is going to have to
>> duplicate all of the data for local indexing purposes, anyway...
>>
>> -Ross.
>>
>> On Wed, Mar 23, 2011 at 3:37 PM, Owen Stephens <owen@ostephens.com> wrote:
>>
>>> I tend to agree with Joachim - we will see more data publication and at
>>> least in this phase will see plenty of institutions coining their own URIs.
>>> However, I also believe that the web tends towards less duplication (this
>>> isn't anything close to no duplication, just less duplication than we would
>>> have otherwise).
>>>
>>> We are already seeing that established URIs will be used where they exist
>>> (e.g. for LCSH) - and I guess we can expect to see more of these.
>>>
>>> That said, I think aggregations are a good thing (and inevitable) - and the
>>> more identifiers are shared, and the more people make sameas and similar
>>> statements, the easier aggregation will become.
>>>
>>> In terms of what we should be doing now? I'd say:
>>>
>>> Encourage re-use of URIs (ideally this would be baked into record creation
>>> in libraries, but that's a whole other ball game)
>>> Encourage sameas statements where new URIs have been coined (and
>>> appropriate)
>>> Start looking at how existing linked data representations of bibliographic
>>> data can be crawled and aggregated and see what works and what doesn't
>>>
>>> I'm sure there is other stuff, but those are the ones that spring to mind
>>> first
>>>
>>> The work of the JISC 'RDTF' (Resource Discovery Task Force) in the UK is
>>> looking at the strategy of 'publish' and 'aggregate' - although this doesn't
>>> dictate the use of Linked Data or RDF, many of the projects falling into this
>>> area are adopting that approach, so hopefully we will see a good exploration
>>> of some of the issues from this area soon. See http://rdtf.mimas.ac.uk/ for
>>> more information on this.
>>>
>>> Owen
>>>
>>> Owen Stephens
>>> Owen Stephens Consulting
>>> Web: http://www.ostephens.com
>>> Email: owen@ostephens.com
>>> Telephone: 0121 288 6936
>>>
>>> On 23 Mar 2011, at 17:16, stu wrote:
>>>
>>> *On Thu, Mar 24, 2011 at 1:18 AM, Neubert Joachim <J.Neubert@zbw.eu> wrote:
>>>
>>> I'm not sure that a centralized model for building clusters (like VIAF) or
>>> a pre-declared central hub ("everybody maps to
>>> WorldCat/OpenLibrary/whatever") could work.*
>>>
>>> A centralized model is essential if global bibliography is to be an
>>> important part of the Web. Sure, there are work-arounds involving declared
>>> or inferred equivalence. These all require additional work on the part of
>>> systems and people, which will rarely be expended, with the result that link
>>> potency will (continue to) be diluted to insignificance.
>>>
>>> Is it important enough for the global library community to expend the
>>> resources to consolidate meaningful global bibliography? Can the political
>>> impediments be overcome?
>>>
>>> I continue to believe that OCLC is the only likely candidate with a chance
>>> to make this happen, and it appears that the business cases are too weak,
>>> and constituent demand too feeble for that to happen in the current
>>> environment.
>>>
>>> I just Googled the book closest to hand, and on the first page, Wikipedia
>>> was number one, and there were two Amazon links in the top ten. No library
>>> link of any sort appeared on the page.
>>>
>>> Linked data isn't going to change this without a centralized identifier
>>> infrastructure.
>>>
>>> stu
>>>
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
Received on Thursday, 24 March 2011 00:25:43 UTC