- From: Antoine Isaac <aisaac@few.vu.nl>
- Date: Sun, 19 Sep 2010 20:22:00 +0200
- To: Karen Coyle <kcoyle@kcoyle.net>
- CC: public-lld <public-lld@w3.org>
Good points, Karen.

By the way, there is one project on the topic,
http://blogs.ukoln.ac.uk/locah/. In Europeana we also have some efforts
on converting EAD (one archive standard) data to RDF, but as linked data
is not (yet?) the focus of these, I will bother you with it only if
you're interested ;-)

Cheers,

Antoine

> Joachim,
>
> What you describe is fairly common in archives, although many of them
> have even less information: all they know is that they have someone's
> papers, and the number of boxes they occupy. So this kind of case is a
> good case study for all of that kind of archival material. The exciting
> thing about archives is that they often have material that does not
> exist anywhere else, so their resources are very valuable, but extremely
> hard to find. It would be great to show that linked data can help make
> archives more visible.
>
> kc
>
> Quoting Neubert Joachim <J.Neubert@zbw.eu>:
>
>> Hi Antoine,
>>
>> thank you for asking your question again, and sorry for not getting
>> you in the telecon. I'll happily try to explain the strange beast
>> that is press archives. (For me, having worked with press archives
>> for more than 20 years, it's maybe too familiar and self-evident -
>> while RDA and FRBR still sometimes look quite strange to me ...)
>>
>> I'm talking here about a classical newspaper clippings archive like
>> the 20th Century Press Archives. This is different from Ed's use case,
>> which deals with complete issues and pages of the newspaper. In a
>> clippings archive, you find large numbers of single clippings pasted
>> on loose sheets of paper, with the publication date and the newspaper
>> title written or stamped onto the sheet. These sheets (and
>> occasionally other material, like annual reports of companies) were
>> collected in thematic folders, year after year, normally putting the
>> most current clipping on top of the pile. In the past, there was no
>> possibility to access clippings by author, by title, or by any other
>> attribute. The folders were arranged simply alphabetically (wherever
>> possible), or in some kind of classification (we'll have to delve into
>> this for the subject and wares archives). Generally, there exists no
>> card catalog, and there is no such concept as a "bibliographic unit".
>> It would have been much too expensive to capture one for every single
>> clipping, and the concept wasn't applied to folders either. What
>> mattered was the collection, and within the collection the order of
>> folders on the shelf and the order of clippings within the folder.
>> (That's why OAI-ORE with its quite generic concept of an aggregation
>> looked like a natural fit to me here.)
>>
>> Anyway, legacy metadata is almost nonexistent. For some 20,000
>> clippings we have additional metadata now, but this was transcribed
>> from the sheets in the process of digitizing the material, and I doubt
>> that it's affordable to add much more. Maybe users could someday add
>> newspaper titles and publication dates for clippings they actually
>> work with - but this will remain very sparse, given the size of the
>> complete archives with its 30 million documents.
>>
>> The good news is that we *can* apply metadata at the folder level. We
>> did this for the personal name authority identifier, and can therefore
>> pull in data from there and - via a DBpedia mapping - from the Linked
>> Data Cloud. Doing the same for companies would be great. And the other
>> way around, we add a lot to the Cloud: These thematic folders are a
>> unique source of historical knowledge and contemporary points of view
>> about almost every issue that was discussed publicly in the 20th century.
>>
>> In library land this kind of material will remain a very special case.
>> But it is part of our cultural heritage, so I think we have to make it
>> accessible with the best methods we can figure out.
>>
>> Cheers, Joachim
>>
>> -----Original Message-----
>> From: public-lld-request@w3.org [mailto:public-lld-request@w3.org] On
>> Behalf Of Antoine Isaac
>> Sent: Saturday, 18 September 2010 16:42
>> To: public-lld
>> Subject: Question on press use cases
>>
>> Hi Ed, Joachim,
>>
>> I'm posting the question on your two use cases [1,2] I could not
>> really ask in last week's telecon [3].
>>
>> The data that is published in your cases is pretty much semantic
>> web-oriented, mostly looking at the vocabularies you use: DC, OAI-ORE,
>> FOAF, EXIF, BIBO. There's some RDA/FRBR at [1] but not much. And [2]
>> links to METS records, but rather as a side resource, not a true
>> linked data description.
>>
>> I'm myself pretty happy with that situation--I trust this can be
>> really useful data as such already. But with my LLD hat on I'd like to
>> know more ;-)
>>
>> So the question is whether the current situation results rather from:
>> - a conscious choice of ignoring part of the legacy data you had in
>> the original data sources, in the light of the requirements of your
>> scenarios?
>> - too great an effort needed to move legacy data to linked data,
>> considering the resources you had?
>> - the lack of legacy data--you just converted all that you had?
>>
>> Cheers,
>>
>> Antoine
>>
>> [1]
>> http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_Publishing_20th_Century_Press_Archives
>>
>> [2] http://www.w3.org/2005/Incubator/lld/wiki/Use_Case_NDNP
>> [3]
>> http://www.w3.org/2005/Incubator/lld/minutes/2010/09/16-lld-minutes.html
Received on Sunday, 19 September 2010 18:22:37 UTC