- From: MacKenzie Smith <kenzie@MIT.EDU>
- Date: Wed, 15 Oct 2003 18:20:36 -0400
- To: www-rdf-dspace@w3.org
>I've also taken a brief look at the ArtStor corpus. Nothing by Frank
>Lloyd Wright, nor I. M. Pei in their data, so I doubt that architecture is
>in the corpus.

If you'd like to know what's in this dataset, I would take a look at their website: http://www.artstor.org/collections/brief.jsp. Just offhand, I'd think that the MoMA Architecture and Design collection contains some architectural records, unless those were mysteriously left out of our sample.

>May I return to my suggestion that we should not try to canonicalize
>Person records, but instead use them as we find them?

It is certainly true that normalizing personal names is on the road to madness. As Mark pointed out earlier, there have been many past attempts to do this, called name authority control. It's hard, requires human judgement, and is subject to many exceptions, and that's why it's so bloody expensive. I wouldn't try to do it, myself.

Don't forget that part of this project is to work with OCLC to use their brand new Web Service for name authority control. They even have a working prototype up that is part of the submission process and very nifty. It only fails if there are no matches in LCNAF or other national authority files, and we're thinking about how to fix that for local authors (typical in DSpace) and other name lists (like ULAN). They're very good at this, so let's let them do the work :-)

MacKenzie/

MacKenzie Smith
Associate Director for Technology
MIT Libraries
Building 14S-208
77 Massachusetts Avenue
Cambridge, MA 02139
(617)253-8184
kenzie@mit.edu
Received on Wednesday, 15 October 2003 18:23:01 UTC