
Name canonicalization

From: Kevin Smathers <kevin.smathers@hp.com>
Date: Thu, 30 Oct 2003 15:23:33 -0800
Message-ID: <3FA19D75.20601@hp.com>
To: SIMILE public list <www-rdf-dspace@w3.org>

Hi all,

I've been working on name canonicalization in the OCW repository. Today
I wrote a script that handles all of the names found there, either by
pattern or through a short list of exceptions. I'm now starting to
implement a web service interface that asks the OCLC for a URI for each
name.
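
For anyone who wants to picture the pattern-plus-exceptions approach, here
is a minimal sketch in Python. It is not the actual script: the exception
entry, the title prefixes, and the function name canonicalize are all
assumptions made up for illustration, and the target form "Last, First" is
only inferred from the "Smith, John" example used in this message.

```python
import re

# Hypothetical exception list: names the patterns would otherwise mangle.
EXCEPTIONS = {
    "MIT Staff": "MIT Staff",
}

# Assumed patterns: an optional leading title, and an already inverted name.
_TITLE = re.compile(r"^(?:Prof\.|Dr\.)\s+", re.IGNORECASE)
_INVERTED = re.compile(r"^\s*([^,]+?)\s*,\s*(.+?)\s*$")


def canonicalize(name: str) -> str:
    """Return the name in 'Last, First' form, consulting exceptions first."""
    cleaned = " ".join(name.split())  # collapse runs of whitespace
    if cleaned in EXCEPTIONS:
        return EXCEPTIONS[cleaned]
    cleaned = _TITLE.sub("", cleaned)
    inverted = _INVERTED.match(cleaned)
    if inverted:
        # Already "Last, First": only normalize the spacing around the comma.
        return f"{inverted.group(1)}, {inverted.group(2)}"
    parts = cleaned.split(" ")
    if len(parts) == 1:
        return parts[0]
    # "First [Middle ...] Last": move the last token to the front.
    return f"{parts[-1]}, {' '.join(parts[:-1])}"
```

Checking the exception list before any pattern keeps the special cases
from being rewritten by a rule that happens to match them.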

From the work I've already done on the OCLC web service, I expect each
request to return a set of possible matches, including phrase matches
and word matches.  I'm thinking of turning these into a graph like the
one below, for some IMS record <R>, some person <V>, and a set of OCLC
results <A1>, <B1>...<B2>:

<R> dc:title "Something" ;
    loc-life:author <V> .

<V> vc:FN "Smith, John" ;
    oclc:probablyMatches <A1> ;
    oclc:possiblyMatches <B1> ;
    oclc:possiblyMatches <B2> .
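
As a rough illustration of how the statements about the person node in the
graph above could be produced from a list of OCLC results, here is a small
Python sketch. The function name, the (uri, kind) input shape, and the
mapping of phrase matches to oclc:probablyMatches and word matches to
oclc:possiblyMatches are assumptions for the sake of the example; the web
service response format is not described here.

```python
def person_statements(person, full_name, matches):
    """Build N3-style statements for one person.

    matches is a list of (uri, kind) pairs, where kind is "phrase" or
    "word". Assumption: phrase matches are the probable ones and word
    matches the merely possible ones.
    """
    # Escape backslashes and double quotes so the literal stays well-formed.
    literal = full_name.replace("\\", "\\\\").replace('"', '\\"')
    lines = [f'<{person}> vc:FN "{literal}"']
    for uri, kind in matches:
        if kind == "phrase":
            predicate = "oclc:probablyMatches"
        else:
            predicate = "oclc:possiblyMatches"
        lines.append(f"    {predicate} <{uri}>")
    # Predicates sharing a subject are separated by ";" and closed with ".".
    return " ;\n".join(lines) + " ."


print(person_statements("V", "Smith, John",
                        [("A1", "phrase"), ("B1", "word"), ("B2", "word")]))
```

Keeping the match strength in the predicate, rather than on the match
itself, means a consumer can select only the probable matches with a
single predicate lookup.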
                     
 
Does this seem like a reasonable model for this relationship?  And how
do we match this up with the ArtStor records?  I imagine they should be
matched by URI, with a similar linkage from each ArtStor record.

Cheers,
-kls          

-- 
========================================================
   Kevin Smathers                kevin.smathers@hp.com    
   Hewlett-Packard               kevin@ank.com            
   Palo Alto Research Lab                                 
   1501 Page Mill Rd.            650-857-4477 work        
   M/S 1135                      650-852-8186 fax         
   Palo Alto, CA 94304           510-247-1031 home        
========================================================
use "Standard::Disclaimer";
carp("This message was printed on 100% recycled bits.");
Received on Thursday, 30 October 2003 18:25:26 EST
