- From: Butler, Mark <Mark_Butler@hplb.hpl.hp.com>
- Date: Thu, 23 Oct 2003 11:18:21 +0100
- To: "'www-rdf-dspace@w3.org'" <www-rdf-dspace@w3.org>
Hi Kevin

> If we are going to try to collapse all versions of Person to a single
> Person reference then presumably there also must be a way to collapse
> the individual records that come from converting our XML sources to
> RDF. Arguably then my XSLT script should first look up a person in the
> common database before creating a new person so that the Person records
> aren't massively duplicated.

There are two levels to this:

1. removing duplicates within the collection.
2. removing duplicates between collections.

The XSLT script I've created for Artstor creates URIs for people from their Artstor identifier. One of the nice side effects of using RDF is that, because different instances of the same person have the same URI, the duplicates of type (1) are removed if we de-serialize and serialize the model, e.g. convert it to N3. However, removing duplicates of type (2) is more complicated, because generally the way we construct the unique URIs will vary between collections.

> Anyone have recommendations on how to set up a global table of Person
> records to reference in XSLT? Or perhaps it would be easier to put in
> a temporary reference using XSLT and replace that with a global
> reference using a bit of Perl?

I wouldn't attempt to use XSLT. Removing duplicates of type (2) is an important part of mapping between collections, so I think it needs to be done in RDF using semantic web tools, as that's a key area SIMILE is investigating.

kind regards

Dr Mark H. Butler
Research Scientist
HP Labs Bristol
mark-h_butler@hp.com
Internet: http://www-uk.hpl.hp.com/people/marbut/
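
To illustrate the two levels of duplicate removal described in the message above, here is a minimal sketch in plain Python. It is not the Artstor XSLT script or any SIMILE tool: the graph is modelled as a set of triples, and the example.org URIs, the property names, the second collection ("other"), and the `same_as` table are all hypothetical. The table plays roughly the role that equivalence statements such as owl:sameAs could play in an RDF-based approach.

```python
# All URIs, property names and records below are hypothetical examples.
NAME = "http://example.org/vocab#name"
CREATOR = "http://example.org/vocab#creator"


def person_uri(collection, local_id):
    """Mint a person URI from a collection-specific identifier."""
    return f"http://example.org/{collection}/person/{local_id}"


def merge_within(records):
    """Type (1): a graph is a set of triples, so repeated statements
    about the same person URI collapse as soon as they are loaded."""
    graph = set()
    for collection, local_id, name, work in records:
        person = person_uri(collection, local_id)
        graph.add((person, NAME, name))
        graph.add((work, CREATOR, person))
    return graph


def merge_between(graph, same_as):
    """Type (2): URIs differ per collection, so an explicit equivalence
    table is needed to rewrite them to one canonical URI."""
    def canon(term):
        return same_as.get(term, term)
    return {(canon(s), p, canon(o)) for s, p, o in graph}


records = [
    # The same Artstor person appears on two works.
    ("artstor", "a42", "Claude Monet", "http://example.org/artstor/work/1"),
    ("artstor", "a42", "Claude Monet", "http://example.org/artstor/work/2"),
    # The same person again, identified differently by another collection.
    ("other", "p7", "Claude Monet", "http://example.org/other/work/9"),
]

graph = merge_within(records)

# Assumed to come from a separate matching step, not shown here.
same_as = {person_uri("other", "p7"): person_uri("artstor", "a42")}
merged = merge_between(graph, same_as)

names_before = [t for t in graph if t[1] == NAME]
names_after = [t for t in merged if t[1] == NAME]
```

In this sketch the two Artstor records produce a single name statement without any extra work, while the record from the second collection only merges once the equivalence table is supplied. Building that table is the hard part and is left open here.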
Received on Thursday, 23 October 2003 06:19:34 UTC