- From: Nick Matsakis <matsakis@mit.edu>
- Date: Tue, 18 Nov 2003 15:10:39 -0500 (EST)
- To: "Butler, Mark" <Mark_Butler@hplb.hpl.hp.com>
- Cc: www-rdf-dspace@w3.org
On Fri, 7 Nov 2003, Butler, Mark wrote:

> Interestingly [flamingo has] some data sets, not directly related to
> SIMILE, but it might be interesting to make them available as RDF ...
> Specifically it would be interesting to investigate how hard it is to
> merge the two movie related databases.

I took a look at the Flamingo datasets. The imdb "dataset" is just a
flat file of about 54,000 text names, one per line. There appear to be
duplicate entries in the file, but this is of basically no use because
the duplicates aren't labelled. How can you tell how well you are doing
if you don't have a gold standard to match against?

The other movie dataset is more interesting because it has a relational
structure, but I'm not really sure what is in it because it appears to
be a Microsoft Access database.

Anyway, thanks for the pointer to that project, and the other references
you sent a while back. They are very helpful to me.

Nick Matsakis

Received on Tuesday, 18 November 2003 15:12:03 UTC