data integration, data sets, flamingo

Hi team,

While searching for a newer version of Graphviz this morning, and thanks to
some problem with DNS, I came across some interesting presentations by
T. Dasu at AT&T / Bell Labs about data quality and data integration:

@misc{dasu1,
	title="Domain expertise and metadata", 
	author="T. Dasu and T. Johnson",
	howpublished="\url{http://192.20.225.10/topics/SDM_metadata.ppt}"}

@misc{dasu2,
	title="Database solutions to data quality", 
	author="T. Dasu and T. Johnson",
	howpublished="\url{http://192.20.225.10/topics/SDM_DB.ppt}"} 

I'm working through the references in these presentations at the moment,
trying where possible to locate web versions of the citations. When I've
finished this, I'll send them to the list.

However, while doing that, I came across this project:
http://www-db.ics.uci.edu/pages/flamingo/
"The FLAMINGO PROJECT: CLEANSING DATA TO IMPROVE INFORMATION QUALITY
Database Group, UC Irvine"

Interestingly, they have some data sets. These are not directly related to
SIMILE, but it might be interesting to make them available as RDF (assuming
they are not available already; perhaps Eric can comment on this). See
http://www-db.ics.uci.edu/pages/flamingo/Dataset.htm

Specifically, it would be interesting to investigate how hard it is to merge
the two movie-related databases.
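
To give a feel for what such a merge might involve, here is a rough sketch
of approximate title matching in Python. It is only an illustration under
some assumptions: that both databases expose a movie title field, that a
generic string similarity from the standard difflib module is good enough
for a first pass, and that 0.85 is a sensible cut-off. The function names
and the threshold are hypothetical, and this is not the method used by the
Flamingo project.

```python
from difflib import SequenceMatcher


def normalise(title):
    """Lower-case a title and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def similarity(a, b):
    """Similarity in [0, 1] between two normalised titles."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def match_titles(left, right, threshold=0.85):
    """Pair each title in `left` with its most similar title in `right`.

    A pair is kept only if its similarity reaches `threshold`
    (a hypothetical cut-off, chosen for illustration).
    """
    pairs = []
    for a in left:
        # default=None covers the case where `right` is empty
        best = max(right, key=lambda b: similarity(a, b), default=None)
        if best is not None and similarity(a, best) >= threshold:
            pairs.append((a, best))
    return pairs
```

Calling match_titles() with the title columns of the two databases would
return candidate pairs to review by hand. A real merge would probably need
to compare more fields (year, director) and cope with much larger inputs
than this quadratic loop can handle.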

Dr Mark H. Butler
Research Scientist                HP Labs Bristol
mark-h_butler@hp.com
Internet: http://www-uk.hpl.hp.com/people/marbut/

Received on Friday, 7 November 2003 09:49:44 UTC