- From: David Huynh <dfhuynh@alum.mit.edu>
- Date: Tue, 30 Mar 2010 07:18:04 +0900
- To: Aldo Bucchi <aldo.bucchi@gmail.com>
- CC: Kingsley Idehen <kidehen@openlinksw.com>, "public-lod@w3.org" <public-lod@w3.org>
Hi Aldo,

On Mar/30/10 1:46 am, Aldo Bucchi wrote:
> Hi David,
>
> I love it and I NEED it ;)
> Awesome work, really.
>
> I heard it will be opensource so I will probably be able to extend it
> myself,

Yup, it'll be open source. Clean data sets are all clean the same way, but each dirty data set is dirty in its own way. Which is why Gridworks needs all the open source contributions in order to cover as many different kinds of data dirtiness as possible. :-)

> but here are some ideas for (missing?) features:
> * Importing custom Lookups/Dictionaries ( to go from text to IDs or
> the other way around ). Maybe this is possible using a different hook
> for the reconciliation mechanism.
> * Related: Plug in other reconciliation services ( not sure how this
> stands up to freebase biz alignment )
>

Definitely. Right now Gridworks is hooked up to 2 services: the Freebase text search service (called "relevance") and the experimental proper reconciliation service. It makes sense to be able to plug in other services as well.

> * Command line engine. To add a GW project as a step in a traditional
> transformation job and execute steps sequentially.
>

We've thought of that, too, but haven't implemented it. That shouldn't be too hard.

> * Expose Gazetteers ( dictionaries ) generated within the tool ( when
> equating facets )
>

That makes sense. I'll think more about how to support that.

David
Received on Monday, 29 March 2010 22:18:34 UTC