Re: Nice Data Cleansing Tool Demo

On Mar/29/10 10:01 am, Kingsley Idehen wrote:
> David Huynh wrote:
>> On Mar/29/10 12:31 am, Kingsley Idehen wrote:
>>> All,
>>>
>>> A very nice data cleansing tool from David and Co. at Freebase.
>>>
>>> CSVs are clearly the dominant data format in the structured open 
>>> data realm. This tool deals with ETL very well. Of course, for those 
>>> who appreciate OWL, a lot of what's demonstrated in this demo is 
>>> also achievable via "context rules". Bottom line (imho), nice tool 
>>> that will only aid improving Web of Linked Data quality at the data 
>>> set production stage.
>>>
>>> Links:
>>>
>>> 1. http://vimeo.com/10081183 -- Freebase Gridworks
>>>
>> Thanks, Kingsley. The second screencast, by Stefano Mazzocchi, also 
>> demonstrates a few other interesting features:
>>
>>     http://www.vimeo.com/10287824
>>
>> David
> David,
>
> Yes, very nice!
>
> Now here is the obvious question, re. broader realm of faceted data 
> navigation, have you guys digested the underlying concepts 
> demonstrated by Microsoft Pivot?
>

I've seen the TED talk on Pivot. It's a very well-polished 
implementation of faceted browsing. The Seadragon technology integration 
and animations are well executed. As far as "underlying concepts" in 
faceted browsing go, I haven't noticed anything novel there.

One thing to note: in each Pivot demo, the data is of exactly one type, 
say, people. So it seems that with Microsoft Pivot you can't pivot from 
one type to another, say, from people to their companies. You can't do 
the example I used for Parallax: US presidents -> children -> schools. 
Or skyscrapers -> architects -> other buildings. So from what I've 
seen, Microsoft Pivot as it currently stands cannot be used for 
browsing graphs, because it cannot pivot over graph links.
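
To make "pivoting over graph links" concrete, here is a minimal sketch. 
It assumes a toy graph stored as (subject, property, object) triples; 
the data, the property names, and the build_index/pivot helpers are all 
hypothetical illustrations, not code from Parallax or Pivot.

```python
from collections import defaultdict

# Toy graph as (subject, property, object) triples.
# All identifiers here are made up for illustration.
TRIPLES = [
    ("president_a", "child", "child_1"),
    ("president_a", "child", "child_2"),
    ("president_b", "child", "child_3"),
    ("child_1", "school", "school_x"),
    ("child_2", "school", "school_y"),
    ("child_3", "school", "school_x"),
]


def build_index(triples):
    """Map each (subject, property) pair to the set of its objects."""
    index = defaultdict(set)
    for subject, prop, obj in triples:
        index[(subject, prop)].add(obj)
    return index


def pivot(index, items, prop):
    """Follow `prop` from every item in the current set and return the
    union of the targets: a set-to-set move along one kind of link."""
    result = set()
    for item in items:
        result |= index.get((item, prop), set())
    return result


index = build_index(TRIPLES)

# presidents -> children -> schools, moving across types at each step.
presidents = {"president_a", "president_b"}
children = pivot(index, presidents, "child")
schools = pivot(index, children, "school")
print(sorted(schools))
```

Each pivot call lands on a set of a different type (people, then 
schools), which is the step a browser limited to one type per 
collection has no way to express.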

Furthermore, I believe that to get Pivot to perform well, you need a 
cleaned up, *homogeneous* data set, presumably of small size (see their 
Wikipedia example in which they picked only the top 500 most visited 
articles). SW/linked data in its natural habitat, however, is rarely 
that cleaned up and homogeneous ... So by the time you can use Pivot on 
SW/linked data, you will already have solved all the interesting and 
challenging problems.

I do applaud their recent offering of the Pivot widget for embedding 
into any site. That should make faceted browsing more 
accessible to web authors, as Exhibit has done. Pivot is way more 
polished and hopefully scales better than Exhibit, although Exhibit is 
more malleable as a piece of software.

David

Received on Monday, 29 March 2010 02:01:28 UTC