Re: Contd: Nice Data Cleansing Tool Demo from Nathan on 2010-03-29 (public-lod@w3.org from March 2010)

From: Nathan <nathan@webr3.org>
Date: Mon, 29 Mar 2010 20:22:43 +0100
To: Georgi Kobilarov <georgi.kobilarov@gmx.de>
CC: 'Kingsley Idehen' <kidehen@openlinksw.com>, public-lod@w3.org
Message-ID: <4BB0FE03.5050004@webr3.org>

Georgi Kobilarov wrote:
> Kingsley,
> 
>>>>> So by the time you can
>>>>> use Pivot on SW/linked data, you will already have solved all the
>>>>> interesting and challenging problems.
>>>>>
>>>> This part is what I call an innovation slot since we have hooked it
>>>> into our DBMS hosted faceted engine and successfully used it over
>>>> very large data sets.
>>> Kingsley, I'm wondering: How did you do that? I tried it myself, and
>>> it doesn't work.
>> Did I indicate that my demo instance was public? How did you come to
>> overlook that?
> 
> I wasn't referring to a demo of yours, but to the general task of using
> Pivot as a frontend to a faceted browsing backend engine. 
> 
> 
>>> Pivot can't make use of server-side faceted browsing engines.
>>>
>> Why do you speculate? You are incorrect and Virtuoso *doing* what you
>> claim is impossible will be emphatic proof, nice and simple.
>>
>> Pivot consumes data from HTTP accessible collections (which may be static
>> or dynamic [1]). A dynamic collection is comprised of CXML resources
>> (basically XML).
> 
> I don't speculate. Which parts of my "does not work" and "can't use" did
> sound like a speculation?  
> 
>  
>>> You need to send *all* the data to the Pivot client, and it computes
>>> the facets and performs any filtering operation client-side.
>> You make a collection from a huge corpus of data (what I demonstrate) then
>> you "Save As" (which I demonstrate as the generation point re. CXML
>> resource) and then Pivot consumes. All the data is Virtuoso hosted.
>>
>> There are two things you are overlooking:
>>
>> 1. The dynamic collection is produced at the conclusion of Virtuoso based
>> faceted navigation (the interactions basically describes the Facet
>> membership to Virtuoso)
>> 2. Pivot works with static and dynamic collections.
>>
>> *I specifically state, this is about using both products together to solve
>> a major problem. #1 Faceted Browsing UX #2 Faceting over a huge data
>> corpus.*
>>
>> Virtuoso is an HTTP server, it can serve a myriad of representations of
>> data to user agents (it has its own DBMS hosted XSLT Processor and XML
>> Schema Validator with XQuery/XPath to boot, all very old stuff).
> 
> Yes, you make a collection and "save as" that to CXML, exactly! That is not
> "using Pivot as a frontend to Virtuoso". Sure, you can construct a small
> dataset from a huge dataset using SPARQL, or your Virtuoso facet engine or
> whatever. And then export that resulting dataset to Pivot collection XML and
> load that CXML into Pivot. But that is very different to using Pivot as a
> frontend to a huge data set. 
> 
> 
>> BTW -- how do you think Peter Haase got his variant working? I am sure he
>> will shed identical light on the matter for you.
> 
> Yes, Peter, please do. From what I saw in the Fluidops demo, it works
> exactly as I wrote above: A sparql-query constructs a small dataset from the
> sparql endpoint, converts that via a proxy to CXML and loads it into Pivot. 
> 
> I don't say Pivot doesn't make a nice demo, or a useful tool to explore a
> small dataset via faceted filtering. But it's not a frontend that can be put
> on top of a faceted browsing engine like
> http://developer.nytimes.com/docs/article_search_api
> 

The last thing I want is an argument about this, but surely virtually
every service in the world, faceted browsing included, works by querying
a large dataset to get a smaller set of results, transforming it into
the needed format and then displaying it? That sounds like every system
I've ever seen, from the simple HTML view of an SQL query right up to the
mighty Google itself.

Maybe I'm being naive here; what am I missing?
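
For concreteness, the "query, transform, display" pattern I have in mind
can be sketched roughly as below. This is only an illustration under my
own assumptions: the table, the column names and the function names are
all made up, SQLite stands in for whatever backend holds the large
dataset, and the XML produced is a generic placeholder, not the real CXML
schema that Pivot expects.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_demo_db():
    """Hypothetical stand-in for the 'large dataset' held by a backend."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (name TEXT, category TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [("alpha", "news"), ("beta", "sport"), ("gamma", "news")],
    )
    return conn


def query_subset(conn, facet_value, limit=100):
    """Step 1: ask the backend for only the rows matching one facet value."""
    cur = conn.execute(
        "SELECT name, category FROM items "
        "WHERE category = ? ORDER BY name LIMIT ?",
        (facet_value, limit),
    )
    return cur.fetchall()


def to_xml(rows):
    """Step 2: transform the small result set into the client's format.

    Generic XML for illustration only; this is NOT the real CXML schema.
    """
    root = ET.Element("collection")
    for name, category in rows:
        item = ET.SubElement(root, "item", {"name": name})
        ET.SubElement(item, "facet", {"name": "category", "value": category})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Step 3: hand the transformed subset to whatever displays it.
    conn = build_demo_db()
    print(to_xml(query_subset(conn, "news")))
```

Whether the filtering in step 1 is re-run on the server for every facet
click, or done once before the export, is as far as I can tell the point
the two of you disagree on.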

Many Regards,

Nathan
Received on Monday, 29 March 2010 19:23:27 UTC