
Re: Contd: Nice Data Cleansing Tool Demo

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Mon, 29 Mar 2010 17:22:46 -0400
Message-ID: <4BB11A26.6000801@openlinksw.com>
To: Georgi Kobilarov <georgi.kobilarov@gmx.de>
CC: public-lod@w3.org
Georgi Kobilarov wrote:
> Kingsley,
>
>   
>>>>> So by the time you can
>>>>> use Pivot on SW/linked data, you will already have solved all the
>>>>> interesting and challenging problems.
>>>>>
>>>> This part is what I call an innovation slot since we have hooked it
>>>> into our DBMS hosted faceted engine and successfully used it over
>>>> very large data sets.
>>>>
>>> Kingsley, I'm wondering: How did you do that? I tried it myself, and
>>> it doesn't work.
>>>       
>> Did I indicate that my demo instance was public? How did you come to
>> overlook that?
>>     
>
> I wasn't referring to a demo of yours, but to the general task of using
> Pivot as a frontend to a faceted browsing backend engine. 
>   
Re. the general task, it can complement a back-end.

Have you ever encountered an old concept from the tabular data 
representation realm (e.g., RDBMS) called "Mirrored Cursors"? Maybe 
you've encountered "Detached Rowsets" and schemes that also include 
delta handling between the client and the server.

The fundamental point I am making to you is simply this: Pivot is a 
powerful complement to an HTTP server that can deliver faceted 
navigation natively (like Virtuoso). The end result is this: you can 
get the server to do some work (localize the first phase of the Faceted 
Search and Find against a massive data corpus) and then have the client 
handle the remainder (nice Visual UX for insight discovery).
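
A minimal sketch of the two-phase split described above, written in Python purely for illustration. The corpus, the field names, and the functions `server_select` and `client_facets` are hypothetical; nothing here is Virtuoso's or Pivot's actual API. The server narrows a large corpus using facet constraints, and the client only computes facet counts over the small set it receives.

```python
from collections import Counter

# Hypothetical stand-in for a large server-hosted corpus.
CORPUS = [
    {"name": "Berlin", "type": "City", "country": "Germany"},
    {"name": "Munich", "type": "City", "country": "Germany"},
    {"name": "Paris", "type": "City", "country": "France"},
    {"name": "Rhine", "type": "River", "country": "Germany"},
]


def server_select(corpus, constraints, limit=1000):
    """Phase 1 (server side): keep items that match every facet constraint."""
    matches = [
        item for item in corpus
        if all(item.get(key) == value for key, value in constraints.items())
    ]
    # Cap the result so the client never receives the whole corpus.
    return matches[:limit]


def client_facets(items, facet_names):
    """Phase 2 (client side): count facet values over the received subset."""
    return {
        name: Counter(item[name] for item in items if name in item)
        for name in facet_names
    }


subset = server_select(CORPUS, {"type": "City"})
counts = client_facets(subset, ["country"])
```

The point of the split is that `client_facets` never sees the full corpus, only what `server_select` chose to send.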
>
>   
>>> Pivot can't make use of server-side faceted browsing engines.
>>>
>>>       
>> Why do you speculate? You are incorrect and Virtuoso *doing* what you
>> claim is impossible will be emphatic proof, nice and simple.
>>
>> Pivot consumes data from HTTP accessible collections (which may be static
>> or dynamic [1]). A dynamic collection is comprised of CXML resources
>> (basically XML).
>>
>
> I don't speculate. Which parts of my "does not work" and "can't use" did
> sound like a speculation?  
>
>   
You explicitly said: "Pivot can't make use of server-side faceted 
browsing engines".

I am saying, based on my earlier comments (clarified further above re. 
mirrored cursor anecdote): It can, will, and you shall see re. Virtuoso.

>  
>   
>>> You need to send *all* the data to the Pivot client, and it computes
>>> the facets and performs any filtering operation client-side.
>>>       
>> You make a collection from a huge corpus of data (what I demonstrate) then
>> you "Save As" (which I demonstrate as the generation point re. CXML
>> resource) and then Pivot consumes. All the data is Virtuoso hosted.
>>
>> There are two things you are overlooking:
>>
>> 1. The dynamic collection is produced at the conclusion of Virtuoso based
>> faceted navigation (the interactions basically describes the Facet
>> membership to Virtuoso)
>> 2. Pivot works with static and dynamic collections.
>>
>> *I specifically state, this is about using both products together to solve
>> a major problem. #1 Faceted Browsing UX #2 Faceting over a huge data
>> corpus.*
>>
>> Virtuoso is an HTTP server, it can serve a myriad of representations of
>> data to user agents (it has its own DBMS hosted XSLT Processor and XML
>> Schema Validator with XQuery/XPath to boot, all very old stuff).
>>     
>
> Yes, you make a collection and "save as" that to CXML, exactly! That is not
> "using Pivot as a frontend to Virtuoso". 
I am starting from the Server not the Client.

I am starting from the Server because the Client can't handle the data 
corpus, and wasn't built with that in mind. It was built to consume a 
specific type of resource collection (static or dynamic) via HTTP, end of 
story.

Where I start from doesn't invalidate Pivot as a front-end to Virtuoso; 
the entire operation can take place within the "Pivot Browser" (Pivot 
is an HTTP user agent that operates on a specific data representation 
format).
> Sure, you can construct a small
> dataset from a huge dataset using SPARQL, or your Virtuoso facet engine or
> whatever. And then export that resulting dataset to Pivot collection XML and
> load that CXML into Pivot. 
I am not talking about "Export" in the manner you characterize. I am 
talking about an HTTP conversation that results in a CXML-based resource 
being dispatched from a Server to a User Agent, RESTfully.


> But that is very different to using Pivot as a
> frontend to a huge data set. 
>   
In your world view and eyes, maybe. Absolutely not the case in mine.

I can interact with Virtuoso from start to finish from within Pivot 
(never leaving Pivot). I start by making HTTP requests from Pivot, and 
the entire exercise concludes with a CXML representation of the 
collection assembled by Virtuoso (dynamically).
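
To make "a collection assembled dynamically" concrete, here is a rough Python sketch that turns the facet constraints in a request's query string into an XML collection document. It is an illustration under stated assumptions, not how Virtuoso does it: the data, the function `collection_for`, and the element and attribute names are simplified placeholders and do not follow the real CXML schema.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

# Hypothetical server-hosted items; a real server would query its own store.
ITEMS = [
    {"name": "Berlin", "type": "City", "country": "Germany"},
    {"name": "Munich", "type": "City", "country": "Germany"},
    {"name": "Paris", "type": "City", "country": "France"},
]


def collection_for(query_string, items=ITEMS):
    """Build an XML collection from facet constraints in a query string.

    Element names are placeholders, not the actual CXML vocabulary.
    """
    # parse_qs returns lists of values; take the first value per facet.
    constraints = {key: values[0] for key, values in parse_qs(query_string).items()}
    root = ET.Element("Collection", Name="dynamic")
    items_el = ET.SubElement(root, "Items")
    for index, item in enumerate(items):
        if all(item.get(key) == value for key, value in constraints.items()):
            item_el = ET.SubElement(items_el, "Item", Id=str(index), Name=item["name"])
            for facet, value in item.items():
                if facet != "name":
                    ET.SubElement(item_el, "Facet", Name=facet, Value=value)
    return ET.tostring(root, encoding="unicode")


# The query string stands in for the path of an HTTP GET made by the client.
xml_doc = collection_for("type=City&country=Germany")
```

In a real deployment the returned string would be the body of the HTTP response, so each distinct request yields a different collection without any separate export step.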

>
>   
>> BTW -- how do you think Peter Haase got his variant working? I am sure he
>> will shed identical light on the matter for you.
>>     
>
> Yes, Peter, please do. From what I saw in the Fluidops demo, it works
> exactly as I wrote above: A sparql-query constructs a small dataset from the
> sparql endpoint, converts that via a proxy to CXML and loads it into Pivot. 
>
> I don't say Pivot doesn't make a nice demo, or a useful tool to explore a
> small dataset via faceted filtering. But it's not a frontend that can be put
> on top of a faceted browsing engine like
> http://developer.nytimes.com/docs/article_search_api
>   
BTW - I am not new to pivoting, cubes, cursors, delta syncs between 
remote data sets. These are not Linked Data domain concepts, they are 
very old.

I suggest you wait for an actual release of Virtuoso and Dynamic CXML 
based collections before analyzing what's actually possible re. Pivot as 
a front-end to Virtuoso hosted data.

Kingsley

> Georgi
>
> --
> Georgi Kobilarov
> Uberblic Labs Berlin
> http://blog.georgikobilarov.com
>
>
>
>   


-- 

Regards,

Kingsley Idehen	      
President & CEO 
OpenLink Software     
Web: http://www.openlinksw.com
Weblog: http://www.openlinksw.com/blog/~kidehen
Twitter/Identi.ca: kidehen 
Received on Monday, 29 March 2010 21:23:16 UTC
