Re: Contd: Nice Data Cleansing Tool Demo

From: Aldo Bucchi <aldo.bucchi@gmail.com>
Date: Mon, 29 Mar 2010 15:38:02 -0400
Message-ID: <7a4ebe1d1003291238q36cf4cefw8a7dd87344cbeb65@mail.gmail.com>
To: nathan@webr3.org
Cc: Georgi Kobilarov <georgi.kobilarov@gmx.de>, Kingsley Idehen <kidehen@openlinksw.com>, public-lod@w3.org

On Mon, Mar 29, 2010 at 3:22 PM, Nathan <nathan@webr3.org> wrote:
> Georgi Kobilarov wrote:
>> Kingsley,
>>>>>> So by the time you can
>>>>>> use Pivot on SW/linked data, you will already have solved all the
>>>>>> interesting and challenging problems.
>>>>> This part is what I call an innovation slot since we have hooked it
>>>>> into our DBMS hosted faceted engine and successfully used it over
>>>>> very large data sets.
>>>> Kingsley, I'm wondering: How did you do that? I tried it myself, and
>>>> it doesn't work.
>>> Did I indicate that my demo instance was public? How did you come to
>>> overlook that?
>> I wasn't referring to a demo of yours, but to the general task of using
>> Pivot as a frontend to a faceted browsing backend engine.
>>>> Pivot can't make use of server-side faceted browsing engines.
>>> Why do you speculate? You are incorrect and Virtuoso *doing* what you
>>> claim is impossible will be emphatic proof, nice and simple.
>>> Pivot consumes data from HTTP accessible collections (which may be static
>>> or dynamic [1]). A dynamic collection is comprised of CXML resources
>>> (basically XML).
>> I don't speculate. Which parts of my "does not work" and "can't use" did
>> sound like a speculation?
>>>> You need to send *all* the data to the Pivot client, and it computes
>>>> the facets and performs any filtering operation client-side.
>>> You make a collection from a huge corpus of data (what I demonstrate) then
>>> you "Save As" (which I demonstrate as the generation point re. CXML
>>> resource) and then Pivot consumes. All the data is Virtuoso hosted.
>>> There are two things you are overlooking:
>>> 1. The dynamic collection is produced at the conclusion of Virtuoso based
>>> faceted navigation (the interactions basically describes the Facet
>>> membership to Virtuoso)
>>> 2. Pivot works with static and dynamic collections.
>>> *I specifically state, this is about using both products together to solve
>>> a major problem. #1 Faceted Browsing UX #2 Faceting over a huge data
>>> corpus.*
>>> Virtuoso is an HTTP server, it can serve a myriad of representations of
>>> data to user agents (it has its own DBMS hosted XSLT Processor and XML
>>> Schema Validator with XQuery/XPath to boot, all very old stuff).
>> Yes, you make a collection and "save as" that to CXML, exactly! That is not
>> "using Pivot as a frontend to Virtuoso". Sure, you can construct a small
>> dataset from a huge dataset using SPARQL, or your Virtuoso facet engine or
>> whatever. And then export that resulting dataset to Pivot collection XML and
>> load that CXML into Pivot. But that is very different to using Pivot as a
>> frontend to a huge data set.
>>> BTW -- how do you think Peter Haase got his variant working? I am sure he
>>> will shed identical light on the matter for you.
>> Yes, Peter, please do. From what I saw in the Fluidops demo, it works
>> exactly as I wrote above: A sparql-query constructs a small dataset from the
>> sparql endpoint, converts that via a proxy to CXML and loads it into Pivot.
>> I don't say Pivot doesn't make a nice demo, or a useful tool to explore a
>> small dataset via faceted filtering. But it's not a frontend that can be put
>> on top of a faceted browsing engine like
>> http://developer.nytimes.com/docs/article_search_api
> The last thing I want is an argument about this, but surely virtually
> every service in the world, faceted browsing included, works by querying
> a large dataset to get a smaller set of results, transforming it into
> the needed format and then displaying it? Sounds like every system I've
> ever seen, from the simple HTML view of an SQL query right up to the
> mighty Google itself.
> Maybe I'm being naive here; what am I missing?


You're not missing much. From what I see:
Georgi's point is that the level of integration is not ideal. It is
basically a "load" style integration, not a "connect" style one.
Kingsley's point is that they "can" be integrated, and he has a demo
to prove it.

Both are right ;)
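
To make the "load" vs "connect" distinction concrete, here is a rough
sketch of how I picture the two styles. Everything in it is hypothetical
(the toy corpus, the function names, the client classes); it is not how
Pivot or Virtuoso actually work internally, just an illustration of where
the faceting happens in each case.

```python
from collections import Counter

# Hypothetical toy corpus standing in for a large server-side data set.
CORPUS = [
    {"id": 1, "type": "Person", "country": "CL"},
    {"id": 2, "type": "Person", "country": "DE"},
    {"id": 3, "type": "Place", "country": "DE"},
    {"id": 4, "type": "Place", "country": "US"},
]


def matches(item, filters):
    """True if the item has the given value for every filtered facet."""
    return all(item.get(key) == value for key, value in filters.items())


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return dict(Counter(item[facet] for item in items))


# "Load" style: the server exports a subset once (think "Save As"),
# and the client computes every facet and filter locally afterwards.
def export_collection(filters):
    return [item for item in CORPUS if matches(item, filters)]


class LoadStyleClient:
    def __init__(self, items):
        self.items = items  # the whole exported subset lives client-side

    def facets(self, facet, filters=None):
        subset = [i for i in self.items if matches(i, filters or {})]
        return facet_counts(subset, facet)


# "Connect" style: the client holds no data; each interaction is a
# request to the server, which computes the facets over the full corpus.
def server_facets(facet, filters):
    return facet_counts([i for i in CORPUS if matches(i, filters)], facet)


class ConnectStyleClient:
    def facets(self, facet, filters=None):
        return server_facets(facet, filters or {})
```

In the load style the client can only ever facet over what was exported;
in the connect style every click goes back to the engine, so the corpus
size stops being the client's problem.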

I can relate to both but I lean towards Kingsley's because he is, as
usual, projecting. He knows that this integration is enough to make a
point, and that the rest will happen.
Show the value! The architecture will follow (this is what M$ does
all the time). Plus they already have a lock-in on the runtime side
and Seadragon tech, so I think they can afford to open the platform up
some more on the integration side of things.


> Many Regards,
> Nathan

Aldo Bucchi

Received on Monday, 29 March 2010 19:38:35 UTC