Re: Yet Another LOD cloud browser

David Huynh wrote:
> Kingsley Idehen wrote:
>> David Huynh wrote:
>>> Sherman,
>>>
>>> Good to see more faceted browsing work on LOD!
>>>
>>> Will you be considering showing actual data or is showing the schema 
>>> your end goal? For example, I typed in "Microsoft", and I couldn't 
>>> seem to get any information about Microsoft. I only saw that there 
>>> are some other things related to Microsoft, e.g., people, but I 
>>> couldn't see what those things are (e.g., specifically, such as Bill 
>>> Gates). I'd like to show you what I mean but I can't seem to get a 
>>> permanent link to the state of the app.
>>>
>>> David
>>>
>>> Sherman Monroe wrote:
>>>> Hi All,
>>>>
>>>> Taking inspiration from Longwell[1] and Parallax[2], I present yet 
>>>> another linked data browser[3]. It uses the Virtuoso Facets Web 
>>>> service API [4] and runs against the public LOD cloud instance of 
>>>> Virtuoso [5]. I believe such faceted search UIs could be a nice 
>>>> compromise between SPARQL and a full-blown Cypher-based NL user 
>>>> interface[6].
>>>>
>>>> Feedback appreciated.
>>>>
>>>> Hints:
>>>>
>>>> - Click a breadcrumb at the top to navigate your query path
>>>> - Click "Your query" to view the filter details, click the nodes 
>>>> there to navigate the path, click the icons there to modify the filter
>>>> - Click the *green plus sign button* to add a filter
>>>> - Click the *blue undo button* to unbind a node value
>>>>
>>>> Notes:
>>>>
>>>> I was amazed at the many instances where I got better results from 
>>>> the LOD dataspace than from Google/Technorati/Wikipedia. For example, 
>>>> searching Monopoly, then filtering to the 
>>>> /umbel-sc:MentalSituations/ category gave me a nice (and in some 
>>>> cases humorous) list of Monopoly knock-offs. I tried finding such a 
>>>> list on the WWW with no luck 
>>>> <http://www.google.com/search?q=Monopoly%20knockoffs>. Kingsley 
>>>> tells me that Entity Rank [4] has to do with this, but I wonder 
>>>> whether this quality will stick as the cloud increases.
>>>>
>>>>
>>>> References:
>>>> [1] http://simile.mit.edu/wiki/Longwell
>>>> [2] http://mqlx.com/~david/parallax/ 
>>>> [3] http://ec2.monrai.com:8890/facets
>>>> [4] http://lod.openlinksw.com/fct/facet_doc.html
>>>> [5] http://lod.openlinksw.com
>>>> [6] http://cypher.monrai.com
>>>>
>>>> Enjoy,
>>>>
>>>> -- 
>>>>
>>>> Thanks,
>>>> -sherman
>>>>
>>>> I pray that you may prosper in all things and be healthy, even as 
>>>> your soul prospers
>>>> (3 John 1:2)
>>>
>>>
>>>
>> David,
>>
>> Sherman's Razorbase browser is using the REST API provided by the 
>> Virtuoso instance at <http://lod.openlinksw.com>. All you have to do 
>> re. "Microsoft" is go there and type in the pattern: Microsoft
>>
>> Once you do that, you will see an initial Entity Ranked page of 
>> associated Entity URIs, text excerpts from literal values, plus bars 
>> indicating the Entity Rank and text pattern match frequencies that 
>> drive the ordering. At this stage, assuming you don't already see 
>> the Entity URI for "Microsoft", you can pivot on Type 
>> (razorbase:Category) or Properties (razorbase:Information) to narrow 
>> the focus of your quest.
>>
>> The difference right now is that Sherman hasn't implemented the 
>> critical "retry" and "timeout" functionality that lies at the core of 
>> this system. Once this is implemented, our basic UI and his should 
>> produce similar results, the only variance being that our UI resides 
>> inside Virtuoso while Razorbase makes RESTful calls from the 
>> outside :-)
>>
>> I hope this clarifies things.
>>
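To make the "retry" and "timeout" idea above concrete, here is a minimal sketch of a client that retries a call with a progressively longer timeout. It is not the Virtuoso Facets API or Razorbase code: `fetch_with_retry`, `slow_service`, and all parameter names are hypothetical, and the service is a local stand-in so the example runs without a network.

```python
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[float], dict],
    initial_timeout: float = 1.0,
    max_attempts: int = 4,
    backoff: float = 2.0,
) -> Optional[dict]:
    """Call fetch(timeout); on TimeoutError, retry with a longer timeout.

    Returns the first successful result, or None if every attempt timed out.
    """
    timeout = initial_timeout
    for _ in range(max_attempts):
        try:
            return fetch(timeout)
        except TimeoutError:
            # Give the server more time on the next attempt.
            timeout *= backoff
    return None


def slow_service(timeout: float) -> dict:
    """Hypothetical stand-in for a remote facet query that needs 3 seconds."""
    if timeout < 3.0:
        raise TimeoutError(f"query not finished within {timeout}s")
    return {"query": "Microsoft", "timeout_used": timeout}


if __name__ == "__main__":
    print(fetch_with_retry(slow_service))
```

With the defaults, the stand-in service is tried at 1, 2, and then 4 seconds, and the third attempt succeeds. A real client would pass a function that issues the HTTP request with the given timeout.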
> Thanks for the clarification, Kingsley.
>
> So, I typed in "Microsoft" and got to
>    http://lod.openlinksw.com/fct/facet.vsp?cmd=text&sid=60306
> Which doesn't look like a permanent link for referring to the query I 
> just typed. Copying and pasting that URL to a different browser yields
>
>    An unexpected error was encountered while processing your request.
>    Diagnostics
>
>    SQLSTATE: 22023
>
>    SQLMSG  : SR016: Function length needs a string or array as its argument,
>    not an argument of type 189 (= INTEGER)
>
> But anyhow, it's not a big deal.
There is a permalink feature via the "Save" query link (could be 
situated better, but it's there).
>
> My next question is, how does a person know which one among those 
> results is *the* Microsoft? 
Great question.

Disambiguation by Entity Type and/or Properties is one of the essential 
parts of this system. Just click on "Type" to switch focus to entities 
filtered by "Type". You can also take shortcuts with "Properties" or, 
more specifically, those with Distinct Value Counts, Values, etc.

> There are half a dozen dbpedia:Microsoft, so which one is the one? 
> And if they are all the same thing, why are there multiple copies? Is 
> it a glitch in the browsing engine or a glitch in the data? Also, I 
> suspect that a random user might not be aware of dbpedia at this time, 
> so perhaps "dbpedia:Microsoft" might not sound like *the* Microsoft, 
> at least without any further detail. Maybe one copy of  "Microsoft 
> Corporation (source: Wikipedia, data governance: Dbpedia)" might sound 
> more comprehensible? Better yet, an image of Microsoft logo next to 
> that would make things crystal clear. Even better would be to see the 
> description, "Microsoft Corporation is an American multinational 
> computer technology corporation, which rose to dominate the home 
> computer operating system market with MS-DOS in the mid-1980s, 
> followed by the Windows line of operating systems." below it.
There are half a dozen Entities across N graphs in the Quad Store. The 
UI issue here is that we don't show the source Graphs in the results 
page. The reason: we know we can actually provide distinct results more 
cheaply than listing the Graph Names, etc.
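
As a rough illustration of why the same entity can show up several times, here is a minimal sketch over an in-memory list of quads. The graph names, the sample data, and both function names are made up for the example; this is not how Virtuoso stores or queries quads.

```python
from collections import defaultdict

# Hypothetical quads: (graph, subject, predicate, object)
QUADS = [
    ("http://example.org/graph/a", "dbpedia:Microsoft", "rdfs:label", "Microsoft"),
    ("http://example.org/graph/b", "dbpedia:Microsoft", "rdfs:label", "Microsoft"),
    ("http://example.org/graph/b", "dbpedia:Microsoft", "rdf:type", "dbpedia-owl:Company"),
    ("http://example.org/graph/c", "dbpedia:Microsoft_Office", "rdfs:label", "Microsoft Office"),
]


def matches_per_graph(quads, text):
    """One row per (graph, subject) whose object contains text.

    The same subject is repeated once for every graph it appears in.
    """
    rows = []
    for graph, subject, _predicate, obj in quads:
        if text.lower() in obj.lower() and (graph, subject) not in rows:
            rows.append((graph, subject))
    return rows


def distinct_entities(quads, text):
    """One row per subject, keeping the source graphs as a side list."""
    sources = defaultdict(set)
    for graph, subject, _predicate, obj in quads:
        if text.lower() in obj.lower():
            sources[subject].add(graph)
    return {subject: sorted(graphs) for subject, graphs in sources.items()}


if __name__ == "__main__":
    print(len(matches_per_graph(QUADS, "Microsoft")))
    print(distinct_entities(QUADS, "Microsoft"))
```

In this toy data the per-graph listing repeats dbpedia:Microsoft once per graph, while the distinct view collapses it to a single entry and keeps the graph names available separately, which is roughly the kind of information a "where did this come from" view needs.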

Your timing is borderline impeccable: we will actually be releasing the 
Distinct optimization that showcases what I mean. Anyway, for now, when 
you select one of the Microsofts from the DBpedia graphs, click on the 
"Stats" link; it will give you a back door view of where the data has 
come from.

As always, enjoy chatting with you :-)

Kingsley
>
> David
>
>
>


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com

Received on Friday, 15 May 2009 18:36:46 UTC