Re: Using Entity Type and Properties to disambiguate search

My view is that in the Web of Data, there are different search issues, and they can get confused.

 1.  I have a string which describes something for me, and I want to find URIs of things that might be the entity of which I am thinking. From these, I may need to engage in a process of choosing the one(s) I meant.
 2.  I want to find out what the URI means, and I can do this by resolving it.
 3.  I have a URI, as from (1), and now I need to find where there is RDF about that URI.
 4.  I may want to look up synonyms.
 5.  I get the RDF from the URIs I have found, again by resolution or SPARQL queries, etc.

Things like Razorbase and okkam help with (1);
(2) is trivial, but is the real strength of the Web of Data;
Sindice is a great help in (3);
(4) can use sources similar to those for (1), and of course can also use sameAs;
(5) can be straightforward, but also opens up a world of browsing etc.
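
As a rough sketch of what "resolving" means in (2) and (5): the client asks for the URI over HTTP and says it would like RDF back. The URI below is hypothetical, and the sketch assumes the server does content negotiation on the Accept header; it only builds the request and does not send it.

```python
import urllib.request


def build_rdf_request(uri, accept="application/rdf+xml"):
    """Build (but do not send) an HTTP request asking for RDF about `uri`."""
    return urllib.request.Request(uri, headers={"Accept": accept})


# Hypothetical URI, used only for illustration.
req = build_rdf_request("http://example.org/id/kiwi")

# To actually resolve it, one would then do something like:
#   with urllib.request.urlopen(req) as response:
#       rdf_bytes = response.read()
```

Whether the bytes that come back are RDF/XML, Turtle or something else depends on the server, so a real client would check the Content-Type of the response before parsing.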

If an end user thinks of "searching the semantic web", they expect to go through all that, and get back some text.
But I certainly see (1) and (3) and probably (4) as reasonably separate searching tasks.

One reason the functions can get confused is that during (1) you need to give the user some idea of the meanings of the URIs being suggested, so they can make an informed decision.
But I think that once the user has decided, working out what to do with the information is better treated as a separate process.
Best
Hugh

On 05/06/2009 23:36, "Sherman Monroe" <sdmonroe@gmail.com> wrote:

[Apologies for multiple posts, I didn't realize a new thread had been started for this topic]

Chris, All

Try this:

- Go to razorbase.com <http://razorbase.com/>
- Enter 'kiwi'
- Click the magnifying glass icon on the button bar to view categories. There, you should see all categories for things named Kiwi.
- Click the blue right arrow icon next to umble-sc:Birds, now you have Birds named Kiwi
- Now click the ID cards icon on the button bar, now you have all sameAs resources for those birds
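
The last step can be pictured as collecting everything reachable over sameAs links, in either direction. This is only a sketch of that idea, not how razorbase does it; the identifiers and pairs below are made up.

```python
from collections import defaultdict


def same_as_closure(pairs, start):
    """Return every identifier linked to `start` by a chain of sameAs pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        # sameAs is symmetric, so record the link in both directions.
        graph[a].add(b)
        graph[b].add(a)

    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


# Hypothetical sameAs statements from three imaginary datasets.
pairs = [
    ("ex:a/Kiwi", "ex:b/kiwi_bird"),
    ("ex:b/kiwi_bird", "ex:c/Apteryx"),
    ("ex:a/Kiwifruit", "ex:c/Actinidia"),
]
bird_ids = same_as_closure(pairs, "ex:a/Kiwi")
```

Here bird_ids holds the three bird identifiers and leaves out the fruit ones, since no sameAs chain connects the two groups.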

Here are other walkthroughs/examples <http://www.slideshare.net/tag/razorbase> .

These examples demonstrate that with linked data, guessed ranking is obsolete to an extent, or is at least replaced by user- and context-specific filtering and sorting of results. The query can be ambiguous and the results heterogeneous, and that's OK, because the user is able to define a path to the desired subset of results by applying filters for categories and property values of sets. In this way, traversing the linked data space is like finding a file on your computer, except that file browsers only have a one-dimensional filter (i.e. two directions: parent and child directory), whereas the linked data space has n-dimensional filters (with each dimension having two directions: to-subject and to-object).
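
To make the filtering idea concrete, here is a toy sketch over a handful of made-up triples. It is not the Facets API or razorbase's implementation, and it only shows one direction (from a property value back to its subjects); the names and data are purely illustrative.

```python
RDF_TYPE = "rdf:type"

# Hypothetical data: three different things that all carry the label "Kiwi".
TRIPLES = [
    ("ex:kiwi_bird", "rdfs:label", "Kiwi"),
    ("ex:kiwi_bird", RDF_TYPE, "ex:Bird"),
    ("ex:kiwi_bird", "ex:nativeTo", "ex:New_Zealand"),
    ("ex:kiwi_fruit", "rdfs:label", "Kiwi"),
    ("ex:kiwi_fruit", RDF_TYPE, "ex:Fruit"),
    ("ex:kiwi_person", "rdfs:label", "Kiwi"),
    ("ex:kiwi_person", RDF_TYPE, "ex:Person"),
    ("ex:kiwi_person", "ex:nativeTo", "ex:New_Zealand"),
]


def subjects_with(triples, prop, value, within=None):
    """Subjects having (prop, value), optionally restricted to the set `within`."""
    return {
        s for s, p, o in triples
        if p == prop and o == value and (within is None or s in within)
    }


def facet_counts(triples, prop, within):
    """Count the values of `prop` across the current result set."""
    counts = {}
    for s, p, o in triples:
        if p == prop and s in within:
            counts[o] = counts.get(o, 0) + 1
    return counts


named_kiwi = subjects_with(TRIPLES, "rdfs:label", "Kiwi")              # ambiguous query
categories = facet_counts(TRIPLES, RDF_TYPE, named_kiwi)               # categories on offer
birds = subjects_with(TRIPLES, RDF_TYPE, "ex:Bird", within=named_kiwi) # user picks Birds
```

The ambiguous query returns all three things, the category counts show the user what is available, and one filter narrows the set to the bird without any ranking being involved.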

In the midst of this great topic, I would like to again encourage you to have a good look at the Facets API work, as I truly believe this has the potential to become the standard approach for interacting with linked data bases.

Enjoy,
-sherman

On Thu, Jun 4, 2009 at 10:48 PM, Kingsley Idehen <kidehen@openlinksw.com> wrote:
Chris Wallace wrote:

[snip]

- not your problem but I note that search via Sindice is rather disappointing  - for example

http://sameas.org/html?q=Kiwi   (a general term for a flightless bird found in New Zealand divided into several species, but also a colloquial expression for a New Zealander and a Kiwifruit)

Chris,

I am curious about your experience re. pattern: Kiwi, when entered into the Full Text Search Field of the service at: http://lod.openlinksw.com . When I performed this test I had the option to use "Type" to filter the collection of Entities associated with the pattern. That single step enabled me to select "yago:BirdsOfNewZealand" and then simply ask for a display of associated Entities.

I am particularly interested in your searches re. "New Zealand" (past and present) as I am not convinced your initial experience with our service truly revealed what it offered (e.g., you couldn't even seem to find much about the country: "New Zealand").
The essence of what we offer is the ability to accurately "Find" descriptions of Entities using Type and Properties across a huge corpus of interlinked data.

Received on Saturday, 6 June 2009 19:18:43 UTC