- From: Daniel Schwabe <dschwabe@inf.puc-rio.br>
- Date: Tue, 02 Jun 2009 12:33:41 -0300
- To: Sherman Monroe <sdmonroe@gmail.com>
- CC: Kingsley Idehen <kidehen@openlinksw.com>, Samur Araujo <samuraraujo@gmail.com>, Linked Data community <public-lod@w3.org>, semantic-web@w3.org
Sherman Monroe wrote:
> Daniel,
>
> I see some interesting concepts worth exploring here, e.g. using
> windows (with paging inside the window). But as I refine my query,
> there isn't any apparent context that orients me in the data. E.g. how
> does one box/set relate to the others.

The dependency between the boxes is recorded, but it is not a simple matter to expose it to the user in a simple way. Each box (set) really depends on a chain of previous operations, so in general it may be a very long list of function compositions.

I think the biggest contribution is not so much the interface aspects that you refer to, but the way you can form the various sets (boxes) through various operations: the SPO, which allows you to do arbitrary matches for <s,p,o> triples, plus union/intersection/difference, plus de-referencing, plus a faceted interface on either an arbitrary set of chosen properties (applied to any set) or automatically generated facets.

Here is a simple, interesting scenario: find a drug for hypoglycemia that can be prescribed to a known alcohol abuser.

Click on menu->repositories, add the drugbank sparql endpoint (http://www4.wiwiss.fu-berlin.de/drugbank/sparql) with limit 50. (Sometimes we've been getting timeouts; just try again and eventually it works. We have a locally loaded version of these repositories, but we haven't finished building the index for the full text search yet; we are still figuring out how to build this index in Virtuoso.)
Search for hypoglycemic (call it Set A).
Search for avoid alcohol (call it Set B).
Click on A, click on the intersection symbol, click on Set B, click on "=" (call it Set C).
Click on A, click on S, click on "-".

You've computed the set of drugs associated with "hypoglycemic", intersected it with the set of drugs which should not be taken with alcohol, and then computed the difference between the set of drugs associated with "hypoglycemic" and that intersection, leaving the hypoglycemic drugs that may be taken with alcohol.
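
As a rough illustration of the set algebra in this scenario (not of how Explorator implements it), the steps can be sketched with plain Python sets. The drug identifiers below are made-up placeholders, not actual DrugBank results, and the variable names are assumptions chosen to mirror Sets A, B and C.

```python
# Minimal sketch of the scenario's set operations.
# All identifiers are hypothetical placeholders, not real DrugBank data.

# Set A: results of searching for "hypoglycemic"
set_a = {"drug_1", "drug_2", "drug_3", "drug_4"}

# Set B: results of searching for "avoid alcohol"
set_b = {"drug_2", "drug_4", "drug_9"}

# Set C: hypoglycemic drugs that should not be taken with alcohol (A intersect B)
set_c = set_a & set_b

# Final result: hypoglycemic drugs not flagged for alcohol (A minus C)
result = set_a - set_c

print(sorted(result))
```

Since C is a subset of A, A minus C equals A minus B, so the same result could be reached with a single difference; the two-step form simply follows the clicks described above.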
If you make the scenario a bit more sophisticated, you can repeat the same reasoning for "antidepressant", to get the set of drugs which are antidepressants and may be taken with alcohol. Going further (but here I don't have the medical knowledge to formulate it properly), I could try to determine which diabetes and antidepressant drugs could be prescribed together (I'd need to determine dangerous interactions between the candidates obtained in the previous steps), and so on...

>
> I notice you're using Sesame, do you think it can scale? I tried
> selecting several repositories at once, but the system seems to hang
> awhile (couple of minutes) before returning results.

We use both Sesame (through its Java interface) and Virtuoso (regular http SPARQL interface), depending on the size of the dataset (e.g., dbpedia is on Virtuoso). You may also have realized that you can add any arbitrary external endpoint as well.

The problems you report are not really due to Explorator, but rather to the engines themselves and the particular repositories. If you try to issue the same queries directly (notice that many queries are necessary to present the information in the form it appears on the screen), you will see they also take a while to respond. In fact, we'd be very interested in seeing how to optimize such queries. Samur, my former student, will elaborate on this in a separate message, for those interested. (We might take this offline if it becomes too specific, although I feel the problems we face are the same ones anyone who wishes to build "user friendly" interfaces to RDF data would face...)

Cheers
D
Received on Tuesday, 2 June 2009 15:34:20 UTC