- From: Leo Sauermann <leo@gnowsis.com>
- Date: Sat, 10 Jan 2004 10:57:35 +0100
- To: "'Bob MacGregor'" <macgregor@ISI.EDU>, <jena-dev@yahoogroups.com>, <www-rdf-interest@w3.org>
- Message-ID: <000701c3d760$2d8ea060$0501a8c0@ZION>
Andy is right about the Jena MultiUnion, although I have never used it and programmed the same thing myself. (I thought that Jena had to have something like it, but I didn't find it because the class is not well documented and I couldn't guess the right name when I looked for it. Or is it a new feature in Jena 2.0?) From my own experience I know that this works very well. You don't have to union the models by adding all data into one big model; you just have to pass the query to all submodels and gather all result triples in a single result.

I have provided an "optimized" solution (what Andy talks about) for this problem in my gnowsis framework for Personal Information Management. The center of my work is a "data hub" that is a Jena model. The DataHub hides a lot of other data sources inside and answers queries by forwarding them to those sources. I integrated Jena Models as data sources, and also Adapters. Through Adapters I have made filesystem data, Microsoft Outlook, and MP3 file information available as RDF graphs.

Here you come to the "optimization" problem based on application knowledge. To decide which adapter has to be contacted for a query (drilled down to a triple-match find()), I have a "Registry" of adapters. Each adapter is registered to handle certain resource URLs. For example, I can scan a file URL:

file://leo.gnowsis.com/multimedia/mp3/pop/u2-one.mp3

and see that it is of scheme "file", on "localhost", in the file share "multimedia", and has the ending "mp3". I have registered a "FileAdapter" that can now create triples about this resource (<file> <hasfilesize> <filesize>).

There are models that are always queried, regardless of the URL. These are "integrating" models like a PIM or a database. It is good to store bidirectional links in central stores; these stores are then always queried. So the query goes not to all data sources (Microsoft Outlook will know nothing about the file URL) but to a selection.
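The routing idea described above can be illustrated with a minimal, language-neutral sketch. It is not the gnowsis code and does not use the Jena API: the class names (MemorySource, DataHub), the is_mp3_file helper, the predicate names, and the sample triples are all hypothetical and made up for illustration. A pattern is a (subject, predicate, object) tuple in which None acts as a wildcard.

```python
from typing import Callable, Iterable, List, Optional, Tuple
from urllib.parse import urlparse

# A triple is (subject, predicate, object); None in a pattern is a wildcard.
Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


class MemorySource:
    """Hypothetical in-memory stand-in for a model or an adapter."""

    def __init__(self, triples: Iterable[Triple]):
        self.triples = list(triples)

    def find(self, pattern: Pattern) -> List[Triple]:
        return [t for t in self.triples
                if all(p is None or p == v for p, v in zip(pattern, t))]


class DataHub:
    """Forwards find() to a selection of sources; no data is copied."""

    def __init__(self):
        self.always = []    # "integrating" sources, asked for every pattern
        self.registry = []  # (URL test, source) pairs

    def add_always(self, source):
        self.always.append(source)

    def register(self, handles: Callable[[str], bool], source):
        self.registry.append((handles, source))

    def select(self, pattern: Pattern):
        subject = pattern[0]
        if subject is None:
            # No URL to inspect: fall back to asking every source.
            routed = [src for _, src in self.registry]
        else:
            routed = [src for handles, src in self.registry if handles(subject)]
        return self.always + routed

    def find(self, pattern: Pattern) -> List[Triple]:
        results = []
        for source in self.select(pattern):
            for triple in source.find(pattern):
                if triple not in results:  # gather into one result, no duplicates
                    results.append(triple)
        return results


def is_mp3_file(url: str) -> bool:
    """Assumed registration rule: scheme 'file' and ending '.mp3'."""
    parsed = urlparse(url)
    return parsed.scheme == "file" and parsed.path.lower().endswith(".mp3")


# Made-up sample data, for illustration only.
MP3 = "file://leo.gnowsis.com/multimedia/mp3/pop/u2-one.mp3"
files = MemorySource([(MP3, "hasfilesize", "4200000")])
outlook = MemorySource([("mailto:bob@example.org", "name", "Bob")])
pim = MemorySource([(MP3, "relatedTo", "urn:project:demo")])

hub = DataHub()
hub.add_always(pim)                                           # always queried
hub.register(is_mp3_file, files)                              # file adapter
hub.register(lambda url: url.startswith("mailto:"), outlook)  # mail adapter

matches = hub.find((MP3, None, None))  # asks pim and files, never outlook
```

Because the hub only forwards queries, a triple added to a source later shows up in the next find() without rebuilding anything. When the subject is a wildcard, the sketch falls back to asking every source, which is the plain union behaviour.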
This approach has some flaws that I am aware of, but it is better than nothing or the plain MultiUnion. Another good reason for the URL-parsing and source-determination approach, and this one is serious, is that all web servers in the world work that way and the WWW has proven to work, so this approach scales very well. Think of an Apache that checks for the ".php" ending on files to hand them to mod_php.

Mind that this approach is based on my philosophy of "everything needs to be identified by a URL", NOT a URI.

The approach is published in my thesis at
http://www.gnowsis.com/thesis/GnowsisThesis.zip

There will be an open-source alpha-alpha release of gnowsis at the end of January (probably the 30th :-)

greetings
Leo Sauermann, CTO
www.gnowsis.com

-----Original Message-----
From: www-rdf-interest-request@w3.org [mailto:www-rdf-interest-request@w3.org] On Behalf Of Bob MacGregor
Sent: Friday, January 09, 2004 6:45 PM
To: jena-dev@yahoogroups.com; www-rdf-interest@w3.org
Subject: Query across multiple models

<offshoot of a tiny Jena thread>

The fact that we can't query against a set of models reflects a problem with RDF that can be expected to grow much more serious over time. Basically, anyone who is doing serious RDF work is going to have a lot of models, and most queries will run against multiple models.

What's needed to begin with is a formal means for defining the equivalent of (dynamic) "union" models. The current Jena union is static (I think). If U represents the union of A and B, and updates are made to A or B, those updates are not reflected in U. This is not acceptable in the long run.

There are, of course, many ways to define union. The "import ontology" construct in OWL is a somewhat badly-conceived means for doing so. So adding this is something that would probably take a committee quite a while to hash out.

Right now, there is a lot of bookkeeping that has been thrust on users because the boundaries of the RDF spec cut off prematurely.
RDF has nothing today resembling union, and it should have. Until it does, RDF-based applications will not scale gracefully to multiple-model situations.

Cheers, Bob

At 01:54 PM 1/8/2004, Seaborne, Andy wrote:

Andreas,

You can create a union model and ask a query of that model: this will ask the query over both of the submodels. There is no optimization done by the union model: because it does not know in which model properties might be, it asks each model for each part of the query pattern.

If both the models are in the same database, then, currently, this would not be optimized to pass a single query down that will work against both models. It could be, but as there are different possibilities for database layout, the number of possible optimization schemes grows. Such optimizations are possible with the architecture; they just aren't implemented.

You could write a specialization of the union model that optimized queries based on application-specific knowledge of where information is to be found.

Andy

> ----Original Message----
> From: andreasharth <mailto:andreasharth@yahoo.com>
> Date: 8 January 2004 19:17
>
> Hi,
>
> I want to have a (RDQL) query that spans two or more models. For
> example, get dc:title for <http://example.org/sample> from modelA and
> get dc:description for the same URI from modelB. The query I have in
> mind is more complicated, but that's basically the functionality I am
> looking for.
>
> What would be the best way to query across multiple models in Jena2?
>
> Thanks,
> Andreas.
>
> Yahoo! Groups Links
>
> To visit your group on the web, go to:
> http://groups.yahoo.com/group/jena-dev/
>
> To unsubscribe from this group, send an email to:
> jena-dev-unsubscribe@yahoogroups.com
>
> Your use of Yahoo! Groups is subject to:
> http://docs.yahoo.com/info/terms/

=====================================
Robert MacGregor
Senior Project Leader
macgregor@isi.edu
Phone: 310/448-8423, Fax: 310/822-6592
Mobile: 310/251-8488

USC Information Sciences Institute
4676 Admiralty Way, Marina del Rey, CA 90292
=====================================
Received on Saturday, 10 January 2004 04:55:18 UTC