RE: Query across multiple models

From: Leo Sauermann <leo@gnowsis.com>
Date: Sat, 10 Jan 2004 10:57:35 +0100
To: "'Bob MacGregor'" <macgregor@ISI.EDU>, <jena-dev@yahoogroups.com>, <www-rdf-interest@w3.org>
Message-ID: <000701c3d760$2d8ea060$0501a8c0@ZION>
Andy is right about the Jena MultiUnion,
 
although I never used it and programmed the same thing myself. (I thought
that Jena had to have something like it, but I didn't find it because the
class is not well documented and I couldn't guess the right name when I
looked for it. Or is it a new feature in Jena 2.0?)
 
From my own experience I know that this works very well.
You don't have to union the models by adding all the data into one big
model; you just have to pass the query to all submodels and gather all
result triples in a single result.
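
The fan-out idea can be sketched roughly as follows. This is only an
illustration in plain Python, not Jena code: the models are modeled as
sets of (subject, predicate, object) tuples, and the helper names
find() and find_in_all() are made up for the example.

```python
from itertools import chain

# A triple is a (subject, predicate, object) tuple.
# None in a pattern acts as a wildcard. (Assumed representation.)

def find(model, pattern):
    """Yield every triple in one model that matches the pattern."""
    for triple in model:
        if all(p is None or p == t for p, t in zip(pattern, triple)):
            yield triple

def find_in_all(models, pattern):
    """Pass the same pattern to every submodel and gather the matches.

    Nothing is copied into a big model; triples that occur in more
    than one submodel are reported only once.
    """
    seen = set()
    for triple in chain.from_iterable(find(m, pattern) for m in models):
        if triple not in seen:
            seen.add(triple)
            yield triple

model_a = {("http://example.org/sample", "dc:title", "Sample")}
model_b = {("http://example.org/sample", "dc:description", "A sample resource")}

results = list(
    find_in_all([model_a, model_b], ("http://example.org/sample", None, None))
)
```

Because the submodels are consulted at query time, a triple added to
model_a or model_b later shows up in the next call without any merge
step.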
 
 
I have provided an "optimized" solution (what Andy talks about) for this
problem in my gnowsis framework for Personal Information Management. 
 
The center of my work is a "data hub" that is a Jena model. The DataHub
hides a lot of other data sources inside. It answers queries by
forwarding them to the data sources inside. I integrated Jena Models as
data sources and also Adapters. Through Adapters I have made filesystem
data available as an RDF graph, Microsoft Outlook as an RDF graph, and
also MP3 file information.
This is where the "optimization" problem based on application
knowledge comes in: 
To decide which adapter has to be contacted for a query (drilled down
to a triple-match find()), I have a "Registry" of adapters. Each adapter
is registered to handle certain Resource URLs. 
For example, I can scan a file URL:
file://leo.gnowsis.com/multimedia/mp3/pop/u2-one.mp3
and see that it is of scheme "file", on "localhost", in the file share
"multimedia", and has the ending "mp3".
I have registered a "FileAdapter" that can now create triples about this
resource (<file> <hasfilesize> <filesize>).
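
As a rough sketch of such a registry (plain Python, not the gnowsis
code): the URL is split into scheme, host, file share and ending, and
each adapter is registered with the parts it cares about. The names
describe(), Registry, "Mp3Adapter" and "OutlookAdapter", and the
"outlook" scheme, are assumptions made up for the illustration; only
"FileAdapter" and the example URL come from the text above.

```python
import posixpath
from urllib.parse import urlsplit

def describe(url):
    """Split a resource URL into the parts used for adapter lookup."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    ending = posixpath.splitext(parts.path)[1].lstrip(".").lower()
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "share": segments[0] if segments else None,
        "ending": ending,
    }

class Registry:
    """Maps URL criteria to adapters (hypothetical structure)."""

    def __init__(self):
        self._entries = []

    def register(self, adapter, **criteria):
        # criteria may name any of: scheme, host, share, ending
        self._entries.append((criteria, adapter))

    def adapters_for(self, url):
        """Return the adapters whose criteria all match the URL."""
        info = describe(url)
        return [
            adapter
            for criteria, adapter in self._entries
            if all(info.get(key) == value for key, value in criteria.items())
        ]

registry = Registry()
registry.register("FileAdapter", scheme="file")
registry.register("Mp3Adapter", scheme="file", ending="mp3")
registry.register("OutlookAdapter", scheme="outlook")

url = "file://leo.gnowsis.com/multimedia/mp3/pop/u2-one.mp3"
info = describe(url)
matched = registry.adapters_for(url)
```

Here the adapters are plain strings to keep the sketch short; in a real
hub they would be objects that can answer find() for the resource.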
 
There are models that are always queried, regardless of the URL.
These models are "integrating" models like a PIM or a database. It is
good to store bidirectional links in central stores; these stores are
then always queried.
So the query does not go to all data sources (Microsoft Outlook will
know nothing about the file URL) but to a selection.
This approach has some flaws that I am aware of, but it is better than
nothing or the plain MultiUnion.
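
The selection step can be sketched like this, again in plain Python and
under simplifying assumptions: adapters are chosen by URL scheme only,
and the class names Source and DataHub, the source names, the
"outlook" scheme and all triples are invented for the example.

```python
from urllib.parse import urlsplit

class Source:
    """A data source holding (subject, predicate, object) triples."""

    def __init__(self, name, triples):
        self.name = name
        self.triples = set(triples)

    def find(self, subject):
        return [t for t in self.triples if t[0] == subject]

class DataHub:
    def __init__(self):
        self.always = []      # integrating stores, asked for every query
        self.by_scheme = {}   # URL scheme -> adapters for that scheme

    def add_central(self, source):
        self.always.append(source)

    def add_adapter(self, scheme, source):
        self.by_scheme.setdefault(scheme, []).append(source)

    def select(self, subject_url):
        """Central stores plus the adapters registered for this URL."""
        scheme = urlsplit(subject_url).scheme
        return self.always + self.by_scheme.get(scheme, [])

    def find(self, subject_url):
        results = []
        for source in self.select(subject_url):
            results.extend(source.find(subject_url))
        return results

mp3 = "file://leo.gnowsis.com/multimedia/mp3/pop/u2-one.mp3"
hub = DataHub()
hub.add_central(Source("pim", {(mp3, "relatedTo", "urn:example:project")}))
hub.add_adapter("file", Source("files", {(mp3, "hasfilesize", "12345")}))
hub.add_adapter("outlook", Source("outlook", {("outlook://inbox/1", "subject", "Hi")}))

selected = [source.name for source in hub.select(mp3)]
triples = hub.find(mp3)
```

For the file URL only the central store and the file adapter are
asked; the Outlook adapter is never contacted.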
 
Another good reason for the URL-parsing and source-determination
approach, and this one is serious, is that all web servers in the world
work that way and the WWW has proven to work, so this approach scales
very well. Think of an Apache that checks for the ".php" ending on
files to hand them to mod_php. 
 
Mind that this approach is based on my philosophy that "everything needs
to be identified by a URL", 
NOT a URI.
 
The approach is published in my thesis at 
http://www.gnowsis.com/thesis/GnowsisThesis.zip
 
There will be an open-source alpha-alpha release of gnowsis at the end
of January (probably the 30th :-)
 
greetings 
Leo Sauermann, CTO
www.gnowsis.com

-----Original Message-----
From: www-rdf-interest-request@w3.org
[mailto:www-rdf-interest-request@w3.org] On Behalf Of Bob MacGregor
Sent: Friday, January 09, 2004 6:45 PM
To: jena-dev@yahoogroups.com; www-rdf-interest@w3.org
Subject: Query across multiple models


<offshoot of a tiny Jena thread>

The fact that we can't query against a set of models reflects a problem
with RDF that can be expected to grow much more serious over time.
Basically, anyone who is doing serious RDF work is going to have
a lot of models, and most queries will run against multiple models.

What's needed to begin with is a formal means for defining the
equivalent of (dynamic) "union" models.  The current Jena union is static
(I think).  If U represents the union of A and B, and updates are made to
A or B, those updates are not reflected in U.  This is not acceptable in
the long run.
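
The difference between a static and a dynamic union can be illustrated
with a small sketch. It is plain Python, not Jena, and makes no claim
about which behavior Jena actually has; static_union() and DynamicUnion
are names invented for the example, and models are sets of tuples.

```python
def static_union(a, b):
    """Copy both models into a new set: a snapshot taken once."""
    return set(a) | set(b)

class DynamicUnion:
    """A view over the member models; it stores no triples itself."""

    def __init__(self, *models):
        self._models = models

    def __iter__(self):
        seen = set()
        for model in self._models:
            for triple in model:
                if triple not in seen:
                    seen.add(triple)
                    yield triple

    def __contains__(self, triple):
        return any(triple in model for model in self._models)

    def __len__(self):
        return sum(1 for _ in self)

a = {("s", "p", "1")}
b = {("s", "p", "2")}

snapshot = static_union(a, b)
view = DynamicUnion(a, b)

# Update A after both unions have been created.
a.add(("s", "p", "3"))
```

After the update, the snapshot still holds two triples, while the view
sees all three because it consults A and B on every access.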

There are, of course, many ways to define union.  The "import ontology"
construct in OWL is a somewhat badly-conceived means for doing so.  So
adding this is something that would probably take a committee quite a
while to hash out.

Right now, there is a lot of bookkeeping that has been thrust on users
because the boundaries of the RDF spec cut off prematurely.  RDF has
nothing today resembling union, and it should have.  Until it does,
RDF-based applications will not scale gracefully to multiple model
situations.

Cheers, Bob

At 01:54 PM 1/8/2004, Seaborne, Andy wrote:


Andreas,

You can create a union model and ask a query of that model: this will
ask the query over both of the submodels.

There is no optimization done by the union model: because it does not
know in which model properties might be, it asks each model for each part
of the query pattern.  If both the models are in the same database, then,
currently, this would not be optimized to pass a single query down that
will work against both models.  It could be, but as there are different
possibilities for database layout, the number of possible optimization
schemes grows.

Such optimizations are possible with the architecture, they just aren't
implemented.  You could write a specialization of the union model that
optimized queries based on application-specific knowledge of where
information is to be found.
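
To make the unoptimized behavior concrete, here is a small sketch in
plain Python (not Jena or RDQL). It answers the two-part query from
the original question by asking every submodel for every part of the
pattern. The function names, the lookups log and the literal values
are assumptions made up for the illustration.

```python
lookups = []  # records which model was asked for which predicate

def find_objects(models, subject, predicate):
    """Ask each named submodel for objects of (subject, predicate, ?)."""
    found = []
    for name, triples in models.items():
        lookups.append((name, predicate))
        for s, p, o in triples:
            if s == subject and p == predicate:
                found.append(o)
    return found

def title_and_description(models, uri):
    """Join the two parts of the pattern on the shared subject URI."""
    titles = find_objects(models, uri, "dc:title")
    descriptions = find_objects(models, uri, "dc:description")
    return [(t, d) for t in titles for d in descriptions]

uri = "http://example.org/sample"
models = {
    "modelA": {(uri, "dc:title", "Sample")},
    "modelB": {(uri, "dc:description", "A sample resource")},
}

rows = title_and_description(models, uri)
```

With two models and two pattern parts, four lookups are made even
though only two of them can succeed. An application-specific union
could skip the other two if it knew that titles live only in modelA
and descriptions only in modelB.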

      Andy

> ----Original Message----
> From: andreasharth <mailto:andreasharth@yahoo.com>
> Date: 8 January 2004 19:17
> 
> Hi,
> 
> I want to have a (RDQL) query that spans two or more models.  For
> example, get dc:title for <http://example.org/sample> from modelA and
> get dc:description for the same URI from modelB.  The query I have in
> mind is more complicated, but that's basically the functionality I am
> looking for.    
> 
> What would be the best way to query across multiple models in Jena2?
> 
> Thanks,
> Andreas.

=====================================
Robert MacGregor
Senior Project Leader
macgregor@isi.edu 
Phone: 310/448-8423, Fax:  310/822-6592
Mobile: 310/251-8488

USC Information Sciences Institute 
4676 Admiralty Way, Marina del Rey, CA 90292 
=====================================
Received on Saturday, 10 January 2004 04:55:18 GMT
