Re: How to integrate semantic web in a real application from Kjetil Kjernsmo on 2005-12-03 (semantic-web@w3.org from December 2005)

From: Kjetil Kjernsmo <kjetil@kjernsmo.net>
Date: Sat, 3 Dec 2005 01:30:30 +0100
To: semantic-web@w3.org
Message-Id: <200512030130.32286.kjetil@kjernsmo.net>
On Friday 02 December 2005 00:11, Fabien Schwob wrote:
> > Well, you do not mention what kind of tools that you have
> > available, but there is a bunch of us over in Perl-land who think
> > in the following terms:
>
> I will mainly use PHP or Python. And to create the ontology I'm using
> Protégé.

Sounds cool.

> > I haven't actually done this yet, but others have, and what we have
> > in mind is to use Redland in the bottom, configure it to use any of
> > the Redland storages, when you query the model, you get data back
> > as a result set, that you transform and present to your users.
>
> It seems to be really interesting, and if I'm not wrong, Redland
> can be used with Python.

Yes, it can. And with PHP too, if I'm not wrong. So, basically, you just 
need something MVC-ish on top :-)

> > The advantages are great: You can perform any queries SPARQL
> > allows, and that's a lot. You can expose all data, so anybody can
> > use them for whatever they like, and you yourself can easily add
> > foreign sources to your model.
>
> I really like this idea ! But what about performance and scalability
> issues ? 

Since I have not yet done it in practice, I can only go by what others 
have shared in scattered details. What they say is that Redland 
performs well, but that there would probably be something to gain by 
having an XS interface rather than a SWIG interface for the Perl 
modules. But that's not interesting to you... :-)

As for scalability, that's actually one of the selling points. No more 
scary database dumps and recoveries just to change the datatype of a 
column! :-) New data, modelled differently, will trivially add itself 
to the data store; it's just triples! I've seen very short query times 
for my 2.7 million triples, much less than a second, and people are 
working with 20 million triples, so this looks good.
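
To make the "it's just triples" point concrete, here is a minimal sketch in plain Python. It is a toy in-memory store, not Redland's API; the names store, add and match, the ex:runtimeMinutes property and the movie URL are all made up for illustration.

```python
# Toy in-memory triple store: a set of (subject, predicate, object) tuples.
# Plain Python for illustration only; this is NOT Redland's API.
store = set()

def add(s, p, o):
    """Add one statement to the store."""
    store.add((s, p, o))

def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(t for t in store
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))

MOVIE = "http://movies.example.org/movie/the_finest_movie"  # hypothetical

add(MOVIE, "rdf:type", "ex:Movie")
add(MOVIE, "dc:title", "The Finest Movie")

# Later, data modelled differently shows up with a property nobody planned
# for. There is no column to alter: the statement is simply added.
add(MOVIE, "ex:runtimeMinutes", "117")  # hypothetical property

titles = match(p="dc:title")
about_movie = match(s=MOVIE)
```

The new property needed no change to the existing statements or to match(), which is the whole point. A real store would of course index the triples rather than scan them.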

> In my case, is it a good idea to store all information in 
> some files ? 

I wouldn't. That is, if there are media files, articles (written in 
XHTML, Docbook, or similar) or pictures, then I'd store those in a file 
system, and have the metadata in the model.

However, I'd like to encourage you to stop thinking in terms of 
URL->filesystem, as in
http://www.example.com/foo/bar.html is a file /var/www/foo/bar.html. 
That's how most people think. But it doesn't have to be that way.
It is much more useful to think of the local part of the URL as just a 
hierarchical system of identifiers. Thus, you may have an object bar, 
that's a foo....

Or, to stop using those confusing metasyntactic variables:
http://movies.example.org/movie/the_finest_movie
is a URL that tells very clearly what kind of resource this is. It is 
the main page you have for a movie called "The Finest Movie". 
Further, you may have modelled this movie with something like 
ex:Movie dc:title "The Finest Movie" . 
and so you get the URL from there: in the local part 
/movie/the_finest_movie
the first part is derived from the subject, the second part is derived 
from the object.
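
A small sketch of that derivation, in plain Python. The helper names slug and local_part are made up, and the slug rule (lower-case, underscores for anything that is not a letter or digit) is only one possible convention, assumed here because it reproduces the example URL.

```python
import re

def slug(text):
    """Lower-case the text and replace runs of non-alphanumerics with '_'."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")

def local_part(class_name, title):
    """Build '/<class>/<title>' from a prefixed class name and a title."""
    name = class_name.split(":", 1)[-1]  # 'ex:Movie' -> 'Movie'
    return "/%s/%s" % (slug(name), slug(title))

path = local_part("ex:Movie", "The Finest Movie")
url = "http://movies.example.org" + path
```

Here path comes out as /movie/the_finest_movie. Note that two movies with the same title would collide under this rule, so a real application would need some way to disambiguate them.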



> To continue with that, I have another question about 
> OWL. Must I save all my data in the same OWL file or must I cut the
> data in multiple files ?

So, no. I wouldn't use files at all for this. I'd use Redland's model 
and add statements to it, and generate my URLs from the model. If you 
have the whole movie in an MPEG or something, then I would have that in 
a file, but not the OWL or RDF. OTOH, others on the web will find your 
data useful, and so you would want to "serialise" the model to 
something that looks like a file and provide it to those who want it.
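
To show what "serialise" amounts to, here is a rough sketch in plain Python that writes a handful of statements out as N-Triples, one statement per line. Redland has its own serialisers, so you would not write this yourself; the Literal class, the helper names and the ex schema URI are invented for the example, and only plain literals are handled (no language tags or datatypes).

```python
class Literal(str):
    """Marks a string as a literal value rather than a URI."""

def term(value):
    """Format a URI or a plain literal as an N-Triples term."""
    if isinstance(value, Literal):
        escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                        .replace("\n", "\\n").replace("\r", "\\r"))
        return '"%s"' % escaped
    return "<%s>" % value

def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one per line."""
    return "".join("%s %s %s .\n" % (term(s), term(p), term(o))
                   for s, p, o in sorted(triples))

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
EX_MOVIE = "http://movies.example.org/schema#Movie"          # hypothetical
MOVIE = "http://movies.example.org/movie/the_finest_movie"   # hypothetical

model = {
    (MOVIE, RDF_TYPE, EX_MOVIE),
    (MOVIE, DC_TITLE, Literal("The Finest Movie")),
}

output = to_ntriples(model)
```

The resulting text is what you would hand out over HTTP to anyone who asks for the data, while the model itself stays in the store.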

Best,

Kjetil
--
Kjetil Kjernsmo
Programmer / Astrophysicist / Ski-orienteer / Orienteer / Mountaineer
kjetil@kjernsmo.net
Homepage: http://www.kjetil.kjernsmo.net/     OpenPGP KeyID: 6A6A0BBC
Received on Saturday, 3 December 2005 00:30:18 UTC