Re: fast inferencing with jena and "?" from Yuzhong Qu on 2005-04-01 (semantic-web@w3.org from April 2005)

From: Yuzhong Qu <yzqu@seu.edu.cn>
Date: Fri, 1 Apr 2005 19:55:18 +0800
To: "Leo Sauermann" <leo@gnowsis.com>
Cc: "SWIG" <semantic-web@w3.org>
Message-ID: <00fc01c536b1$ad68b940$fd0b77ca@xobjects>

The performance problem most likely lies in the similarity computation you just pointed out.

In our experience with ontology matching, this kind of similarity computation is very time-consuming: it can take several minutes for RDFS (or OWL) graphs with about forty user-defined entities.

I don't think this is a performance issue with Jena's inference.
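
To make the cost concrete, here is a minimal Python sketch of the value-similarity rule quoted below (1.0 for identical classes, 0.5 when one is a subclass of the other). The toy schema, the function names, the choice to follow subclass links transitively and in either direction, and the all-pairs comparison are assumptions made for illustration; this is not Leo's code and it does not use the Jena API.

```python
from itertools import product

# Hypothetical toy schema: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "MobilePhone": {"Phone"},
    "Phone": {"Device"},
    "Device": set(),
}


def superclasses(cls, schema):
    """Return all transitive superclasses of cls (cls itself excluded)."""
    seen = set()
    stack = list(schema.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(schema.get(current, ()))
    return seen


def value_similarity(x, y, schema):
    """Score two class-valued property values: 1.0, 0.5 or 0.0."""
    if x == y:
        return 1.0
    # Assumption: the subclass relation counts in either direction.
    if y in superclasses(x, schema) or x in superclasses(y, schema):
        return 0.5
    return 0.0


def pairwise_scores(values_a, values_b, schema):
    """Compare every value of graph A with every value of graph B."""
    return {
        (x, y): value_similarity(x, y, schema)
        for x, y in product(values_a, values_b)
    }


scores = pairwise_scores(["MobilePhone", "Phone"], ["Device", "Phone"], SUBCLASS_OF)
```

With n values on each side this performs n * n comparisons, and each comparison may walk the class hierarchy again, so the matching step can dominate the run time regardless of which reasoner sits underneath.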



Yuzhong Qu 


----- Original Message ----- 
From: "Leo Sauermann" <leo@gnowsis.com>
To: "Yuzhong Qu" <yzqu@seu.edu.cn>
Cc: "SWIG" <semantic-web@w3.org>
Sent: Friday, April 01, 2005 3:58 PM
Subject: Re: fast inferencing with jena and "?"


> 
> combined approach.
> 
> it is things like "a - has value - X"
> and "b - has value - Y"
> 
> and X - subclassof - Y
> 
> then they are "0.5" similar.
> 
> if X == Y then they would be 1.0 similar.
> 
> but our problems are much deeper,
> like the filtered statement iterator of jena....
> 
> 
> On 23.03.2005 02:52, Yuzhong Qu wrote:
> 
> >This kind of matching problem is hard.
> >
> >BTW,
> >
> >Is your Schema S based on OWL Lite, OWL DL, or RDF(S)?
> >
> >Which kinds of similarity do you consider?
> >
> >Linguistic similarity (enhanced with WordNet)
> >
> >Structural similarity (enhanced with inference capability)
> >
> >Or a combined approach?
> >
> >
> >Yuzhong Qu
> >
> >----- Original Message ----- 
> >From: "Leo Sauermann" <leo@gnowsis.com>
> >To: "Dave Reynolds" <der@hplb.hpl.hp.com>
> >Cc: <semantic-web@w3.org>
> >Sent: Wednesday, March 23, 2005 12:58 AM
> >Subject: Re: fast inferencing with jena and "?"
> >
> >
> >>Hi Dave,
> >>
> >>actually a colleague of mine is doing it, and it is a commercial project we 
> >>are doing for a telecommunications company, so we can't publish the triples :-|
> >>
> >>roughly, it's about checking whether two graphs A and B are "near" each other.
> >>A and B describe resources, and the resources are of Schema S.
> >>What we do is complete A and B using S and then run a graph 
> >>matching algorithm combined with property matching:
> >>we combine A with S and B with S and then use A(S) and B(S) to do the 
> >>matching.
> >>
> >>like
> >>if type(A(S)) == type(B(S)) then "quite match"
> >>and forallPropertiesOf( prop(A(S)) == prop(B(S))) then add "quite match"
> >>...
> >>
> >>so there are a few find(spo) calls that fire into the graph, which the graph 
> >>does not like
> >>
> >>we'll try the new Jena release and see what happens.
> >>
> >>regards
> >>Leo
> >>
> >>On 21.03.2005 12:16, Dave Reynolds wrote:
> >>
> >>>Hi Leo,
> >>>
> >>>>The problem with Jena is: the Model RDFS_MEM_TRANS_INF is too slow to do
> >>>>simple inference (and it was the fastest we found in Jena)
> >>>>
> >>>Which version of Jena? There was a bug fix affecting TRANS between 2.1 
> >>>and 2.2beta1 and a performance problem fixed between 2.2beta1 and 
> >>>2.2beta2.
> >>>
> >>>>It takes 200 ms to match two small RDF instance models
> >>>>against an RDF/S ontology model (180 classes). 
> >>>>
> >>>What do you mean by "matching" a model against an RDFS model?
> >>>
> >>>If you can show us what you are doing (ideally a self-contained code 
> >>>example) then we might be able to advise on optimizations. Though code 
> >>>exchange is probably better done over on jena-dev or off list.
> >>>
> >>>>We did everything we could to make it faster, including prefetching all
> >>>>classes, properties, trying out different Jena inferencers, etc.
> >>>>
> >>>If you prefetched all classes and properties then there is presumably 
> >>>no inference left. If the performance wasn't good enough in that 
> >>>setup, then you don't need faster inference; you need a faster algorithm or 
> >>>reduced API overheads. That would make it even more interesting to see 
> >>>exactly what you are doing to figure out where the performance problem is.
> >>>
> >>>Cheers,
> >>>Dave
> >>>
>
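
As a footnote to Dave's remark in the quoted thread that prefetching all classes leaves no inference to do, the sketch below precomputes the transitive closure of a subclass hierarchy once, so that each later subclass check is a plain set lookup. The class names and helper functions are hypothetical and the code is plain Python rather than Jena; it only illustrates the idea.

```python
def transitive_closure(direct):
    """Map every class to the frozenset of all its superclasses."""
    closure = {}
    for cls in direct:
        seen = set()
        stack = list(direct[cls])
        while stack:
            current = stack.pop()
            if current not in seen:  # also guards against cycles
                seen.add(current)
                stack.extend(direct.get(current, ()))
        closure[cls] = frozenset(seen)
    return closure


# Hypothetical direct subclass links, standing in for the schema S.
DIRECT = {
    "MobilePhone": ["Phone"],
    "Phone": ["Device"],
    "Device": [],
}

# Computed once, up front; this is the "prefetch" step.
CLOSURE = transitive_closure(DIRECT)


def is_subclass(sub, sup, closure=CLOSURE):
    """Constant-time check against the precomputed closure."""
    return sup in closure.get(sub, frozenset())
```

If matching is still slow once every lookup is this cheap, the remaining time is being spent in the matching algorithm or in API overhead, which is the conclusion Dave draws.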
Received on Friday, 1 April 2005 11:51:59 UTC