Re: One comment on RDF mapping [related to ISSUE 67 and ISSUE 81] from Bijan Parsia on 2008-06-13 (public-owl-wg@w3.org from June 2008)

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Fri, 13 Jun 2008 15:27:11 +0100
To: Alan Wu <alan.wu@oracle.com>
Cc: OWL Working Group WG <public-owl-wg@w3.org>
Message-Id: <ADC35D4C-67E9-46CC-AFC0-E578395C940B@cs.man.ac.uk>

On 13 Jun 2008, at 15:07, Alan Wu wrote:

> Bijan,
[snip]
>> It needs to be balanced by other considerations.
> That is fair. BTW, I forgot to mention that adding the axiom triple  
> won't cause a huge expansion of the ontology. Do we
> truly worry about, say, a 20% size increase?

Sometimes. Do we really worry about a 20% increase at load time in  
the very extreme and unlikely worst case? How about 50%?

You still haven't answered the question: if we have lots of  
annotations (so they are significant) and, as seems likely, queries  
over those annotations, aren't you going to have to do something  
special with reification and annotations *anyway*?

>> As I've pointed out, it's not clear at all to me that in the  
>> situation you've outlined (lots of annotated triples in a large  
>> kb) that you can *avoid* the need for a sophisticated  
>> implementation. If people are querying for annotations, you have  
>> to do something to cope with mapping the reified triples to the  
>> non-reified ones. Better to do that at load time.
> Well, it really depends. If an implementation chooses to optimize  
> the performance for query/inference over non-reified data and
> put a much lower priority on query over reified data, then such a  
> sophisticated implementation may not be necessary.

But then your use case isn't really precise. You want to optimize for  
the case where you have 100 million triples which are heavily  
annotated, but no one will use your tool to query the annotations, so  
you can essentially throw them away, and they go out of their way to  
make it hard to load the data.

Do these people *hate* you, or something? :)

Seriously, it seems like a pretty unlikely case. One where it would  
be perfectly reasonable to point them to a non-annotation triple  
extracting third party tool thingy. It doesn't seem a strong case to  
optimize for.
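
A minimal sketch of the kind of load-time mapping from reified triples
to non-reified ones discussed above. It is an illustration only, not
the mapping under discussion in the WG: it assumes the plain RDF
reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate,
rdf:object), represents triples as tuples of strings, and the function
name index_annotations is hypothetical.

```python
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Position of each reification property within the base triple.
PARTS = {RDF + "subject": 0, RDF + "predicate": 1, RDF + "object": 2}
TYPE_STATEMENT = (RDF + "type", RDF + "Statement")


def index_annotations(triples):
    """Split (s, p, o) tuples into plain triples and per-triple annotations.

    Assumption: a node counts as a reification node only if it carries
    all three of rdf:subject, rdf:predicate and rdf:object.
    """
    triples = list(triples)

    # Pass 1: collect the subject/predicate/object named by each node.
    parts = defaultdict(dict)
    for s, p, o in triples:
        if p in PARTS:
            parts[s][PARTS[p]] = o
    base = {n: (d[0], d[1], d[2]) for n, d in parts.items() if len(d) == 3}

    # Pass 2: attach every other property of a reification node to the
    # non-reified triple it describes; everything else stays plain.
    plain, annotations = [], defaultdict(list)
    for s, p, o in triples:
        if s in base:
            if p not in PARTS and (p, o) != TYPE_STATEMENT:
                annotations[base[s]].append((p, o))
        else:
            plain.append((s, p, o))
    return plain, dict(annotations)
```

With an index like this, a query for the annotations on a triple is a
single dictionary lookup on the non-reified triple. This sketch holds
all triples in memory for simplicity; if the triples describing one
reification node arrive next to each other in the input, a streaming
loader would only need a small buffer for the first pass.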

>> Plus, there's a clear bit of advice for people to optimize  
>> loading: Don't randomize your triples.
>>
> That is good advice for tools that generate N-Triples :)

Indeed! :) If I ever update the RDF primer, I'll put that advice into  
it in very large type.

Cheers,
Bijan.

Received on Friday, 13 June 2008 14:25:02 UTC