Re: PageRank approaches from Michael Brunnbauer on 2012-05-15 (semantic-web@w3.org from May 2012)

From: Michael Brunnbauer <brunni@netestate.de>
Date: Tue, 15 May 2012 17:19:36 +0200
To: Aidan Hogan <aidan.hogan@deri.org>
Cc: Enrique Pérez Arnaud <enriquepablo@gmail.com>, Nathan <nathan@webr3.org>, semantic-web@w3.org
Message-ID: <20120515151936.GA13059@netestate.de>

Hello Aidan,

Impressive! And the PageRank part is quite what I had in mind. Also nice to
see how well it goes together with annotated reasoning and how it can help
to fix inconsistencies automatically - which seems to be an important part
of the problem (only 0.0006% of the corpus above consistency threshold).

Still, the OWL subset used is very limited and I hope that progress is
possible.

To answer my own question that started the thread: Applying PageRank to
collections of triples (documents or datasets) seems to be well established
(I also saw this in Jörn Hees' list). Furthermore, approaches where the link
does not go from subject to object or vice versa exist.
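
For readers of the archive, here is a minimal Python sketch of the scheme
described in the quoted message below: PageRank over documents, a triple's
rank as the sum of the ranks of the documents it appears in, and removal of
the "weakest" triple from an inconsistent set. It is not the implementation
from the paper. The function names, the damping factor of 0.85, the
iteration count, the handling of documents without outgoing links and the
example data are all assumptions made for illustration.

```python
from collections import defaultdict


def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {document: [linked documents]}.

    damping and iterations are illustrative defaults, not values from the paper.
    """
    docs = set(links) | {d for targets in links.values() for d in targets}
    n = len(docs)
    rank = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        # Rank held by documents without outgoing links is spread evenly.
        dangling = sum(rank[d] for d in docs if not links.get(d))
        new_rank = {d: (1.0 - damping) / n + damping * dangling / n for d in docs}
        for src, targets in links.items():
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def triple_ranks(doc_triples, doc_rank):
    """Rank each triple as the sum of the ranks of the documents containing it."""
    ranks = defaultdict(float)
    for doc, triples in doc_triples.items():
        for triple in set(triples):  # count each document at most once per triple
            ranks[triple] += doc_rank.get(doc, 0.0)
    return dict(ranks)


def weakest_triple(inconsistent_set, ranks):
    """Pick the lowest-ranked triple of a set of triples that is inconsistent."""
    return min(inconsistent_set, key=lambda t: ranks.get(t, 0.0))


# Hypothetical example data: four documents and two conflicting type statements.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
person = ("ex:Alice", "rdf:type", "ex:Person")
organization = ("ex:Alice", "rdf:type", "ex:Organization")
doc_triples = {"a": [person], "c": [person], "d": [organization]}

doc_rank = pagerank(links)
ranks = triple_ranks(doc_triples, doc_rank)
print(weakest_triple([person, organization], ranks))
```

In this toy data the statement that only occurs in the poorly linked
document "d" is the one selected for removal. Detecting the inconsistent
sets themselves is the job of the reasoner and is not shown here.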

Regards,

Michael Brunnbauer

On Mon, May 14, 2012 at 02:42:17PM +0100, Aidan Hogan wrote:
> Hi Michael,
> 
> We looked at combining PageRank and lightweight reasoning before.
> 
> Piero A. Bonatti, Aidan Hogan, Axel Polleres and Luigi Sauro. "Robust 
> and Scalable Linked Data Reasoning Incorporating Provenance and Trust 
> Annotations". In the Journal of Web Semantics 9(2): pp. 165-201, 2011.
> 
> http://sw.deri.org/~aidanh/docs/saor_ann_final.pdf
> 
> We took a dataset of 1.1 billion statements from 4 million Web 
> documents. Ranking was done over documents based on dereferenceable 
> links. Triples were ranked as the sum of the PageRank of the documents 
> they appear in. An annotation framework was used to produce ranks for 
> inferences (for a small subset of OWL 2 RL). We then used the triple 
> ranks to remove the "weakest" triples involved in inconsistencies.
> 
> Hope you find it interesting.
> 
> Cheers,
> Aidan
> 
> On 14/05/2012 10:29, Michael Brunnbauer wrote:
> >
> >Hello Enrique,
> >
> >if you have a conjunction of statements (a set of triples) and you take one
> >statement (triple) away, you can conclude less but you cannot conclude
> >something that is wrong with the statement (triple) you took away.
> >
> >Regards,
> >
> >Michael Brunnbauer
> >
> >On Sun, May 13, 2012 at 11:09:21PM +0200, Enrique Pérez Arnaud wrote:
> >>Any reasoning software necessarily defines validity. Not what is and what
> >>is not true, but what kind of information may or may not be true, and what
> >>would also be true if what may be true is so.
> >>
> >>Would not PageRank compromise the validity of the reasoning software that
> >>had to analyze its results?
> >>
> >>I mean, for human searches PageRank may be reasonable, but is it so for
> >>logical searches?
> >>
> >>--
> >>Enrique Pérez Arnaud
> >>enriquepablo@gmail.com
> >

-- 
++  Michael Brunnbauer
++  netEstate GmbH
++  Geisenhausener Straße 11a
++  81379 München
++  Tel +49 89 32 19 77 80
++  Fax +49 89 32 19 77 89 
++  E-Mail brunni@netestate.de
++  http://www.netestate.de/
++
++  Sitz: München, HRB Nr.142452 (Handelsregister B München)
++  USt-IdNr. DE221033342
++  Geschäftsführer: Michael Brunnbauer, Franz Brunnbauer
++  Prokurist: Dipl. Kfm. (Univ.) Markus Hendel

Received on Tuesday, 15 May 2012 15:20:10 UTC