Re: Using RDF to describe biological taxonomy. from Eric Jain on 2004-03-29 (www-rdf-interest@w3.org from March 2004)

From: Eric Jain <Eric.Jain@isb-sib.ch>
Date: Mon, 29 Mar 2004 17:39:31 +0200
To: "Chris Catton" <chris.catton@zoology.oxford.ac.uk>
Cc: "rdf-interest" <www-rdf-interest@w3.org>
Message-ID: <002d01c415a4$07404990$c300000a@caliente>

> don't address the question. We store the NCBI taxonomy in a nested-set
> representation in a postgres database. We've done this simply for
> speed of retrieval - I'd be interested to know how an rdf triple store
> performs - eg in response to a query like 'get all species of
> primates'.

This is something I'd be very much interested in as well. Some
non-biology examples would include queries such as:

  SELECT ?x
  WHERE
    (?x partOf* ?y)

to retrieve all x that are directly or indirectly a part of y, or

  SELECT ?x
  WHERE
    (?x rdf:type* ?y)

to retrieve all x that are instances of y or one of its subclasses.
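
Without any inference support, a query like the first one could be answered by walking the graph at query time. What follows is only a minimal sketch under my own assumptions: triples are plain Python tuples held in memory, `partOf*` is read as "one or more steps", and the function name `transitive_subjects` and the sample data are made up for illustration. It is not how any particular triple store does it.

```python
from collections import defaultdict, deque

def transitive_subjects(triples, predicate, target):
    """Return every x that reaches `target` via one or more `predicate` edges."""
    # Index the graph backwards: object -> subjects pointing at it.
    incoming = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            incoming[o].add(s)

    found = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for s in incoming[node]:
            if s not in found:  # the visited check also guards against cycles
                found.add(s)
                queue.append(s)
    return found

# Hypothetical sample data, not taken from any real dataset.
triples = [
    ("wheel", "partOf", "car"),
    ("spoke", "partOf", "wheel"),
    ("engine", "partOf", "car"),
    ("piston", "partOf", "engine"),
    ("car", "partOf", "fleet"),
]
parts_of_car = transitive_subjects(triples, "partOf", "car")
```

Here `parts_of_car` holds the direct parts (wheel, engine) as well as the indirect ones (spoke, piston). The cost is one graph traversal per query, which is exactly what the nested-set representation avoids.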

One solution I am aware of is to use a rule engine and store the inferred
statements explicitly:

  if
    x hasParent y
  then
    x hasAncestor y

  if
    x hasParent y
    y hasAncestor z
  then
    x hasAncestor z

  if
    p occursIn x
    x hasAncestor y
  then
    p occursIn y

  SELECT ?p
  WHERE
    (?p hasAncestor ?x)
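
To make the rule-engine idea concrete, here is a rough sketch of naive forward chaining to a fixpoint. It assumes that hasAncestor is meant to be the full transitive closure of hasParent (so a parent also counts as an ancestor), and it uses a tiny, simplified lineage with a made-up `someProtein` entry. The function name `materialize` is hypothetical, and a real rule engine would be far smarter about which rules to re-fire.

```python
def materialize(triples):
    """Forward-chain the ancestor and occursIn rules until nothing new appears."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in facts:
            if p == "hasParent":
                # A parent is an ancestor ...
                new.add((s, "hasAncestor", o))
                # ... and so is every ancestor of the parent.
                for s2, p2, o2 in facts:
                    if p2 == "hasAncestor" and s2 == o:
                        new.add((s, "hasAncestor", o2))
            elif p == "occursIn":
                # Whatever occurs in x also occurs in every ancestor of x.
                for s2, p2, o2 in facts:
                    if p2 == "hasAncestor" and s2 == o:
                        new.add((s, "occursIn", o2))
        if not new <= facts:
            facts |= new
            changed = True
    return facts

# Simplified, illustrative lineage (intermediate ranks omitted).
base = [
    ("Homo sapiens", "hasParent", "Homo"),
    ("Homo", "hasParent", "Hominidae"),
    ("Hominidae", "hasParent", "Primates"),
    ("someProtein", "occursIn", "Homo sapiens"),
]
facts = materialize(base)

# The SELECT above then becomes a plain lookup with no traversal.
under_primates = {s for s, p, o in facts if p == "hasAncestor" and o == "Primates"}
```

In this toy case the 4 asserted statements grow to 13 stored ones (3 hasParent, 6 hasAncestor, 4 occursIn), and deleting a single hasParent statement would mean working out which of the inferred ones still hold.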

This approach of course increases the size of the data considerably and
furthermore complicates updates, so if anyone has come up with a more
elegant solution I'm very much interested.
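
To give a feel for the size increase, the following back-of-the-envelope sketch counts how many hasAncestor statements would have to be stored for two artificial hierarchies. The shapes (a single chain and a complete binary tree) and the helper name `closure_size` are my own assumptions, chosen only for illustration, and the hierarchy is assumed to be acyclic; real taxonomies will fall somewhere in between.

```python
def closure_size(parent_of):
    """Count the hasAncestor pairs implied by a child -> parent mapping (a tree)."""
    total = 0
    for node in parent_of:
        current = parent_of.get(node)
        while current is not None:  # walk up until we pass the root
            total += 1
            current = parent_of.get(current)
    return total

# A single lineage, 30 levels deep: 30 parent links.
chain = {f"taxon{i}": f"taxon{i + 1}" for i in range(30)}

# A complete binary tree with 10 levels: nodes 1..1023, 1022 parent links.
tree = {i: i // 2 for i in range(2, 1024)}

chain_pairs = closure_size(chain)
tree_pairs = closure_size(tree)
```

The chain needs 465 ancestor statements for its 30 parent links, and the binary tree 8194 for its 1022, so the blow-up depends heavily on how deep the hierarchy is.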

Received on Monday, 29 March 2004 10:39:17 UTC