Re: Immunity of SW statements to changes in location - data integration use case

Alan Ruttenberg wrote:
> Yes, but how will we handle the case where some set of people make 
> statements with the subject being
> http://beta.uniprot.org/entry/P12345 and another set makes statements 
> about http://uniprot.org/entry/P12345. They are really talking about the 
> same subject, but our semantic web agent won't know that. If we had used 
> the PURL, then we wouldn't have a problem.

I would like to repeat Alan's point and place it firmly in the context 
of data integration - the most important use case in my opinion.

During data integration or "data reuse", we have to relate statements 
about 'biothings' to each other in order to be sure that we can properly 
use someone else's statements/data. In that case, it is extremely 
convenient if we have used the same identifier to refer to the same 
'biothing'. We would also like our statements to remain true (based on 
the 'biothings' and their relations at the moment the statement was 
made), even if some aspect of the data evolves (physical storage 
location, new results, new relations, etc.).
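
To make the integration problem concrete, here is a minimal sketch in 
Python. It assumes statements are plain (subject, predicate, object) 
tuples. The canonical prefix under purl.example.org, the alias table, 
and the ex: predicates and objects are all made up for illustration; 
they are not real UniProt or PURL addresses or data.

```python
from collections import defaultdict

# Hypothetical canonical prefix (not a real PURL) and the location-specific
# prefixes that are assumed to denote the same entries.
CANONICAL_PREFIX = "http://purl.example.org/uniprot/"
ALIAS_PREFIXES = (
    "http://beta.uniprot.org/entry/",
    "http://uniprot.org/entry/",
)


def canonicalize(uri):
    """Rewrite a known location-specific URI to the canonical prefix."""
    for prefix in ALIAS_PREFIXES:
        if uri.startswith(prefix):
            return CANONICAL_PREFIX + uri[len(prefix):]
    return uri  # unknown URIs are left untouched


def group_by_subject(statements, normalize=lambda uri: uri):
    """Group (subject, predicate, object) tuples by normalized subject."""
    grouped = defaultdict(list)
    for subject, predicate, obj in statements:
        grouped[normalize(subject)].append((predicate, obj))
    return dict(grouped)


# Two groups of people describe the same entry under different URIs.
# Predicates and objects are placeholders, not real annotations.
statements = [
    ("http://beta.uniprot.org/entry/P12345", "ex:p1", "ex:o1"),
    ("http://uniprot.org/entry/P12345", "ex:p2", "ex:o2"),
]

naive = group_by_subject(statements)                 # subjects stay separate
merged = group_by_subject(statements, canonicalize)  # subjects are unified

print(len(naive), len(merged))
```

Without a shared identifier, every integrator has to maintain an alias 
table like the one above. If everyone had used one identifier from the 
start, the naive grouping would already be correct.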

So, we have the following set of requirements:
1) unique unambiguous universal identifiers for classes and instances of 
our 'biothings'
2) a) permanent identifiers
    --OR--
    b) versioned identifiers
    (or versioned PURLs?)
3) W3C/SW-compatible identifiers

As far as I can tell, those are the only requirements! I, personally, 
could even live without 2)b) if we could just agree on how to accomplish 
1). So, we need universally recognized URIs for (bio)concepts. Ideally, 
these would come directly out of an ontology so that it is clear what we 
are talking about, right?

For the representation of scientific 'truth', non-versioned identifiers 
will eventually break, although they would probably remain practically 
useful for many years of mainstream research. That's why I think that 
versioning is a more durable solution.
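
One way to picture a versioned identifier, as a rough sketch only: the 
statement records both a location-independent base identifier and the 
version of the record it was made against. The "/v<N>" URI pattern, 
the VersionedId class, and the record history below are hypothetical, 
not an existing scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionedId:
    base: str     # location-independent identifier (hypothetical)
    version: int  # version of the record the statement was made against

    def uri(self):
        # Assumed convention: append "/v<N>" to the base identifier.
        return f"{self.base}/v{self.version}"


# Hypothetical history of one record: version number -> description.
HISTORY = {
    "http://purl.example.org/biothing/X1": {
        1: "described as a single entity",
        2: "split into two entities",
    },
}


def describe(vid):
    """Look up the record exactly as it stood at the pinned version."""
    return HISTORY[vid.base][vid.version]


def describe_latest(base):
    """Look up the newest version, as an unversioned identifier would."""
    versions = HISTORY[base]
    return versions[max(versions)]


pinned = VersionedId("http://purl.example.org/biothing/X1", 1)
print(pinned.uri())
print(describe(pinned))
print(describe_latest(pinned.base))
```

A statement made against version 1 keeps its original meaning even 
after the record changes, while the unversioned lookup silently picks 
up the newer description.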

Optional (but NOT unimportant):
* ability to refer to the resource that a URI resolves to (e.g. an HTML 
page), not just the OWL class/concept itself
* 'human-readable' identifiers
* ability to make statements about statements, e.g. evidence, 
provenance, etc.

-scott

-- 
M. Scott Marshall
http://staff.science.uva.nl/~marshall
http://adaptivedisclosure.org

Received on Monday, 16 July 2007 09:28:40 UTC