Re: URIs from John Barkley on 2006-06-19 (public-semweb-lifesci@w3.org from June 2006)

From: John Barkley <jbarkley@nist.gov>
Date: Mon, 19 Jun 2006 07:03:38 -0400
To: <public-semweb-lifesci@w3.org>
Message-ID: <1c1d01c69390$04c26420$8a3a0681@ncsl.nist.gov>
hi alan,

> On the matter of what a URI dereferences to, I think it is more
> important to get the names in place quickly.

I agree. I think we are all ready to start on the demo. Nonetheless, getting
the names in place quickly does not mean they cannot dereference. According
to:
http://www.w3.org/TR/owl-ref/#Property and
http://www.w3.org/TR/rdf-schema/#ch_range, subjects and objects of
properties are "instances" of classes, which means they dereference.
Presumably, the names we define will be used early on as subjects and/or
objects of ObjectProperties. Unless the names are only to be objects of
DatatypeProperties, they should dereference to their definitions as
individuals of a class.

I would suggest that, for starters, we do something like create a "bare
bones" ontology that simply has classes of vocabulary names, e.g., "Disease
Names", "Antibody Names", where whatever names we choose are individuals in
those classes. This minimal ontology allows names that dereference to be put
in place quickly and can evolve into an ontology on which all can agree.
Additionally, this ontology serves as a vehicle for documenting the names.
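
As a rough illustration of how small such a starting point could be, here is a
sketch that emits N-Triples for two name classes and one individual in each.
Everything specific in it is an assumption made for illustration: the
example.org base URI, the class names DiseaseName and AntibodyName, and the
placeholder individuals and labels. None of these have been agreed on.

```python
# Minimal sketch of a "bare bones" names ontology, written out as N-Triples.
# BASE, the class names, and the individuals are hypothetical placeholders.
BASE = "http://example.org/hcls/names#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"

# class local name -> {individual local name: human-readable label}
VOCABULARY = {
    "DiseaseName": {"Disease_0001": "example disease name"},
    "AntibodyName": {"Antibody_0001": "example antibody name"},
}


def uri(local_name):
    """Return the bracketed URI reference for a local name under BASE."""
    return "<" + BASE + local_name + ">"


def literal(text):
    """Return a plain N-Triples literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def triples(vocabulary):
    """Declare each class, then type and label each name as an individual."""
    lines = []
    for cls, individuals in vocabulary.items():
        lines.append(f"{uri(cls)} <{RDF_TYPE}> <{OWL_CLASS}> .")
        for name, label in individuals.items():
            lines.append(f"{uri(name)} <{RDF_TYPE}> {uri(cls)} .")
            lines.append(f"{uri(name)} <{RDFS_LABEL}> {literal(label)} .")
    return lines


if __name__ == "__main__":
    print("\n".join(triples(VOCABULARY)))
```

Served from wherever BASE ends up pointing, a file like this would give each
name something to dereference to, and adding a name is one more entry in the
table.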

I think we'll find that throughout the development of the demonstration,
nothing is immutable - that the development will be an evolution as we learn
more. I believe this is part of the "social process" that you describe. The
conversion of each database is automated. Changing the RDF output means
changing some code. Developing the bridge ontology (or ontologies) will
likely be semi-automated - perhaps less easy to change, but not beyond our
means.

jb


----- Original Message ----- 
From: "Alan Ruttenberg" <alanruttenberg@gmail.com>
To: <public-semweb-lifesci@w3.org>
Sent: Friday, June 16, 2006 2:51 AM
Subject: URIs


>
> There was a discussion a few weeks ago about URIs that touched on various
> issues. This message is an attempt to untangle them, something I said
> I would write up as an action item in one of the HCLS conference
> calls. We'll be discussing URIs at the Monday BioRDF conference call.
>
> As I read the discussion I partitioned it into three distinct issues:
>
> 1) The relationship between the use of a URI in a representation and
> what it dereferences to, if anything. The possibilities seem to be:
>
>    a) The identifier is not intended to be dereferenceable. In that
> case the info: scheme was suggested for the form of the URI, as that
> is explicitly not dereferenceable.
>
>    b) The URI is used primarily as a name. Insofar as we want to use
> names, it is important there be some stable URIs. Of course it
> doesn't hurt if the URI becomes dereferenceable at some point, and it
> would even be nice, so let's leave open that possibility (but see the
> caveats in the discussion below).
>
>    c) Any URL we use needs to be able to be dereferenced to something.
>
>    d) Any URL we use needs to be able to be dereferenced to the thing
> it is (and not dereferenced if you can't do that). Its only meaning
> is what it dereferences to.
>
> 2) What a URI refers to. Some of this conversation was made in the
> form of a discussion about what reasonable arguments to owl:sameAs
> are - for example should one say that
> http://www.expasy.org/uniprot/P04637 is the sameAs
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=NP_000537
>
> Another part of the conversation talked in terms of whether the URI
> http://www.expasy.org/uniprot/P04637 should, for our purposes, refer
> to a database record or to a thing in the world - Human P53 proteins.
>
> Of course these are two sides of the same coin - you would only say
> that the two URIs above are the same if they referred to things in the
> world. As database entries, they are obviously different. There are
> different fields, they are maintained by different people, etc.
>
> 3) Something I will call the social aspect of URIs, for lack of a
> better term. By this I mean those aspects of the process we go through
> to come to a shared use of URIs. Under this category there is the
> ontology building, the strategies for connecting pieces of
> information generated by different groups. There was a bit in the
> conversations where people were arguing about whether using sameAs
> for mapping was pollution or a necessity, for instance. An important
> part of this in our context is how to define the use of URLs to
> things where there was not rigorous ontological engineering applied
> to create careful definitions, things like terminologies and entries
> in gene databases.
>
> ---
>
> I'll offer some of my own opinions on these issues now.
>
> On the matter of what a URI dereferences to, I think it is more
> important to get the names in place quickly. I don't agree with the
> point of view that we should explicitly make them not
> dereferenceable, even though I'm not sure what should come back when
> we ask for what they point to yet. And I don't see support for there
> being a necessity that anything that looks like a URL have a server
> that returns something specific back. Here's a quote from RFC 3986,
>
> > Although many URI schemes are named after protocols, this does not
> > imply that use of these URIs will result in access to the resource
> > via the named protocol.  URIs are often used simply for the sake of
> > identification.
>
> It will be part of our social process to come to some understanding and
> agreement about what would be useful for us to have come back, if
> anything. Is it an RDF graph? A bunch of OWL definitions of things
> related to the gene? A representation of the asn record? A page of
> HTML? All of the above?
>
> On the question of what kind of concept an entrez gene URI refers to,
> I think that concept needs to be "databaseRecord". There are too many
> different concepts that it could mean if we want it to refer to
> something in the world - does it refer to the sequence of the gene?
> The typical gene? All mutations of it that are found in populations?
> The possible gene products?
>
> Rather, we can use the URI to the database entry to start to build
> concepts by defining properties and using them in OWL class
> definitions in a variety of ways. In foaf and SKOS, for instance,
> there is a property isPrimarySubjectOf. The kind of equivalence we
> can have between http://www.expasy.org/uniprot/P04637 and
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=NP_000537
> is something like: The same something isPrimarySubjectOf
> http://www.expasy.org/uniprot/P04637 and
> http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=NP_000537
> (where "something" is a blank node in RDF). Or in OWL:
>
> Class(P53Gene complete
>      restriction(isPrimarySubjectOf
>                    value(<http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=NP_000537>)))
>
> Class(P53Transcript partial intersectionOf(mRNA
>      restriction(derivesFrom someValuesFrom(P53Gene))))
>
> Which says that it is necessary and sufficient for x to be a
> P53Gene, for example, if someone has stated or it has been inferred
> that
>
> Individual(x value(isPrimarySubjectOf <http://www.expasy.org/uniprot/P04637>))
>
> and that a P53 transcript, among other things, is an mRNA that
> derivesFrom some P53Gene.
>
> (there will be more complicated definitions too :)
>
> [sameAs, equivalentClass, equivalentProperty will be a necessity, I
> think, BTW]
>
> As for the social process, I look forward to the discussion on Monday :)
>
> Regards,
> Alan
>
>
> http://www.w3.org/TR/uri-clarification/
> Uniform Resource Identifier (URI): Generic Syntax - http://tools.ietf.org/html/3986
> Relations in biomedical ontologies - http://genomebiology.com/2005/6/5/R46
> http://en.wikipedia.org/wiki/Uniform_Resource_Identifier
> http://en.wikipedia.org/wiki/URL
>
>
Received on Monday, 19 June 2006 11:06:26 UTC