comments on the uri note from Michel_Dumontier on 2007-11-03 (public-semweb-lifesci@w3.org from November 2007)

From: Michel_Dumontier <Michel_Dumontier@carleton.ca>
Date: Sat, 03 Nov 2007 15:02:00 -0400
To: Jonathan Rees <jar@creativecommons.org>, naty.vr@gmail.com
Cc: public-semweb-lifesci@w3.org
Message-id: <AB349814F1ECB143A5D4CD29C7A6456901F6732A@CCSEXB10.CUNET.CARLETON.CA>

Hi all,
  I read the latest URI note [1], and here are some comments:

[1] http://sw.neurocommons.org/2007/uri-note/


 "A usage spec for a name is simply a graph that is designated as one
that specifies when the name should and shouldn't be used"

Given that RDF semantics are open world, and RDF lacks the formal
vocabulary for negation or universal quantifiers, I don't see how one
can constrain usage, as no inconsistency can result from the addition of
new knowledge. The example that is provided is not constraining, but
rather states what we know about that particular entity, at a certain
time, presumably from a certain location on the internet. 

A major concern I have with the note is that it essentially says that
only the naming authority can make "defining" statements about some URI.
Such an approach would severely hinder people from reusing URIs, as they
may wish to make additional statements that are undoubtedly not covered
by the authority's definition. Such advocacy would simply lead people to
mint their own URIs, leading to heavy fragmentation of the semantic web,
in which our knowledge about something might be connected only by "see
also" links between instances.

"The property rdfs:seeAlso specifies a resource that _might_ provide
additional information about the subject resource" [2]

[2] http://www.w3.org/TR/2000/CR-rdf-schema-20000327/#s2.3.4

Unless there is a stronger link between differently named resources,
such as owl:sameAs, they certainly can't be interpreted as being the
same, and thus the statements about them will not be merged. However,
if the resource points to another document making statements about the
URI, or makes use of owl:sameAs, this will lead to the merging of
statements that might go beyond the original "definitions" of any one
authority.
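
A minimal sketch of that distinction, assuming a graph is just a list of
(subject, predicate, object) tuples and that literals are plain strings.
The example.org URIs and the labels are hypothetical; only owl:sameAs
causes two names to be treated as one resource, while rdfs:seeAlso is
kept as an ordinary statement.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def merge_by_same_as(triples):
    """Group statements by subject, treating owl:sameAs-linked names as one resource."""
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Only owl:sameAs unions two names; rdfs:seeAlso is deliberately not treated as identity.
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            parent[find(s)] = find(o)

    merged = {}
    for s, p, o in triples:
        if p != OWL_SAME_AS:
            merged.setdefault(find(s), set()).add((p, o))
    return merged, find


# Hypothetical names for "the same" concept minted by three parties.
A = "http://example.org/authority/Protein"
B = "http://example.org/mine/Protein"
C = "http://example.org/other/Protein"

triples = [
    (A, RDFS_LABEL, "protein"),
    (B, RDFS_LABEL, "polypeptide chain"),
    (B, OWL_SAME_AS, A),        # strong link: statements about A and B are merged
    (C, RDFS_LABEL, "protein (other source)"),
    (C, RDFS_SEE_ALSO, A),      # weak link: C stays a separate resource
]

merged, find = merge_by_same_as(triples)
```

Here the statements about A and B end up in one group, which already
goes beyond what the authority for A said, while C keeps its own group
and merely points at A.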

I don't believe that the statement "The declaration should be specific
enough to rule out incorrect usage, but not so specific that it
overcommits and fosters inconsistency or discourages reuse." is possible
to adhere to.


Here are things that I consider:
A Uniform Resource Identifier (URI) is a string of characters that
names some resource.
1 - create a URI that is consistent with the corresponding protocol. For
instance, HTTP URIs can only be composed of a certain set of characters
defined by [some url], and LSID URIs have their own specification
[another url], etc, etc...
2 - reuse a URI if you believe that your use of that resource is
expected to be consistent with the original intent. In the absence of
expressive logics with negation, it will not be possible to
computationally check if the meaning is consistent.
3 - if you would like to track your own contributions (provenance), you
might consider minting a new URI that is identical in intent. In this
case, you make statements about your URI, and should consider using
owl:sameAs to indicate that the two resources should be considered
equivalent.

Since a name isn't sufficient for understanding its meaning, we suggest
that you augment every RDF/OWL resource with:
1 - a concise human-readable label using rdfs:label in the language of
your choice
2 - a precise human-readable definition using rdfs:comment in the
language of your choice.
3 - RDF statements that you believe to be universally true about that
resource
4 - or point to documents that make statements about that resource using
rdfs:isDefinedBy.
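
The label, definition and rdfs:isDefinedBy suggestions above could be
emitted roughly as follows. This is only a sketch that writes N-Triples
lines by hand; the helper names, the example.org URIs and the definition
text are hypothetical, and item 3 (further RDF statements) is left out.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def literal(text, lang):
    """Format a language-tagged literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"@{lang}'


def describe(uri, label, definition, defined_by=None, lang="en"):
    """Return N-Triples lines carrying the suggested annotations for one resource."""
    lines = [
        f"<{uri}> <{RDFS}label> {literal(label, lang)} .",
        f"<{uri}> <{RDFS}comment> {literal(definition, lang)} .",
    ]
    if defined_by:
        # Point to a document that makes statements about the resource.
        lines.append(f"<{uri}> <{RDFS}isDefinedBy> <{defined_by}> .")
    return lines


# Hypothetical resource and placeholder definition text.
for line in describe(
    "http://example.org/ontology/Protein",
    label="protein",
    definition="Placeholder definition text for the resource.",
    defined_by="http://example.org/ontology",
):
    print(line)
```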

As an example, I built a prototype HTTP URI resolver for the entities
defined in my most current OWL ontologies:

http://134.117.55.46:8181/Protein , 

where 134.117.55.46:8181 will eventually be ontology.dumontierlab.com


In this way, a human can see the implied meaning, and an agent can
follow other documents to determine what has been said about it (at
least within my own knowledge base). What remains lacking is a method by
which we can discover what other people have said about this resource.
That's why I'm fond of the http://lsrn.org (centralized) solution, in
which multiple data providers can register as resolvers for a given base
URI, so that people and agents can find out more about it (via HTML/RDF
documents)[3]. Moreover, it allows third party data providers to
register a public identifier, and resolve it (in RDF documents) prior to
the authority having to do so! Analogously, in the LSID protocol
(distributed), resolvers can register with the authority itself and
provide different information. 
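
To make the centralized idea concrete, here is a toy sketch of a registry
in which several providers register for the same namespace. It is not how
lsrn.org is implemented; the class, the "{id}" URL templates and the
example.org provider addresses are all assumptions for illustration. Only
the identifier CAS:58-08-2 comes from [3].

```python
from collections import defaultdict


class ResolverRegistry:
    """Toy central registry mapping a namespace to resolver URL templates."""

    def __init__(self):
        self._resolvers = defaultdict(list)

    def register(self, namespace, url_template):
        """Any provider, authority or third party, may register for a namespace."""
        self._resolvers[namespace].append(url_template)

    def resolve(self, public_id):
        """Split 'NAMESPACE:local-id' and return every registered location for it."""
        namespace, _, local_id = public_id.partition(":")
        return [t.format(id=local_id) for t in self._resolvers.get(namespace, [])]


registry = ResolverRegistry()
# Hypothetical third-party providers; neither is the naming authority.
registry.register("CAS", "http://provider-a.example.org/cas/{id}.rdf")
registry.register("CAS", "http://provider-b.example.org/chem/{id}")

locations = registry.resolve("CAS:58-08-2")
unknown = registry.resolve("XYZ:1")  # nothing registered for this namespace
```

An agent would then fetch each location in turn, so no single provider
controls what can be discovered about the identifier.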

[3] http://lsrn.org/CAS:58-08-2

Thus, I dislike anything that is "authoritative" or "monopolizing" if it
handicaps URI reuse and precludes the discovery of additional
information about that resource. 

Just my two cents,

-=Michel=-
 
Michel Dumontier
Assistant Professor of Bioinformatics
http://dumontierlab.com
Received on Saturday, 3 November 2007 19:02:30 UTC