RE: comments on the uri note from Michel_Dumontier on 2007-11-04 (public-semweb-lifesci@w3.org from November 2007)

From: Michel_Dumontier <Michel_Dumontier@carleton.ca>
Date: Sun, 04 Nov 2007 13:16:03 -0500
To: Jonathan Rees <jar@creativecommons.org>
Cc: naty.vr@gmail.com, public-semweb-lifesci@w3.org
Message-id: <AB349814F1ECB143A5D4CD29C7A6456901F67332@CCSEXB10.CUNET.CARLETON.CA>
Jonathan,
 I regret not mentioning this in the first place, but thanks for the
time and effort to put the note together. It makes debating it so much
easier ;-)

In general, I think my objections are related to philosophical
recommendations that are difficult or impossible to follow. I would
expect that focusing on simple, practical guidelines with little
guesswork would be most effective.

> >  "A usage spec for a name is simply a graph that is designated as one
> > that specifies when the name should and shouldn't be used"
> >
> > Given that RDF semantics are open world, and RDF lacks the formal
> > vocabulary for negation or universal quantifiers, I don't see how one
> > can constrain usage, as no inconsistency can result from the addition of
> > new knowledge. The example that is provided is not constraining, but
> > rather states what we know about that particular entity, at a certain
> > time, presumably from a certain location on the internet.
> 
> I thought it was clear that I was not talking only logic.  You can
> specify all sorts of unmodeled things in natural language, and this
> language can go in an rdfs:comment. I'll try to make this more clear.

I just think it's limiting and less practical to specify how something
should _not_ or should _only_ be used, even in natural language. 

> > A major concern I have with the note is that it essentially says that
> > only the naming authority can make "defining" statements about some URI.
> > Such an approach would severely hinder people from reusing URIs, as they
> > may wish to make additional statements that are undoubtedly not covered
> > by the authority's definition. Such advocacy would simply lead people to
> > mint their own URIs, leading to heavy fragmentation of the semantic web,
> > in which only our knowledge about something might be limited due to
> > "see also" links between instances.
> 
> "Additional statements" are not defining and are not meant to impose
> usage constraints on ALL users of the name, even if they could. They
> are merely about, just as Austen's novel Persuasion is about
> persuasion and uses the word (name) 'persuasion', but doesn't define
> 'persuasion'. That doesn't mean the additional statements aren't
> believed; they may or may not be incorporated into any given theory
> (logical or otherwise).
> 
> You can always say what you like about any defined term; doing so
> does not change the definition.

In some model, yes.


> Perhaps we don't agree about what is meant by "defining" (although
> I'm puzzled that you use that word because I don't use it in the
> draft), and I need to be clearer about that.
> 
> > "The property rdfs:seeAlso specifies a resource that _might_ provide
> > additional information about the subject resource" [2]
> >
> > [2] http://www.w3.org/TR/2000/CR-rdf-schema-20000327/#s2.3.4
> >
> > Unless there is a stronger link between differently named resources,
> > such as owl:sameAs, it certainly can't be interpreted that they are the
> > same, thus the statements will not be merged. However, if the resource
> > points to another document making statements about the URI, or makes use
> > of owl:sameAs, this will lead to the merging of statements that might
> > go beyond the original "definitions" of any one authority.
> 
> I wish I could understand what you're saying. Can you give a concrete
> example?
> 
> A seeAlso does not constitute an assertion that you should believe
> what's at the other end. That would be reserved for something
> stronger like owl:imports (if I understand it correctly).

I think my issue is related to query answering across seeAlso links. If
two people make statements about the same thing but use different URIs,
in the absence of owl:sameAs, you will not discover additional
information about that resource that will lead to an inference.
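
As a rough illustration of that point, here is a minimal Python sketch.
The URIs and triples are invented for the example, and the merge is a
simple union-find over owl:sameAs links, not a real reasoner. A seeAlso
link leaves the two sets of statements separate; only sameAs pools them.

```python
# Hypothetical URIs and triples, invented for illustration only.
SAME_AS = "owl:sameAs"
SEE_ALSO = "rdfs:seeAlso"

triples = [
    ("ex1:Protein42", "ex:hasFunction", "ex:Kinase"),
    ("ex2:P42", "ex:locatedIn", "ex:Nucleus"),
    ("ex1:Protein42", SEE_ALSO, "ex2:P42"),
]


def find(parent, x):
    """Return the representative of x's equivalence class (with path compression)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def facts_about(triples, subject):
    """Collect (predicate, object) pairs for subject and everything owl:sameAs it."""
    parent = {}
    for s, p, o in triples:
        if p == SAME_AS:
            # Only sameAs links join two names into one equivalence class.
            parent[find(parent, s)] = find(parent, o)
    root = find(parent, subject)
    return {
        (p, o)
        for s, p, o in triples
        if p != SAME_AS and find(parent, s) == root
    }
```

Calling facts_about(triples, "ex1:Protein42") on the list above returns
only what was said about ex1:Protein42 itself. Appending the triple
("ex1:Protein42", SAME_AS, "ex2:P42") makes the ex:locatedIn statement
visible under either name.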


> > I don't believe that the statement "The declaration should be specific
> > enough to rule out incorrect usage, but not so specific that it
> > overcommits and fosters inconsistency or discourages reuse." is possible
> > to adhere to.
> >
> > Here are things that I consider:
> > A Universal Resource Identifier is a string of characters that denotes
> > the name of some resource.
> Sorry, I am saying that the URI doesn't *denote* the name, it *is*
> the name. And it can name anything, not necessarily a resource.

Denote: have as a meaning

I agree it can name anything.

> > 1 - create a URI that is consistent with the corresponding protocol. For
> > instance, HTTP URIs can only be composed of a certain set of characters
> > defined by [some url], and LSID URIs have their own specification
> > [another url], etc, etc...
> I can add language to this effect if you think it's important.  For
> now I just say you have to follow URI syntax. I had thought that
> would imply protocol-specific syntax
> 
> > 2 - reuse a URI if you believe that your use of that resource is
> > expected to be consistent with the original intent. In the absence of
> > expressive logics with negation, it will not be possible to
> > computationally check if the meaning is consistent.
> Correct. My proposal is to take the usage spec as defining, in
> natural language if necessary, the original intent.
> 
> > 3 - you might consider minting a URI that is identical in intent, but
> > you like to track your contributions (provenance). In this case, you
> > make statements to your URI, and should consider using owl:sameAs to
> > indicate that the two resources should be considered equivalent.
> It sounds like we have very different ways of dealing with provenance
> and trust, and it would probably be profitable to talk about these
> differences. As you aren't the first to mention this approach, it's
> probably important that we try to figure out what the differences are in
> underlying assumptions, since otherwise we'll just talk past each other.

Sounds like a plan.


> > Since a name isn't sufficient for understanding its meaning, we suggest
> > that you augment every RDF/OWL resource with:
> > 1 - a concise human readable label using rdfs:label in the language of
> > choice
> yes. i am reluctant to impose even more requirements/advice, but
> label is probably warranted.
> > 2 - a precise human readable definition using rdfs:comment in the
> > language of choice.
> yes, assuming there's any doubt that the logical definition doesn't
> capture all important information about usage (e.g. I think the
> pathology example doesn't need prose)

I understand what you're saying when talking about raw data, but I think
that for class based resources, human readable definitions are
essential.

> > 3 - RDF statements that you believe to be universally true about that
> > resource
> yes. I would have to define "universally": I think it means (from a
> formal viewpoint) that every theory that uses the name is asked to
> take the usage spec as axiomatic.
> > 4 - or point to documents that make statements about that resource using
> > rdfs:isDefinedBy.
> Fine, but if the statements are to be universally true why point to
> them? And what document-reference connective is strong enough to mean
> that the statements in the other document are to be axioms?
> owl:imports I guess.

Well, you believe them to be universally true... of course, you may be
wrong and what you say conflicts with another world view. Thus, we can
create a multitude of world views, and you pick the one you agree with
such that it is useful to derive meaningful inferences (i.e. import).

> I don't like isDefinedBy because it relates the thing to the defining
> document. But the thing just is; it doesn't need defining. It is the
> name that needs to be defined. E.g. you could have two names for the
> same thing, defined in different ways. Definition, IMO, is
> extralogical since it quantifies over *all* conforming theories.
> Definition is closer to a kind of modularity (axiom sets that get
> incorporated into many theories) and is therefore meta-meta.
I like isDefinedBy because you can find different axioms about that
entity. You may not agree with all those views, but hopefully there is
at least one. Thus, we have a more flexible approach to reuse.
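
The suggested annotations (label, prose definition, pointer to axioms)
could be emitted along the following lines. This is only a sketch: the
helper names, the definition text and the example.org document URL are
made up, and the output is plain N-Triples built by string formatting
rather than with an RDF library.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def literal(text, lang="en"):
    """Format a language-tagged N-Triples literal, escaping backslash, quote and newline."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"@{lang}'


def describe(uri, label, definition, defined_by):
    """Return N-Triples lines giving a label, a prose definition and a pointer to axioms."""
    return [
        f"<{uri}> <{RDFS}label> {literal(label)} .",
        f"<{uri}> <{RDFS}comment> {literal(definition)} .",
        f"<{uri}> <{RDFS}isDefinedBy> <{defined_by}> .",
    ]


# Hypothetical definition text and axiom document, for illustration only.
lines = describe(
    "http://ontology.dumontierlab.com/Protein",
    "protein",
    "A polymer of amino acids joined by peptide bonds.",
    "http://example.org/protein-axioms.owl",
)
```

Several isDefinedBy lines can be emitted for the same URI, one per
document, which is what allows a reader to pick the set of axioms they
agree with.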

> > As an example, I built a prototype HTTP URI resolver for the entities
> > defined in my most current OWL ontologies:
> >
> > http://134.117.55.46:8181/Protein ,
> >
> > where 134.117.55.46:8181 will eventually be ontology.dumontierlab.com
> >
> > In this way, a human can see the implied meaning, and an agent can
> > follow other documents to determine what has been said about it (at
> > least within my own knowledge base).
> 
> Tell me why you don't use the "eventual" term in the first place? If
> it's a resolution issue then something like resolution rules or a web
> proxy could help out here. If it's really a logical or provenance
> issue then I don't understand you (and I want to).
It's just a prototype deployed at a different server, that's all. It'll
all work soon enough with the right URIs.

> When I want to make provisional assertions so that they can be tried
> out (for consistency, linking, application behavior, etc.) I isolate
> them from other assertions by setting up a separate theory (graph,
> scope, ...). Entailment while I play around comes only from the
> provisional graph, and entailment in some other, better established,
> maybe shared theory is not threatened by the statements I make in the
> flakier place. 
I think the difference is that in each document I make different statements
about the entity - I don't believe them to be provisional, nor flaky.
They're assertions that reflect my knowledge, which may be best
partitioned depending on my requirements.


> It sounds like you're doing the same kind of thing
> but  within a single graph/theory and modulating names to simulate
> multiple graphs (you would say I resort to multiple graphs to
> simulate renaming). I would like to hear how these approaches are
> substantially different (not just syntactically or organizationally).
> If choice between these two approaches bleeds through into a naming
> discussion then that's a threat to sharing and we need to talk about it.
> Tell me I'm wrong...
I'm not sure I really understand. I reuse the same name in different
contexts. These may or may not be altogether compatible, unless I
explicitly import them together.
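
A small Python sketch of what I mean, with invented document names and
triples: the same name (ex:Protein) appears in two documents, and the
statements only come together in a document that explicitly imports
both. The `imports` table stands in for owl:imports here; it is not an
OWL implementation.

```python
# Hypothetical documents and triples, for illustration only.
graphs = {
    "structure.owl": {("ex:Protein", "rdfs:subClassOf", "ex:Molecule")},
    "function.owl": {("ex:Protein", "ex:hasRole", "ex:Catalyst")},
}
# Stand-in for owl:imports: document name -> documents it imports.
imports = {"combined.owl": ["structure.owl", "function.owl"]}


def closure(name, graphs, imports, seen=None):
    """Return the triples of a document plus those of everything it imports, transitively."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against import cycles
        return set()
    seen.add(name)
    result = set(graphs.get(name, set()))
    for imported in imports.get(name, []):
        result |= closure(imported, graphs, imports, seen)
    return result
```

closure("structure.owl", graphs, imports) contains one statement about
ex:Protein, while closure("combined.owl", graphs, imports) contains both.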

> > What remains lacking is a method by
> > which we can discover what other people have said about this resource.
> Excellent, I'd love to see a protocol for this. Currently I use google.

I'm not sure how well one can retrieve axiomatic statements about a
given RDF/OWL URI using Google.

http://www.google.ca/search?q=http%3A%2F%2Fontology.dumontierlab.com%2FProtein&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

Moreover, I don't think Google indexes triple stores.

> > That's why I'm fond of the http://lsrn.org (centralized) solution in
> > which multiple data providers can register as a resolver for a given base
> > URI, so that people and agents can find out more about it (via HTML/RDF
> > documents)[3].
> > Moreover, it allows third party data providers to
> > register a public identifier, and resolve it (in RDF documents) prior to
> > the authority having to do so! Analogously, in the LSID protocol
> > (distributed), resolvers can register with the authority itself and
> > provide different information.
> Sounds a little like wikipedia. Where is the LSRN protocol
> documented? 
http://lsrn.org/faq.html

> How do I add to the registry?
you edit.

> Where is this LSID multi-resolver protocol documented?
ftp://ftp.omg.org/pub/docs/dtc/04-05-01.pdf

> Where does the LSID resolver list come from, and how do I add to it?
http://biomoby.open-bio.org/index.php/for-developers/
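
To show the general idea of several providers registering for one
namespace, here is a toy in-memory sketch in Python. It is not the LSRN
or LSID protocol; the function names, provider names and example.org
URL templates are all invented. Only the identifier CAS:58-08-2 comes
from the reference [3] above.

```python
from collections import defaultdict

# Hypothetical in-memory registry: namespace -> list of (provider, URL template).
registry = defaultdict(list)


def register(namespace, provider, url_template):
    """Record that provider can resolve identifiers in namespace."""
    registry[namespace].append((provider, url_template))


def resolve(public_id):
    """Return every registered location for an id of the form 'NAMESPACE:local'."""
    namespace, _, local_id = public_id.partition(":")
    return [
        (provider, template.format(id=local_id))
        for provider, template in registry.get(namespace, [])
    ]


# Two third-party providers register for the same namespace (invented URLs).
register("CAS", "provider-a", "http://a.example.org/cas/{id}.rdf")
register("CAS", "provider-b", "http://b.example.org/compound?cas={id}")
```

resolve("CAS:58-08-2") then lists one location per registered provider,
so an agent can consult any or all of them, whether or not the naming
authority itself serves anything for that identifier.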

> >
> > [3] http://lsrn.org/CAS:58-08-2
> >
> > Thus, I dislike anything that is "authoritative" or "monopolizing" if it
> > handicaps URI reuse and precludes the discovery of additional
> > information about that resource.
> Absolutely. But I don't see anything in the apparatus of OWL and/or
> the semantic web that precludes you from looking anywhere for any
> kind of information. E.g. I could keep track of a set of my favorite
> SPARQL endpoints to consult when I have questions.  I think what
> you're saying is that you like having a nonauthoritative but central
> point of contact, which to me sounds like a service analogous to
> google (uncurated), DMOZ (curated), or wikipedia (contribution-based).
> If this is so I need to learn more about LSRN.
It's probably worth investigating.

> The alternative to the "authority" of first published
> dictionary definition (usage spec) is consensus process inducing what
> a term means. Although natural language is mostly defined by use, not
> prescription, I don't think this is viable for computational purposes
> (not necessarily limited to deduction) and it doesn't agree with my
> reading of how the scientific literature best develops. Yes, even
> when there are explicit definitions, there is overloading (others
> define the term differently), 
Right, because it's hard to make definitions that stand the test of
time.

> but good scholarship couples use of a
> technical term with a reference to the source that made the
> definition you're assuming. 
And that definition is generally in natural language.

> The URI in effect provides reference (to
> concept) and reference (to publication) in one module. The
> "authoritative" aspect here is just priority of publication, coupled
> with the uninteresting detail of collision avoidance through domain
> ownership. In a more recent version of the draft I've further
> downplayed the idea of "authority" - I now call it "minting
> authority" and explain that you lose control once you publish - your
> publication has to stand on its own unless a special contract is
> otherwise established that gives you some special ongoing
> relationship with the name. I hope we agree in this.
Agreed.

> Should one expect a referral to a registry from this document? I
> would have said this is out of scope.
Yes, I agree. It's related, and we can talk about it in a new thread.


> >
> > Just my two cents,
> 
> Your cents matter to us.
Thanks!!!

-=Michel=-
Received on Sunday, 4 November 2007 18:16:41 UTC