RE: New version of URI Declarations [Usage scenarios]

> From: Pat Hayes [mailto:phayes@ihmc.us]
> [ . . . ]
> There is a problem (more down-to-earth) with the notion of
> 'should be accepted', as well. This sounds like it's
> impossible to enforce,

Correct. It is impossible to enforce because it is impossible to
know the user's intent. For example, if you use my URI to denote
the moon in some statements that you publish, there is no way for
other people's software that reads your statements to distinguish
between: (a) you chose to accept the assertions; and (b) you did
not choose to accept the assertions and violated this
architectural guidance. Therefore, your choice to use my URI to
denote the moon should be taken as prima facie evidence that you
agreed with the core assertions in my URI declaration.  If you
don't, you should use a different URI.

> and worse, that there are going to be
> cases where it shouldn't be enforced.  Suppose A publishes
> some rdf R and says it is a declaration of the URI x:y, and
> also says (perhaps using English or a controlled vocabulary)
> that x:y is supposed to denote, say, the moon; and suppose
> that R says that x:y is made of cheese. Are we supposed to
> accept this, just because this twit has called it a
> declaration?

No. But if you *choose* to use that URI to denote the moon then
you are indicating that you *have* accepted the assertions.
Therefore, if you do not wish to accept the assertions, then you
should use a different URI when you make statements about the
moon. You could even mint your own URI based on the subset of my
URI's assertions that you do choose to accept, in a manner
comparable to the URI substitution technique described here:
http://dbooth.org/2007/splitting/#urisub
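
As a rough sketch of that idea (not the exact procedure described at
the link above), suppose triples are plain Python tuples. The URIs,
the "ex:" predicates and the helper name substitute() are all
hypothetical, chosen only for illustration:

```python
# Hypothetical URIs, for illustration only.
ORIGINAL = "http://example/alice#moon"
MINTED = "http://example/me#moon"

# Assertions from the original URI declaration, as
# (subject, predicate, object) tuples. Predicates are made up.
declaration = [
    (ORIGINAL, "ex:orbits", "ex:Earth"),
    (ORIGINAL, "ex:madeOf", "ex:Cheese"),
    (ORIGINAL, "rdfs:label", "the moon"),
]


def substitute(triples, old, new, accept):
    """Keep only the accepted triples, rewriting the old URI to the new one."""
    def swap(term):
        return new if term == old else term

    return [tuple(swap(term) for term in triple)
            for triple in triples if accept(triple)]


def accept(triple):
    """Accept everything except the assertion we disagree with."""
    return triple[1] != "ex:madeOf"


my_declaration = substitute(declaration, ORIGINAL, MINTED, accept)
```

The new declaration carries only the assertions you accept, under a
URI that you control, so nobody reading your statements has to guess
whether you agreed with the original declaration.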

>
>     These assertions are intended to delimit the range of
>     possible interpretations of what the denoted resource might
>     be -- ideally uniquely determining the resource, but (a)
>     that depends on the quality of the assertions, and (b) as
>     you pointed out, ultimately there is no way to ensure that
>     a user's actual interpretation is the same as the minter's
>     intent.
>
> Right, so the 'ideally' here isn't even an ideal: it's
> impossible.

Yes and no.  I completely agree that it is impossible in
general, as you've aptly pointed out over the past couple
of years.  But for a given application, it may well be
adequate in uniquely determining the resource.  Thus,
someone publishing a URI declaration should strive to
make their assertions uniquely determine the resource
as best they can for the range of applications that they
anticipate, realizing that it is not possible to make
one that will be uniquely determining for all applications.

>   SCENARIO 1: Fred wishes to publish some RDF assertions
>   about a particular protein. He notices that Alice, Beatrice and
>   Carl have already published assertions about the protein, and
>   they all use the same URI to denote that
>   protein: the URI minted by Alice. Fred notices that if he
>   uses Alice's URI to denote the protein, his assertions will be
>   logically inconsistent with some of Alice's assertions,
>   although they are logically consistent with Beatrice and Carl's
>   assertions. He wonders whether he should publish his assertions
>   using Alice's URI -- and post a blog entry noting that his
>   assertions should not be used in conjunction with Alice's
>   assertions -- or mint a new URI.
>
>   Question: Should Fred use Alice's URI?
>
>   Answer: No.  He should mint a new URI and indicate the
>   relationship (not owl:sameAs) to Alice's URI -- at least
>   rdfs:seeAlso.
>
> Whoa. I think this is crazy. The scenario says that the URI
> denotes the protein, so let's accept that it indeed does.  (The
> 'attachment' to the particular protein may be done for example
> by relating the URI to a standardized protein database accepted
> by the community.) If this is so, then the only way that Fred
> and Alice can be inconsistent is if they actually disagree
> about the facts of the matter.  Perhaps Fred has a more
> up-to-date value for the molecular weight than Alice had, or
> something. But in this case, I think Fred should use the same
> URI to refer to the protein. Removing the inconsistency by
> using a different name is like saying: Oh, I guess we must be
> talking about different proteins, then. But they aren't, right?
> They are talking about the same thing, but they disagree on the
> facts. This is a case where some published RDF is *wrong*, and
> should be corrected: or at least, the real disagreement should
> be resolved.

The problem with Fred using Alice's URI is that there is no
way in general to distinguish between: (a) Fred attempting to talk
about a different resource than Alice was talking about; and (b)
one of them being wrong.
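
To make the two options concrete, here is a minimal sketch, assuming
triples are plain Python tuples. Only rdfs:seeAlso is a real term;
the protein URIs, the molecularWeight property, the weight values and
the conflicts() helper are hypothetical:

```python
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Hypothetical URIs and property, for illustration only.
ALICE = "http://example/alice/proteins#p1"
FRED = "http://example/fred/proteins#p1"
WEIGHT = "http://example/vocab#molecularWeight"

# Alice's published assertion (made-up value).
alice_graph = {(ALICE, WEIGHT, "66400")}


def conflicts(graph, prop):
    """Return subjects with more than one value for a single-valued property."""
    values = {}
    for s, p, o in graph:
        if p == prop:
            values.setdefault(s, set()).add(o)
    return {s for s, objs in values.items() if len(objs) > 1}


# Option 1: Fred reuses Alice's URI. The merged graph is inconsistent,
# and a reader cannot tell disagreement from a different referent.
reuse = alice_graph | {(ALICE, WEIGHT, "66500")}

# Option 2: Fred mints his own URI and links it with rdfs:seeAlso.
minted = alice_graph | {
    (FRED, WEIGHT, "66500"),
    (FRED, RDFS_SEE_ALSO, ALICE),
}
```

Here conflicts(reuse, WEIGHT) reports Alice's URI, while
conflicts(minted, WEIGHT) is empty: the two data sets can be merged,
and the rdfs:seeAlso link still records that they are related.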

From an architectural perspective, there is no objective notion
of right or wrong: some assertions are merely useful to some
applications, while others are useful to other applications. Even
assertions that you and I might think of as "wrong" may be
adequate and useful for many applications. For example, highway
mapping assertions that presume the earth is flat may be
perfectly adequate for many guidance applications.

> [ . . . ]
>   SCENARIO 3: Erin has accumulated some observations
>   about a different protein, and she wishes to publish them as
>   assertions. Some of them are merely assertions that serve to
>   uniquely identify the protein that she wishes to talk about.
>   Others are observations about the protein's behavior.
>
>
> It's not obvious that this distinction makes sense. Or at any
> rate, it's not obvious that there is a particular category of
> facts that serve only to pin down reference.

I agree that there may be no way in general to distinguish
between assertions that serve to identify something and
other assertions.  This is why the "core assertions" in a URI
declaration are, by fiat, viewed as identifying assertions.

>
>   She is very confident about the correctness of the
>   first set of assertions, but not so confident about the
>   assertions about the protein's behavior. She mints a URI
>   http://example/erin/proteins#p4 for the protein and wonders
>   whether she should publish all of her assertions in one OWL
>   document at http://example/erin/proteins, or separate them
>   into two documents.
>
>   Question: Should Erin put all of these assertions in
>   a single document?
>
>   Answer: No.  Erin should separate them into two documents.
>
>
> Well, I agree that is good practice, but because she is more
> confident about some than about others. That's the reason for
> the distinction, not that some pin down reference and others
> are mere facts.
>
> You have assumed that the reference-nailing assertions are
> also the ones that are known with confidence, but that begs
> an important question.

Yes, I made that simplifying assumption in this scenario.
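
Under that assumption, the separation in the scenario can be pictured
roughly as follows. This is only a sketch: the protein URI is the one
from the scenario, but the "ex:" predicates, the values and the
split() helper are hypothetical, and triples are plain Python tuples:

```python
P4 = "http://example/erin/proteins#p4"  # URI from the scenario

# Hypothetical predicates and values, for illustration only.
assertions = [
    (P4, "ex:sequence", "MKTAYIAK"),
    (P4, "ex:foundIn", "ex:HomoSapiens"),
    (P4, "ex:bindsTo", "ex:ReceptorX"),
]

# By fiat, Erin designates which predicates carry the core
# (identifying) assertions of her URI declaration.
CORE_PREDICATES = {"ex:sequence", "ex:foundIn"}


def split(triples, core_predicates):
    """Partition triples into (core declaration, other observations)."""
    core = [t for t in triples if t[1] in core_predicates]
    other = [t for t in triples if t[1] not in core_predicates]
    return core, other


declaration_doc, observations_doc = split(assertions, CORE_PREDICATES)
```

Someone who uses Erin's URI is then taken to accept only what is in
declaration_doc; the behavioral observations can be accepted or
rejected separately.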

> Consider a case where some empirical
> results are available which are very confidently true, but
> it's not clear which protein they are relevant to (perhaps
> they were extracted from a biopsy which may have mixed two
> kinds of tissue: if this were genes instead of proteins, and we are
> talking about cancer typing, this is a real problem.) Now
> what do we 'declare' ?

You can declare a URI for the substance that was observed: a
potential mix of two kinds of tissue.



David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do not represent the official views of HP unless explicitly stated otherwise.

Received on Wednesday, 5 March 2008 07:07:35 UTC