Re: binding a URI to a thing

Hi Jonathan,

Sorry it's taken me a while to get back to you on this, but I'll try to
address your questions below.

On Mon, 2010-06-14 at 12:41 -0400, Jonathan Rees wrote: 
> Here is a very long answer to David's question of why don't I understand
> "binding a URI to a thing".
> 
> Correct me if I'm wrong but I think what you're trying to do is to
> define a particular way to connect URIs to "things" that is governed
> "normatively" by documents ("URI declarations" or maybe "URI
> declaration core assertion documents") in the same way that, say,
> media type names
> or HTTP header names are controlled normatively by their registrations
> with the IETF.  

Yes, I think so, in that it sets up certain social expectations.  But
the requirements tend to be more SHOULDs than MUSTs.  For example, the
URI declaration that is used should normally be the one at the
follow-your-nose location, but there are exceptions, as explained in
http://dbooth.org/2009/lifecycle/

> I
> can see how you would want to get to this point - it would be the
> ultimate in democratic distributed extensibility, allowing anyone to write
> a spec and expect others to either accept it or reject it - but I think there
> are a number of obstacles to overcome before the idea can even begin
> to make sense.
> 
> Let me call the set of conventions you're trying to establish "UDMP"
> for "URI declaration meta-protocol".

Okay, but let me be clear that I did not originate the idea of URI
declarations.  I merely coined the term as a concise way to refer to the
existing concept, and I have also sought to crystalize and explain how
it works.  But the basic ideas have been common in the semantic web
community for quite a while.  For example, Dan Connolly's paper on "A
Pragmatic Theory of Reference for the Web" recommends:
[[
   1. To mint a term in the community, choose a URI of the form doc#id
and publish at doc some information that motivates others to use the
term in a manner that is consistent with your intended meaning(s).
   2. Use of a URI of the form doc#id implies agreement to information
published at doc.
]]

And Sandro Hawke wrote in his RDF 2 Wishlist post: " . . . the use of
particular IRIs as predicates implies certain things, as defined by the
IRI’s owner".

And "Cool URIs for the Semantic Web" recommends:
[[
1. Be on the Web.
    Given only a URI, machines and people should be able to retrieve a
description about the resource identified by the URI from the Web.
]]

> 
> 1. In order for UDMP and the URI declarations it deputizes to be
> considered normative, there has to be some kind of signal that says
> you buy into the idea, and you've provided no such signal. That is,
> right now I can use a URI however I like, and I can deny that I
> treat or am obligated to treat the UDMP-specified "core assertions" as
> normative.
> To get UDMP to a point where it's as binding on anyone as, say, the
> HTTP protocol or XML specification (and that's not very binding!),
> you need to use a political process such as the W3 Rec
> track. And that means lots of review and lots of politics. The way you
> talk about
> UDMP, it sounds like a done deal.  That just isn't true.

Yes, I agree that it does need to be better recognized and publicized as
the way the semantic web is supposed to work.  And there are still a few
doubters, so in that sense I agree that it isn't a "done deal" yet. 

But I *do* think it is a done deal in the sense that it seems pretty
clear that this is the way the semantic web should work.  After all,
what other viable option is there?  The "competing definitions" approach
is clearly architecturally inferior, as explained in 
http://dbooth.org/2008/irsw/

> 
> 2. Engineering specifications are normative in some specific context -
> they do not apply to everything, but to particular classes of
> engineered artifacts such as HTTP servers or robot arms as they
> perform particular tasks.  To make it possible to consider UDMP
> normative you have to say to what kind of artifact it applies.  One
> might have an HTTP-compliant server or client; what kinds of things
> might be UDMP-compliant -- SPARQL endpoints maybe?

The context is the semantic web.  As a very close analogy, think of how
the AWWW describes principles for how the web should work, although it
transcends HTTP specifically.

> 
> 3. You can't hold someone to a statement of the form "a is bound to b"
> under UDMP until you specify how one would test
> adherence to such a statement.  Like legal
> contracts, engineering specs only work to the extent they are
> operationally falsifiable: any party to the agreement has to be
> able to check, by observation, to see whether the protocol is
> being followed. But there is no way to test whether an agent
> "takes statements as true by definition".

Of course there is: look at what statements the agent has asserted.  If
you're checking compliance you have to assume that you are permitted to
observe the feature you're checking.  You cannot check a bank's
regulatory compliance without looking at its books.  

Also, non-adherence may be externally observable by users in a variety
of ways.  For example, if someone provided an application that behaved
in a way that was in violation of "a is bound to b" then it may confuse
its users.  If X is a necessary consequence of "a is bound to b", and
the application claims that X is false, then the problem shows up.
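
As a rough illustration of that kind of check, here is a minimal sketch.
Everything in it is an assumption made up for illustration: the example
triples, the names CORE_ASSERTIONS and find_violations, and the idea of
recording an application's claims as a mapping from triples to
true/false.  It is not part of any specification.

```python
# Hypothetical core assertions from a URI declaration, as (s, p, o) triples.
CORE_ASSERTIONS = {
    ("ex:a", "rdf:type", "ex:Building"),
    ("ex:a", "ex:locatedIn", "ex:WashingtonDC"),
}


def find_violations(core_assertions, agent_claims):
    """Return the core assertions that the agent explicitly claims are false.

    agent_claims maps a triple to True (asserted) or False (denied).
    Triples the agent says nothing about are not counted as violations.
    """
    return sorted(t for t in core_assertions if agent_claims.get(t) is False)


# Hypothetical application behavior: it denies one necessary consequence.
claims = {
    ("ex:a", "rdf:type", "ex:Building"): True,
    ("ex:a", "ex:locatedIn", "ex:WashingtonDC"): False,
}
print(find_violations(CORE_ASSERTIONS, claims))
```

The sketch only inspects what the agent has observably claimed, which is
the point: silence is not a violation, but a denial of a core assertion
is detectable.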

On the other hand, one can never control what an RDF consumer chooses to
do with the statements that it reads.  If the RDF consumer chooses not
to take the core assertions of a URI's declaration as true by
definition, then it is free to do so, *but* it risks violating the
social expectations of its user.  

The URI declaration does not *force* an RDF consumer to do anything in
particular.  Rather, it sets up a social expectation.  It is like a
public declaration in the town square by the senior Montague, saying
"Henceforth the name 'Romeo Montague' shall refer to my first-born son".
Thereafter, people can rightly expect that when you say 'Romeo Montague'
you are referring to the senior Montague's son.   If you make a
statement about 'Romeo Montague' but you are really talking about
someone else, people will be confused.  And if you read a statement that
refers to Romeo Montague, but you interpret it as referring to someone
else, then you may not be understanding the intended meaning of the
statement, although you are free to do so.

> 4. As I said I don't understand your use of "binding" (or "interpret")
> - you are not
> providing any way to check whether "a is bound to b" is true, nor are
> you telling me what consequences such a statement is supposed to have,
> or any way to make use of it.  If you are trying to use the word
> "binding" in a way that refers to some traditional use in science and
> engineering, as opposed to making up a new sense, tell me what comes
> closest:
> 
> 4a. In empirical linguistics, the binding of a phrase (such as a
> pronoun or proper name) to another noun phrase is a hypothesis about
> how a population of speakers behaves.
> 
> 4b. In logic, lambda calculus, and programming language semantics, one
> speaks of binding a name (variable, etc.) either to another phrase (as
> in linguistics) or to some member of some specified semantic domain
> (in denotational or model-based semantics). ("Thing" does not qualify as
> a semantic domain, as you have not specified it constructively.)

Yes, 4b is the sense I mean.  But I don't know why you say that "Thing"
does not qualify as a semantic domain.  Is resource not a semantic
domain?  The RDF Semantics document says:
http://www.w3.org/TR/rdf-mt/#urisandlit
[[
The things denoted are called 'resources', following [RFC 2396], but no
assumptions are made here about the nature of resources; 'resource' is
treated here as synonymous with 'entity', i.e. as a generic term for
anything in the universe of discourse. 
]]

Is owl:Thing not a semantic domain?  Or rdfs:Resource?

Furthermore, the binding is indirect: any particular RDF interpretation
http://www.w3.org/TR/rdf-mt/#interp
directly binds the URI to one specific resource, but until a particular
interpretation is selected, the binding is only to an ambiguous set of
possible resources, constrained by the graph in question.  In other
words, you can think of the binding between a URI and a resource as a
two-step mapping, as explained here:
http://dbooth.org/2009/denotation/
This ambiguity is the logical consequence of the RDF semantics.
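
To make the two-step idea concrete, here is a toy sketch.  The tiny
universe, the two constraint functions, and the name possible_referents
are all hypothetical; real RDF interpretations cannot be enumerated like
this, so treat it only as an illustration of how a graph narrows the set
of possible resources without selecting exactly one.

```python
from itertools import product

# Hypothetical toy universe of resources, invented for illustration.
UNIVERSE = ["white_house", "us_capitol", "a_dog"]
BUILDINGS = {"white_house", "us_capitol"}
AT_1600_PENNSYLVANIA_AVE = {"white_house"}

URI = "http://example.com/whitehouse"


def is_building(interp):
    """Constraint from a graph that says the URI denotes a building."""
    return interp[URI] in BUILDINGS


def is_at_1600(interp):
    """Constraint from a graph that also gives the building's address."""
    return interp[URI] in AT_1600_PENNSYLVANIA_AVE


def possible_referents(uri, uris, universe, constraints):
    """Collect every resource that some permitted interpretation assigns to uri."""
    referents = set()
    for choice in product(universe, repeat=len(uris)):
        interp = dict(zip(uris, choice))  # step 1: pick one interpretation
        if all(check(interp) for check in constraints):
            referents.add(interp[uri])  # step 2: it binds uri to one resource
    return referents


# One assertion leaves the binding ambiguous; a second one narrows it.
print(possible_referents(URI, [URI], UNIVERSE, [is_building]))
print(possible_referents(URI, [URI], UNIVERSE, [is_building, is_at_1600]))
```

With only the first constraint the URI could denote either building;
adding the second leaves a single candidate.  More assertions shrink the
set, but nothing in the sketch guarantees it shrinks to one.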

> 
> 4c. If by "binding" you are referring to "interpretation" (or
> "satisfying interpretation") in the
> model-theoretic sense, I think this is a misreading of the nature and
> purpose of model theory.  Usually in model theory you're quantifying
> over interpretations, either in consistency proofs (existence of *any*
> satisfying interpretation) or in studying entailment (properties that
> hold of *all* satisfying interpretations).  It is indeed permitted in
> mathematical tradition to interpret terms such as x in f(x) however you
> like, but interpreting x as a dog is done either as a trick to
> emphasize just how
> arbitrary the grounding is,
> or it's done as part of a testable theory involving physical
> observation.  Ultimately you have to either do logic for fun, or
> ground your interpretations empirically, or just not worry about it
> and let interpretation be a completely private or application-specific deal.
> 
> An assertion <http://example.com/whitehouse> rdfs:comment "The White
> House" could be quite loosely described as binding a URI to the white
> house, in the way that the sentence "The dog had a flea." binds "it"
> to "the dog" (or the dog? which one?) so that an immediately following
> "It had three legs." is clearly a statement about the dog, not the
> flea.  The effect of saying that the RDF statement (graph,
> declaration, whatever) binds the URI to the White House, even if made
> operational (as the similar linguistic statement is), would be just
> shuffling words around and I think would get us all confused.  Give me
> an algorithm, not muddle of this sort.
> 
> 5. Let me try to imagine ways in which your theory could be *made*
> operational:

Operational in what sense?  It is already common practice.

> 
> 5a. By translation to natural language.  UDMP would specify an
> algorithm for converting an RDF graph into natural language text, and
> assuming agreement to UDMP you try to hold the other party (the one
> who deployed some artifact that has transmitted the URI) to the NL
> version of the graph that is the URI declaration.
> 
> In order for this to work, you and the other party have to agree on
> all the inputs to the algorithm - that is, how all the URIs resolve,
> the ones involved in UDMP itself together with all the ones
> transitively for UDMP on URIs mentioned transitively.  For example, if
> the algorithm depends on doing GET to get wa-representations from some
> URIs, then the two parties might get different wa-representations and
> may not be able to agree about what agreement they've entered into.
> And then if one gets a 404 and the other doesn't...  No sane person
> would enter into such a meta-agreement without a very clear idea of
> what they're being held to, and we don't have any good explanation (yet) of
> how that can come about.

I think you're missing the point of this setting up social expectations
rather than a clear-cut, absolute contract.  There are degrees of social
expectations.  If a social expectation is very strong then a court may
enforce it.  Short of that, a party who violates the protocol may still
be acting antisocially.  But there is always judgment involved in social
expectations and adherence to them.

> 
> Of course this is why I've been focusing on understanding URIs for
> documents as a first step - but until we have a protocol for
> communicating promises of HTTP behavior stability, or some other way
> to refer to documents with agreed on properties, it seems premature to
> invest much in designing a protocol like UDMP that depends so deeply
> on it.
> 
> 5b. You might try to make UDMP normative for a SPARQL endpoint.  Maybe
> an endpoint conforms to UDMP if whenever a URI is seen in a SPARQL
> response, the endpoint can be "held to" the UDMP URI declaration for
> that URI.  Then we define "held to" in some way, e.g. an endpoint can
> be held to a graph (e.g. the URI declaration) might mean that the
> endpoint never appears (on the basis of SPARQL responses) to "believe"
> something in contradiction to the graph that is the combination of the
> endpoint's graph and the declaration.  (or maybe all declarations it
> has ever seen, or something.)
> 
> "Held to" would then assume some particular logic, so that both
> parties to the UDMP agreement agree, ahead of time, to what the
> endpoint can be held to.  ABLP logic is a good example of a protocol
> that does exactly this, e.g. it builds in modus ponens so that if A
> says 'p implies q' and A says 'p', then A can be held to 'q' even
> though it didn't say it explicitly. But what logic would you pick - RDF,
> RDFS, Common Logic, or one of the many OWL dialects?
> 
> The protocol will have to explain the intended lifetime of the
> agreement, or else there will be disputes over that.
> 
> 6. Even if you could do it, committing to a single interpretation is a
> bad idea.  

It doesn't require committing to a single interpretation.  It *bounds*
the permissible interpretations.  This is how RDF semantics works.

> It is of great value to be able to argue about the
> interpretation of words and to change one's mind.  While naively it
> might seem like ambiguity and judgment are annoyances, and of course
> they are in many instances, I think multiple interpretations,
> provisional interpretation, and reinterpretation are the essence of
> much of the human endeavor, especially in science and law.  This is
> the way progress is made: you explore consequences, change your mind,
> and change your assumptions.

That's fine as long as you're within the bounds that are determined by
the semantics.  But if you go outside of those bounds then you violate
social expectations.

> 
> Remember that the point of using a URI is for communication between
> a sender and a receiver, and it may be that neither one is the URI owner.
> Any documentation provided by the owner is there only as a coordination
> point, like a dictionary. 

Yes!  That is the basic idea.

> If the sender uses the URI "wrong" the
> receiver will want to
> figure out what the sender really meant, even if that is at variance with
> something the URI owner said. 

Yes!  And that is fine.  That is no different than one human writer
misusing a word and another reader attempting to understand what the
writer meant, recognizing that the word was misused.  As stated in
http://dbooth.org/2009/lifecycle/#event2 
[[
Statement author responsibility 3: Use of a URI implies agreement with
the core assertions of its URI declaration.
]]
and in the description of the RDF consumer's responsibilities:
http://dbooth.org/2009/lifecycle/#event3
[[
However, the consumer MAY select a different declaration. For example: 
      * . . . 
      * If the consumer believes that the current declaration has been
        compromised . . .  then the consumer might select an alternate
        declaration.
]]
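
A minimal sketch of that consumer-side choice, assuming a hash URI of
the form doc#id so that the follow-your-nose location is simply the URI
with its fragment removed.  The function names, the
believed_compromised flag, and the alternate argument are hypothetical;
how a consumer decides that a declaration has been compromised is left
entirely to its own judgment.

```python
from urllib.parse import urldefrag


def follow_your_nose_location(uri):
    """Return the document part of a hash URI (doc#id -> doc)."""
    doc, _fragment = urldefrag(uri)
    return doc


def select_declaration(uri, believed_compromised=False, alternate=None):
    """Pick the declaration location a consumer would use for uri.

    By default this is the follow-your-nose location.  If the consumer
    believes that declaration has been compromised and knows of an
    alternate, it MAY use the alternate instead.
    """
    if believed_compromised and alternate is not None:
        return alternate
    return follow_your_nose_location(uri)


print(select_declaration("http://example.com/doc#id"))
print(select_declaration(
    "http://example.com/doc#id",
    believed_compromised=True,
    alternate="http://example.org/archived-doc",
))
```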

> This is the difference between prescriptive
> grammars and descriptive ones - and the prescriptivists lose most of
> the time, because they have no stake and no power.
> 
> 7. Other theories of "URI meaning" are actually in use and we can
> either make use of them or parameterize the AWWSW project over them -
> thus there is no particular reason to commit to UDMP or any other
> protocol, even if we did agree to it and fleshed out the details.  For
> instance, Harry has a view that's completely unlike yours, but which
> is internally consistent and sensible.  Pat has yet another view,
> TimBL another.  

Really?  Can you point me to these views?

> If UDMP were further developed, both technically and
> socially, and had a following, 

Huh?  AFAICT, the general practice behind URI declarations is *already*
widely accepted by the community.  AFAICT, "Cool URIs for the Semantic
Web" is accepted best practice.  How much more of a following do you
want?  

> I would pay it more attention, but for
> now I have to continue saying it doesn't make sense.

In what sense does it not make sense to you?  If you read the process
for determining resource identity that is described in
http://dbooth.org/2010/ambiguity/paper.html#part3 
it is basically describing what people already *do*.

Please take a look at my paper from this year's SemTech conference:
http://dbooth.org/2010/ambiguity/paper.html
It is really not proposing anything more than what is already a logical
consequence of: (a) the RDF Semantics; and (b) what people already do.

It seems like we are using similar words but speaking different
languages, and I'm struggling to figure out how to bridge them.
Hopefully that paper will help to explain better what I mean.  

thanks!


-- 
David Booth, Ph.D.
Cleveland Clinic (contractor)
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of Cleveland Clinic.

Received on Friday, 25 June 2010 05:33:33 UTC