RE: Yorick Wilks on Semantic Web & httpRange-14 from David Booth on 2012-05-19 (www-archive@w3.org from May 2012)

From: David Booth <david@dbooth.org>
Date: Fri, 18 May 2012 22:31:15 -0400
To: Larry Masinter <masinter@adobe.com>
Cc: Henry Story <henry.story@bblfish.net>, www-archive <www-archive@w3.org>
Message-ID: <1337394675.2232.93587.camel@dbooth-laptop>

Hi Larry,

On Thu, 2012-05-17 at 14:27 -0700, Larry Masinter wrote:
> > I think there is a place, as the A, B, C use case in your last message
> > illustrates.  The other channel of information is the URI definition
> > provided by the URI owner.  That convention provides an efficient,
> > scalable way for parties A, B and C who know nothing about each other to
> > easily agree on a common definition if they choose to do so.  This is a
> > useful benefit, even if it does not go so far as to ensure that they are
> > all giving the same meaning to that URI.
> 
> How does that work? What convention? 

Something like UDDP, which is an attempt at formalizing what has been
called 'follow your nose':
http://www.w3.org/wiki/UriDefinitionDiscoveryProtocol

> So you add "D" as the
> "owner" of the URI "slithy toves".   And D wants to tell the world
> "when you say 'slithy toves', it means something like a slimy toad but
> scarier"
> as D's definition.

Yes.

> 
> What is the "efficient, scalable" way in which A, B and C communicate
> in order to all agree to use D's definition? 

They each independently follow the convention, which mostly comes down
to dereferencing the URI to look for a URI definition from the URI
owner.

> How is their agreement
> "easy" ? 

It is easy because they do not need to communicate with each other in
advance.  They merely need to dereference the URI to look for its
definition.
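
A rough Python sketch of the lookup that each party performs on its
own may make this more concrete.  It is not the UDDP procedure itself:
the function names, the Accept header and the example.org URI are all
assumptions made for illustration, and the fetch step is passed in as
a parameter so the sketch can be tried without a network.

```python
from urllib.parse import urldefrag
from urllib.request import Request, urlopen


def fetch_over_http(document_url: str) -> str:
    """Retrieve a document over HTTP (the Accept header is an illustrative choice)."""
    request = Request(document_url, headers={"Accept": "text/turtle"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def lookup_definition(uri: str, fetch=fetch_over_http) -> str:
    """Dereference a URI to obtain whatever its owner published for it."""
    # A fragment identifier is never sent to the server, so drop it first.
    document_url = urldefrag(uri).url
    return fetch(document_url)


# Hypothetical stand-in for D's web server, so the demo runs offline.
published_by_d = {
    "http://example.org/terms": "slithy toves: like a slimy toad but scarier",
}
offline_fetch = published_by_d.__getitem__

# A, B and C each run the same lookup without contacting one another.
uri = "http://example.org/terms#slithy-toves"
seen_by_a = lookup_definition(uri, fetch=offline_fetch)
seen_by_b = lookup_definition(uri, fetch=offline_fetch)
seen_by_c = lookup_definition(uri, fetch=offline_fetch)
```

All three end up holding the same text from D, and the only thing they
needed to share in advance was the convention itself.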

> I mean, if they could agree to use D's definition, why can't
> they agree to use A's definition instead? Or B's? 

In principle they certainly could, but that would add complexity,
because then they would need some way to decide *which* definition to
use.  

> 
> Are there cases where D has to stay current in the conversation,
> and trusted to maintain the "definitions" that D originally might
> have made available?

Definitely.  If D initially publishes a URI definition, and that URI is
used in RDF statements (based on that definition), and D later changes
the definition arbitrarily or deletes it, then other parties that later
look up the definition will fail to get the correct definition.  This
may lead to what I've been calling 'community expropriation' of the URI:
http://dbooth.org/2009/lifecycle/#expropriation
In any case, it means that the URI owner has not been friendly to users
of that URI, and RDF authors will be less likely to use that owner's
URIs in the future.  This is a case of the marketplace selecting the
higher quality URIs.

Techniques like using a crypto-hash of the definition in the URI itself
can help guard against the definition changing.  And since a crypto-hash
is effectively unique, the hash could be used to search for the
associated definition even if the URI can no longer be dereferenced.
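
As a minimal sketch of the hash idea (in Python, with a URI layout
that I am making up for illustration: the base URI, the function names
and the choice of SHA-256 as the last path segment are all assumptions,
not part of any convention):

```python
import hashlib


def definition_digest(definition: str) -> str:
    """Hex SHA-256 digest of the definition text."""
    return hashlib.sha256(definition.encode("utf-8")).hexdigest()


def mint_uri(base: str, definition: str) -> str:
    """Build a URI whose last path segment is the digest of its definition."""
    return f"{base.rstrip('/')}/{definition_digest(definition)}"


def definition_matches(uri: str, definition: str) -> bool:
    """Check whether a retrieved definition is the one the URI was minted for."""
    return uri.rsplit("/", 1)[-1] == definition_digest(definition)


# Hypothetical base URI and definition text.
original = "slithy toves: something like a slimy toad but scarier"
uri = mint_uri("http://example.org/def/", original)

unchanged = definition_matches(uri, original)
tampered = definition_matches(uri, original + " (revised)")
```

Anyone holding the URI can recompute the digest of whatever definition
they find, whether from the owner or from a search, and see whether it
is the text the URI was originally minted for.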

> 
> >> If A says "slithy toves" to C and B uses the same term, and C wants
> >> further clarification of what A or B might have meant, the only
> >> authorities to ask are A and B. 
> 
> > I agree.  That use case is way beyond what a convention like the Uri
> > Definition Discovery Protocol (UDDP)
> > http://www.w3.org/wiki/UriDefinitionDiscoveryProtocol 
> > attempts to address.
> 
> I'm astounded, I gave what I thought was the simplest use case
> of communication using the semantic web. You have to have
> two senders and one receiver for there to be any ambiguity.
> 
> I don't see any use cases at all in 
> http://www.w3.org/wiki/UriDefinitionDiscoveryProtocol
> so it's hard for me to understand what problem you think you are
> solving with it.

I'm (slowly) working on a new version of that document, and I've been
wondering whether to include additional information such as use cases.
So far, I've been thinking that it would be best to keep it short and to
the point, and leave longer explanations of rationale and use cases to
other documents.

The basic use case is pretty simple.  Here's one that I just drafted:
[[
Owen is a URI owner who has published a URI definition for a URI
that (according to his URI definition) identifies the Eiffel
Tower.  Owen does not know who might use his URI definition,
but he wants it to be useful to others who wish to make RDF
statements about the Eiffel Tower.

Arthur and Aster are RDF statement authors.  Arthur publishes
RDF data about tall buildings, including the Eiffel Tower.
Aster publishes RDF data on the number of tourists who
visit famous landmarks each year, including the Eiffel Tower.
Arthur and Aster work completely independently and know nothing
of each other's work.  Nonetheless, they wish when possible
to use the same URI definitions for the URIs that they use,
so that other parties (such as Connie) can more easily merge
the RDF data that they publish.

Connie is an RDF statement consumer who discovers Arthur and
Aster's RDF data and wants her application to merge that data
to show both the height of the Eiffel Tower and the number
of tourists who visit it.  Connie's application should also
obtain the URI definitions for the URIs that it uses for the
Eiffel Tower, so that Connie can verify that her application
is displaying information on the correct notion of the Eiffel
Tower -- the tower itself, not the metro stop.

To satisfy this use case, the parties use a standard convention
-- the URI Definition Discovery Protocol (UDDP) -- as follows.

Owen follows the UDDP convention in allocating his URI for
the Eiffel Tower and publishing his URI definition for it,
so that people or automated agents that wish to locate his
URI definition can do so by dereferencing the URI.

Arthur and Aster independently discover Owen's URI for the
Eiffel Tower (perhaps through a search engine) and download
Owen's URI definition by dereferencing the URI.  They then
independently write their RDF statements using the same URI
definition for Owen's Eiffel Tower URI -- having no knowledge
of each other -- and publish their data.

Connie discovers Arthur and Aster's RDF data sets and uses her
application to merge them.  Because these data sets are written
using the same URI definition, they are easier to merge, and
merge without conflict.  Connie's application is also able
to easily locate and download the URI definition for Owen's
Eiffel Tower URI by dereferencing the URI.
]]
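
The merge step in that use case can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions: triples
are modelled as plain tuples rather than with an RDF library, every URI
below is hypothetical, and the literal values are placeholders rather
than real figures.

```python
# Hypothetical URI that Owen minted for the Eiffel Tower.
EIFFEL = "http://owen.example/landmarks#eiffel-tower"

# Hypothetical property URIs chosen independently by each author.
HEIGHT = "http://arthur.example/ns#height"
VISITORS = "http://aster.example/ns#visitorsPerYear"

# Each data set is a set of (subject, predicate, object) triples.
arthur_data = {(EIFFEL, HEIGHT, "<height placeholder>")}
aster_data = {(EIFFEL, VISITORS, "<visitor count placeholder>")}


def merge(*graphs):
    """Merge data sets by taking the union of their triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Collect every property/value pair stated about one subject."""
    return {p: o for s, p, o in graph if s == subject}


# Connie's application: merge, then look up everything about the tower.
connie_view = describe(merge(arthur_data, aster_data), EIFFEL)
```

Because both authors used the same URI as the subject, Connie's lookup
finds both facts with no mapping step in between.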

Does that help?  


-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Saturday, 19 May 2012 02:31:45 UTC