Re: semantic web, communication, etc. from David Booth on 2012-05-25 (www-archive@w3.org from May 2012)

From: David Booth <david@dbooth.org>
Date: Thu, 24 May 2012 22:28:41 -0400
To: Larry Masinter <masinter@adobe.com>
Cc: Jonathan A Rees <rees@mumble.net>, www-archive <www-archive@w3.org>, "Henry Story (henry.story@bblfish.net)" <henry.story@bblfish.net>
Message-ID: <1337912921.2090.8009.camel@dbooth-laptop>

Hi Larry,

On Thu, 2012-05-24 at 14:15 -0700, Larry Masinter wrote:
> Reducing the distribution list for now....
> 
> I think 
> 
> http://lists.w3.org/Archives/Public/www-archive/2012May/0035.html
> 
> makes clearer an important issue:  Are we talking about logic or are
> we talking about communication?
> 
> I think I need to insist that if you are going to discuss "UDDP", that
> the "P" is significant: it is a protocol first, a computational method
> perhaps, but the results cannot be treated as "logic", because
> associations are generally _not_ persistent, uniform over time. 
> 
> You can't switch back and forth from a communication model to a logic
> model and back again within the same analysis.

We're talking about web architecture and protocols.  As far as I'm
concerned, a logic model is irrelevant.  The architecture *does* need to
have the appropriate mechanisms (protocols, etc.) to support the
semantic web, but there is no need for it to involve a logic model, even
if logic happens to be used to explain some of the technologies that the
semantic web uses.  After all, if a finite state machine were used to
explain the XSLT language that would not automatically mean that we
should try to explain the web architecture in terms of a finite state
machine.
> 
> It isn't "misleading" to characterize what's going on as
> "communication", in fact it is essential, if you're going to have a
> protocol somewhere in the middle.

Certain parts of the key use case certainly do involve one-way
communication.  But it's important to keep in mind that communication is
*not* the main point of the central use case.  The main point is the
ability to sensibly *merge* independently created data.  I am just
concerned that you may lose sight of the central use case if you are
focused on the communication aspect.
> 
> The fact that communication sometimes fails isn't "something bad" --
> it's essential to modeling a communication protocol to understand how
> communication fails, and the fact that sometimes people publish things
> that are useless or total garbage is an important factor in creating a
> model of the semantic web.

So in this central use case we have five communications happening, all
of them basically one-way:

  - Communication 1: Arthur (an RDF author) dereferences the URI owned by
Owen (the URI owner) to find Owen's URI definition.  Based on that
communication, Arthur writes and publishes some RDF.

  - Communication 2: Aster (another RDF author) dereferences the same URI
to find Owen's URI definition.  Based on that communication, Aster
writes and publishes some RDF.

  - Communication 3: Connie (an RDF consumer) downloads Arthur's RDF.

  - Communication 4: Connie also downloads Aster's RDF.

  - Communication 5: Connie dereferences the URI owned by Owen (the URI
owner) to find Owen's URI definition.

Communications #3 and #4 are routine web use.  There is nothing special
about them except perhaps that the Content-Type happens to be some kind
of RDF, so I don't think it's worth analyzing what happens if those
communications fail.

The UDDP convention is involved in communications #1, #2 and #5.  It is
this protocol that allows Arthur, Aster and Connie to all efficiently
obtain the same URI definitions even though they are acting completely
independently.  The overall goal is that these three parties will
succeed in obtaining the *same* URI definitions, or at least URI
definitions that are similar enough that when Arthur's and Aster's data
are merged the result is still useful to Connie.
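
To make the five communications concrete, here is a minimal Python
sketch.  Everything in it is an assumption for illustration only: the
example.org URI, the in-memory WEB dictionary standing in for HTTP
dereferencing, the dereference() and publish_rdf() helpers, and the
tuple representation of triples.  It is not a specification of UDDP,
just one way to picture the use case.

```python
# Hypothetical in-memory "web": URI -> document.  Stands in for HTTP GET.
OWEN_URI = "http://example.org/owen/widget"  # assumed URI, not from this thread

WEB = {OWEN_URI: "A widget is a manufactured part with a serial number."}


def dereference(uri):
    """Return the URI definition currently published at `uri` (hypothetical helper)."""
    return WEB[uri]


def publish_rdf(author, definition_seen, triples):
    """An author publishes triples, noting which definition they relied on."""
    return {"author": author, "definition": definition_seen, "triples": set(triples)}


# Communications 1 and 2: Arthur and Aster independently dereference Owen's URI.
arthur = publish_rdf("Arthur", dereference(OWEN_URI),
                     [("ex:item1", "rdf:type", OWEN_URI)])
aster = publish_rdf("Aster", dereference(OWEN_URI),
                    [("ex:item2", "rdf:type", OWEN_URI)])

# Communications 3 and 4: Connie downloads both RDF documents.
downloads = [arthur, aster]

# Communication 5: Connie dereferences Owen's URI herself.
connie_definition = dereference(OWEN_URI)

# The merge: for triples without blank nodes, merging is a set union.
merged = set().union(*(doc["triples"] for doc in downloads))

# Simplification: "same definition" is modeled as exact string equality.
same_definition = all(doc["definition"] == connie_definition for doc in downloads)

print(len(merged), same_definition)
```

The point of the sketch is that nothing coordinates Arthur, Aster and
Connie except the shared convention for looking up Owen's definition.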

There are several ways these communications can fail.  Perhaps the most
obvious is that Owen may change his URI definition in between the times
that Arthur, Aster and Connie download it.  But viewed from Connie's
perspective, it doesn't really matter how any of these communications
might fail, because the net effect for Connie is the same as if Arthur,
Aster and/or Owen had published bad or useless data.
Connie may be disappointed that one or more of these other parties had
published garbage (or so it appeared to Connie), but that's okay,
because that's the way the web is.  People are allowed to publish
garbage, and it's up to the marketplace to ignore the garbage and laud
the good stuff.  The architecture all works just fine, whether it's
garbage or good stuff.
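
The failure case above, where Owen changes his definition between
downloads, can be sketched the same way.  This is only an illustration
under stated assumptions: the URI, the two definition strings and the
usable_by() check are all made up, and exact equality stands in for
"similar enough".  In practice Connie cannot see which definition
Arthur read; the comparison here only stands in for "Arthur's data
looks wrong to Connie".

```python
# Hypothetical in-memory "web"; the URI and definitions are invented.
OWEN_URI = "http://example.org/owen/widget"
web = {OWEN_URI: "definition v1"}


def dereference(uri):
    """Return whatever definition is published at `uri` right now."""
    return web[uri]


arthur_saw = dereference(OWEN_URI)   # communication 1
web[OWEN_URI] = "definition v2"      # Owen changes his URI definition
aster_saw = dereference(OWEN_URI)    # communication 2
connie_saw = dereference(OWEN_URI)   # communication 5


def usable_by(consumer_definition, author_definition):
    """Stand-in for 'this author's data makes sense to the consumer'."""
    return consumer_definition == author_definition


# From Connie's side, Arthur's data is indistinguishable from bad data;
# she does not need to know *why* the communication failed.
results = {"Arthur": usable_by(connie_saw, arthur_saw),
           "Aster": usable_by(connie_saw, aster_saw)}
print(results)
```

Connie simply sets Arthur's data aside, exactly as she would if Arthur
had published junk in the first place.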

> The essential invention of the web was not "hyperlinks" (which we had
> for a decade before the web), but rather "404 not found" -- the idea
> that different authors could publish material that linked to the work
> of other authors, without coordination, and that failure was an
> anticipated and managed process.

Yes.  Plus the URI: a single naming space that spanned all of the
popular protocols.
> 
> I think the semantic web also needs a model that accounts for
> robustness.

Right, the central architectural approach to that robustness is to rely
on the marketplace to reward the good and ignore the bad.  In the
semantic web this translates into rewarding URI owners and RDF authors
who follow the UDDP conventions and publish stable, quality URI
definitions and data, and ignoring the noise introduced by those who
either do not follow the protocol or who publish junk.

Key take-aways:

  - Standardizing the UDDP convention -- whether de facto or de jure --
is central to making this work.

  - It does not have to work all the time to be useful.

  - The marketplace will help sort out those who play nicely (by
following these conventions and publishing quality data) from those who
don't.

Does this all make sense to you, or have I missed what you were getting
at?

David
> 
> (David also made some other notes in his reply to me that "the
> protocol is not determined by the language of the message" which I'll
> reply to independently.)
> 
> Larry
> 

-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Friday, 25 May 2012 02:29:13 UTC