RE: Yorick Wilks on Semantic Web & httpRange-14 from David Booth on 2012-05-20 (www-archive@w3.org from May 2012)

From: David Booth <david@dbooth.org>
Date: Sun, 20 May 2012 10:19:25 -0400
To: Larry Masinter <masinter@adobe.com>
Cc: Henry Story <henry.story@bblfish.net>, www-archive <www-archive@w3.org>
Message-ID: <1337523565.2232.95981.camel@dbooth-laptop>

Hi Larry,

There are a few problems with the way you've described it.  I'll address
specifics inline below, but in a general sense, one of the problems is
that you've described it as a communication system, and that's a bit
misleading -- not exactly wrong factually, but it conveys the wrong
intent.  

First of all, it is not two-way communication.  The communications that
do occur are all one-way.  Second, the point is not the communication,
it is the ability to *combine* the independently created data because
that data was based on the same definitions.

Also, with "communication" usually something bad will happen if that
communication is misunderstood.  But that isn't what's going on in this
case.  On the web people publish information all the time that may be
useful or useless, high quality or total garbage.  If someone
doesn't like data that they find somewhere on the web, they'll ignore it
(and maybe post a blog entry about what garbage that site produced).
This is a very different expectation than one normally associates with
"communication".  So again, I think it's a bit misleading to
characterize it as communication.

On Sat, 2012-05-19 at 12:57 -0700, Larry Masinter wrote:
> OK, let me try this out as a gloss -- does this work for you?
> 
> We've devised a communication system, call it "SW14" ("Semantic Web
> following httprange-14").
> 
> In this system, there is a computational mapping called UDDP which,
> given a URI x, UDDP(x) discovers the owner of x, and asks the owner of
> x the definition of x, and returns that definition.

Yes, I think that's close enough.  UDDP(x) asks the owner's server for
the definition of x, and -- per the UDDP conventions -- the owner's
server is presumed to speak on behalf of the owner.
> 
> A and B are communicating with C. A decides to use URI X, because A
> likes the definition UDDP(X).  

Okay.  I'm assuming that A and B are RDF statement authors and C is an
RDF statement consumer who wishes to combine the statements from A and
B.
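
To make the "combine" point concrete, here is a minimal sketch of what C
does with statements from A and B.  It assumes triples are plain
(subject, predicate, object) tuples; every URI and function name in it is
made up for illustration and is not part of UDDP or any RDF library.

```python
# Illustrative only: triples are plain (subject, predicate, object) tuples,
# and every URI below is a hypothetical example.
X = "http://example.org/terms#X"

# A and B publish independently, but both use the same URI X.
statements_from_a = {
    ("http://example.org/a/item1", "http://example.org/terms#relatedTo", X),
}
statements_from_b = {
    ("http://example.org/b/item7", "http://example.org/terms#relatedTo", X),
}


def combine(*graphs):
    """Merge independently published sets of triples into one set."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def statements_about(graph, uri):
    """Return the triples that mention `uri` as subject or object."""
    return {t for t in graph if uri in (t[0], t[2])}


# C combines both sources and looks up everything said about X.
merged = combine(statements_from_a, statements_from_b)
about_x = statements_about(merged, X)
```

The merge itself is trivial.  It is only meaningful because A and B both
took X to mean what its definition says.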

> B decides to use URI X, and you want B to also agree that UDDP(X) is
> what B wants to say.

Whether or not B follows the UDDP convention is B's choice.  But if the
UDDP convention becomes a W3C Recommendation (for example) and becomes
well established in the semantic web community, then there will be
social pressure to follow it.  If B does not follow it, then B will be
viewed as providing lower quality data -- because it does not play well
with others -- and the marketplace will make B's data less popular. 

> A sends message M to C, where M uses X as a term.
> B sends message N to C, where N uses X also.
> 
> Now C can use UDDP(X) to discover what A and B meant by X, if C
> cares.  

Yes.

> Lots of the time C doesn't care about any particular terms, so it's
> not reasonable to include UDDP(X) in the messages M or N, and the
> definitions might even be large and complicated.

I'm not following you here.  If C didn't care about the terms in M and
N, then why would C have downloaded them?   Presumably C downloaded M
and N because that data used terms that C cared about.  Can you clarify?
> 
> Now, some people might think this system is overly complicated, there
> are disagreements about what UDDP should be. Currently SW14  uses
> UDDP14, which says that either X has a fragment identifier, and
> UDDP14(X) works by getting X without the fragment and then looking in
> the 200 response, or else X has no fragment identifier and there's a
> 303 response which is retrieved to get UDDP14(X).
> 
> There's some arguments about changing SW14, and making a new
> convention, and that's what all these choices are about.
> 
> Some groups are using a different system (SW15, SW16, ...) with
> different conventions, usually with a different proposals for UDDP.
> 
> And you're hoping the TAG will resolve the differences between these
> groups and lead them all to the one true UDDP.

Not exactly one *true* UDDP, but one convention, yes.  Because there is
increased value in having many people using the *same* convention, due
to the network effect.  
http://en.wikipedia.org/wiki/Network_effect  
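
For reference, the UDDP14 rule as Larry paraphrases it above can be
sketched roughly as follows.  This only decides which URL to request and
which HTTP status to expect; it makes no network call, and the function
name uddp14_plan is hypothetical, not something defined by UDDP.

```python
from urllib.parse import urldefrag


def uddp14_plan(uri):
    """Return (url_to_get, expected_status) for a URI.

    Sketch of the rule as paraphrased in this thread, not a normative
    statement of it:
      - URI with a fragment: GET the URI without the fragment and look
        for the definition in the 200 response.
      - URI without a fragment: expect a 303 response, which is then
        followed to reach the definition.
    """
    url, fragment = urldefrag(uri)
    if fragment:
        return url, 200
    return url, 303


# Hypothetical example URIs.
hash_plan = uddp14_plan("http://example.org/vocab#X")
slash_plan = uddp14_plan("http://example.org/vocab/X")
```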

> But in fact, web architecture doesn't really define this, 

Right, this has not been clearly standardized yet by the W3C.  That's
why a standard like UDDP is needed.

> doesn't restrict the languages that can be used. The languages used in
> M and N and the contexts for X within M and N determine the UDDP to
> apply to X to get its definition.  

No, that is inside out.  The protocol is not determined from the
language of the message.  By following the protocol, you can determine
the language of the message.  It's like the HTTP protocol: you *first*
follow the HTTP protocol.  Once you have done that, you can figure out
the language of the message by looking at the Content-Type header.  You
don't determine the protocol from the Content-Type of the message.
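
A small sketch of that ordering, assuming the protocol step has already
produced response headers as a plain dict keyed by "Content-Type".  The
parser table and its labels are invented for illustration.

```python
def media_type(headers):
    """Extract the bare media type from a Content-Type header value."""
    value = headers.get("Content-Type", "")
    # Drop parameters such as "; charset=utf-8" and normalise case.
    return value.split(";", 1)[0].strip().lower()


# Hypothetical mapping from media type to a parser label.
PARSERS = {
    "text/turtle": "turtle",
    "application/rdf+xml": "rdfxml",
}


def choose_language(headers):
    """Pick the message language only after the protocol has run."""
    return PARSERS.get(media_type(headers), "unknown")


# The headers come from the response, that is, from following the protocol.
language = choose_language({"Content-Type": "text/turtle; charset=utf-8"})
```

Nothing in this sketch chooses the protocol.  The headers exist only
because the protocol was followed first.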

> There are SW14 languages which use UDDP14, but other languages which
> use some other protocol or method. 

No, that's inverted, as explained above.
> 
> There is no "true" UDDP, nothing to "discover" here.  

UDDP is merely a convention.  To a first approximation the details of
that convention do not matter, as long as *something* becomes a common
convention, so that the benefits of the network effect can be obtained.
But looking beyond a first approximation, if we are evaluating different
candidate conventions, some will have better properties than others, so
it's a good idea to figure out what convention would be best before
adopting it as a W3C standard.

> If C gets two messages, M and N, both using URI X, C can only
> reasonably assume that X is used the same way  by A and B if the
> languages of their messages M and N agree on the UDDP convention.  

No, again that's inverted.
> 
> "httpRange-14" as a TAG finding fails because it doesn't establish the
> scope, and doesn't acknowledge that there are many legitimate
> languages in use where the meaning of using a URI within them doesn't
> follow UDDP14.

It sounds like you are pointing out that: (a) a URI can mean different
things in different languages; and (b) not everyone will follow UDDP14.
Regarding (a), it's totally fine if a URI means different things in
different languages, provided that one knows what language is used, and
that's the purpose of mechanisms like the Content-Type header, XML
namespaces, etc.  Regarding (b), it's okay (though not ideal) if not
everyone follows UDDP14, because the marketplace will reward those RDF
statement authors who play well together (by following a common UDDP14
convention) and shun those who don't.  This is conceptually no different
than some publishers producing higher-quality, more valuable data than
others.  The web architecture is designed to withstand this kind of
diversity.

There *is* an issue with UDDP in that it does not (currently) use an
explicit protocol indicator, as discussed in section 4.3:
http://www.w3.org/wiki/UriDefinitionDiscoveryProtocol#4.3_Incompatible_URI_definition_discovery_protocols
Hence, it is possible that an RDF statement consumer could wrongly
assume that an RDF author had used the UDDP convention when in fact the
author did not, and this may cause the statement consumer to
misunderstand that author's data.  However, this is not a major problem,
because the net effect of this misunderstanding is the same as if the
author had published garbage data.  And again, the marketplace will
reward those who follow the protocol and publish good data and shun
those who don't.


-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Sunday, 20 May 2012 14:19:59 UTC