Re: Semantic web architectural requirement [was Re: Squaring the HTTP-range-14 circle]

On Tue, 2011-06-21 at 17:52 -0400, Alan Ruttenberg wrote:
> On Tue, Jun 21, 2011 at 5:06 PM, David Booth <david@dbooth.org> wrote:
> > [Moving this comment to the AWWSW list, as I think it will be more
> > appropriate there.]
> > Following up on:
> > http://lists.w3.org/Archives/Public/public-lod/2011Jun/0362.html
> >
> > On Sat, 2011-06-18 at 23:05 -0500, Pat Hayes wrote:
> >> Really (sorry to keep raining on the parade, but) it is not as simple
> >> as this. Look, it is indeed easy to not bother distinguishing male
> >> from female dogs. One simply talks of dogs without mentioning gender,
> >> and there is a lot that can be said about dogs without getting into
> >> that second topic. But confusing web pages, or documents more
> >> generally, with the things the documents are about, now that does
> >> matter a lot more, simply because it is virtually impossible to say
> >> *anything* about documents-or-things without immediately being clear
> >> which of them - documents or things - one is talking about. And there
> >> is a good reason why this particular confusion is so destructive.
> >> Unlike the dogs-vs-bitches case, the difference between the document
> >> and its topic, the thing, is that one is ABOUT the other. This is not
> >> simply a matter of ignoring some potentially relevant information (the
> >> gender of the dog) because one is temporarily not concerned with it:
> >> it is two different ways of using the very names that are the fabric
> >> of the descriptive representations themselves. It confuses language
> >> with language use, confuses language with meta-language. It is like
> >> saying giraffe has seven letters rather than "giraffe" has seven
> >> letters. Maybe this does not break Web architecture, but it certainly
> >> breaks **semantic** architecture.
> >
> > I don't think that's correct.  AFAICT what's important for the semantic
> > web from an architectural perspective is the following:
> >
> >  The client must be able to use a simple, architecturally
> >  authoritative algorithm to determine, with full fidelity,
> >  the URI owner's formally expressed identity for the resource.
> >
> > To pick this apart and explain what I mean:
> >
> > Why "simple"?  To facilitate widespread uptake.
> >
> > Why "architecturally authoritative"?  So that everyone knows how the
> > architecture is supposed to work.  This is like having an authoritative
> > specification for HTTP: you don't want different people having different
> > ideas about how HTTP is supposed to work.
> >
> > Why "algorithm"?  So that it can be done by a machine.
> >
> 
> Below is where your error is. The fact of the matter is that
> 
> a) current representation languages do not allow us to say all we mean

Agreed, but we can still say *enough* to make useful applications.

> b) most people aren't even skilled enough to say what can be said
> using these languages

Okay, but I don't see how that is relevant.  There is no requirement
that everyone be able to write quality RDF.

> c) even among the people who are skilled enough with the formalism,
> there remain difficult ontological issues that need to be addressed in
> order to effectively communicate formally.

What issues do you mean?

> 
> So there can not be "full fidelity" (in the usual sense of the word)
> without human intermediation somewhere in the system, for a number of
> reasons.

I explained what I meant by "full fidelity", and I clearly excluded
semantics that require human intervention, because the goal is fully
automated machine processing.  Think of "full fidelity" as "having the
same formal entailments".  If the RDF, OWL, etc. specifications cannot
assure us of having the same entailments then they need to be sent back
to their working groups and corrected.
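
As a rough illustration of "same formal entailments" (a sketch only, not
something taken from the RDF or OWL specifications): the Python below
models a graph as a set of triples and applies just two RDFS-style rules,
subclass transitivity and type propagation.  The function names and the
ex: identifiers are made up for the example.

```python
# Hypothetical sketch: "full fidelity" read as "same formal entailments".
# Only two RDFS-style rules are modeled; a real reasoner covers far more.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def closure(triples):
    """Return the triples plus everything the two rules below entail."""
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            for (s2, p2, o2) in inferred:
                # Only chain through subclass axioms whose subject is `o`.
                if o != s2 or p2 != SUBCLASS:
                    continue
                if p == SUBCLASS:
                    new.add((s, SUBCLASS, o2))  # subclass transitivity
                elif p == TYPE:
                    new.add((s, TYPE, o2))      # type propagation
        if new <= inferred:
            return inferred
        inferred |= new


def same_entailments(published, received):
    """True when both graphs entail exactly the same triples."""
    return closure(published) == closure(received)


published = {
    ("ex:rex", TYPE, "ex:Dog"),
    ("ex:Dog", SUBCLASS, "ex:Animal"),
}
# Same graph plus a triple that was already entailed: nothing gained or lost.
received_ok = published | {("ex:rex", TYPE, "ex:Animal")}
# A dropped axiom loses an entailment, so fidelity is lost.
received_lossy = {("ex:rex", TYPE, "ex:Dog")}

print(same_entailments(published, received_ok))
print(same_entailments(published, received_lossy))
```

The comparison is on the closures, not on the raw triples, so two graphs
that are written differently can still count as the same for this purpose.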

> 
> Suggesting what you suggest below therefore
> 
> a) Encourages miscommunication, because full fidelity communication
> isn't feasible and you encourage people to say anything consistent
> with the published assertions *even if we all know it doesn't make
> sense*.

"Making sense" is irrelevant if it is useful to applications.   Modeling
the world as flat doesn't "make sense" in that the real world clearly is
not flat, but still such data can be useful for *some* applications, and
it may even be *better* for some applications than data that more
accurately describes the world, because it is simpler.

> b) Discourages careful thinking, by suggesting that whatever people
> write makes sense and is adequate and sensible

I have never made that suggestion.  Where did you get that?

> c) Dismisses one of our most powerful tools - our ability to
> interpret, relate what is said to what is known, fix things, and
> evolve our representations accordingly.

By "representations" I assume you mean URI declarations or definitions.
I am not exactly dismissing our ability to interpret and relate what is
said to what is known.  A human or a sophisticated application may still
do those things if it chooses.  But I *am* dismissing it from semantic
web architecture, because semantic web architecture is about enabling
*machine* processing, and global, lossless communication would not be
achieved if it required human interpretation or being related to "what
is known", because those things are subjective.

> d) Along with your URI owner advice "URI owner responsibility 1: When
> minting a URI, the URI owner (or delegate) SHOULD publish a URI
> declaration [Booth2007] at the follow-your-nose (f-y-n) location,
> containing core assertions whose purpose is to constrain the set of
> permissible interpretations [Hayes 2004] for this URI. These core
> assertions SHOULD NOT be changed after their publication." condemns us
> to continue to use the inadequate representations forever.

No, that is a "SHOULD NOT", not a "MUST NOT".  If you *choose* to have
an unstable URI declaration that's fine as long as you publish your
change policy so that RDF authors can decide whether they wish to use
your URIs or not.  Change policy is covered in the fourth bullet of this
section:
http://dbooth.org/2009/lifecycle/#other

> 
> All you say about formal statements should be recast as tutorial about
> how a machine *will* interpret the assertions. The lessons learned
> should be that what you say will affect what a reasoner concludes and
> therefore might affect some system you care about.

No, this is not just advice to application developers.  It is system
design.  It is about designing the semantic web as a whole to have
certain desirable properties.  Different architectural design choices
result in different properties of the system.  

> But the lesson
> should not be that this elevates these to where they are considered
> truth. The assertions you make are in the service of truth, not the
> other way around.

I am not advocating that anything be considered truth.  Truth is
irrelevant here.  This is about system design -- not philosophy.  

David

> 
> -Alan
> 
> > What do I mean by "full fidelity"?  If both the publisher and the client
> > follow the architecture and applicable standards then the client will
> > interpret the publisher's statements with the *same* formal semantics
> > that the publisher intended.  However, this does not -- and cannot --
> > extend beyond what is expressed in the machine-processable portion of
> > the statements.  It includes only what is expressed *formally* -- in
> > machine processable statements such as RDF or protocol codes.  It does
> > *not* include the human-oriented semantics of some natural language
> > prose embedded in an rdfs:comment.  Note also that "full fidelity" does
> > *not* mean that the referent of a URI can be uniquely determined.
> > Rather, it means that its identity is constrained with the same
> > constraints -- neither more nor fewer.
> >
> > Why the "URI owner"?  Because this provides a deterministic chain of
> > authority.  From AWWW:
> > http://www.w3.org/TR/webarch/#uri-ownership
> > [[
> > URI ownership is a relation between a URI and a social entity, such as a
> > person, organization, or specification. URI ownership gives the relevant
> > social entity certain rights, including:
> >   1. to pass on ownership of some or all owned URIs to another owner—
> > delegation; and
> >   2. to associate a resource with an owned URI—URI allocation.
> > ]]
> >
> > Why "expressed"?  Because we cannot access the intent that the publisher
> > has in his/her head.  We can only use what the publisher actually
> > expressed.
> >
> > Why "*formally* expressed"?  Two reasons: (a) the point is to enable
> > automated machine processing, and machines are not so good at things
> > like natural language processing; and (b) to enable lossless
> > communication.
> >
> >
> > The reason this is important architecturally is that it enables global,
> > lossless communication by machine.  However, this does not *obligate*
> > the publisher to be unambiguous if the publisher chooses to be
> > ambiguous.  (And as we both know, it is *impossible* for the publisher
> > to remove all possible ambiguity anyway: ambiguity is in the eyes of the
> > consuming application.)  Furthermore, it does not obligate the client to
> > compute the publisher's expressed resource identity.  OTOH, the client
> > must not claim that it has if it hasn't.
> >
> > Notice that this architectural requirement does *not* imply that
> > publishers must distinguish documents from dogs.  This is why the class
> > of "information resource" does not need to be disjoint with the class of
> > dogs or people.  But it *does* imply that publishers be *able* to
> > distinguish documents from dogs (or male dogs from female dogs, etc.) if
> > they *choose* to do so in communicating to their clients.  I.e., if the
> > publisher chooses to make this distinction, it is important that the
> > client be able to determine that the distinction was made.  This is why
> > the httpRange-14 rule about 303 is important.
> >
> >
> > --
> > David Booth, Ph.D.
> > http://dbooth.org/
> >
> > Opinions expressed herein are those of the author and do not necessarily
> > reflect those of his employer.
> >
> >
> >
> 
> 
> 

-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.

Received on Wednesday, 22 June 2011 01:07:12 UTC