Re: Comments on "Interoperability of referential uses of hashless URIs"

On Mon, Oct 24, 2011 at 6:15 PM, David Booth <david@dbooth.org> wrote:
> On Mon, 2011-10-24 at 11:06 -0400, Jonathan Rees wrote:
>> This 'ambiguity' thing is a red herring.
>
> I disagree.  As I explain below, I think it is central to a proper
> understanding of the problem and the appropriate solutions.
>
>> I know Harry and Pat
>> introduced it to make a particular point about fallacies in the theory
>> of web architecture, and they were right. But that doesn't mean you
>> can use it as a sledgehammer to attack any communication mechanism you
>> don't like.
>
> I'm not doing that at all.  I am an ardent supporter of the web, web
> architecture, and semantic web technology.  I'm pointing out that the
> problem is being subtly misframed, and this may lead to wrong
> conclusions about how it should be solved.
>
>>
>> The point is, there's a job to be done, and communication is necessary
>> in order to accomplish it. You need to communicate enough information
>> to do what's needed, and you need for the receiver to understand the
>> sender.
>
> Right, *within* the class of applications that the sender is attempting
> to support.

This has nothing to do with applications, except inasmuch as senders
and receivers are running them, which is an implementation detail that
is best ignored. The meaning of a message comes from prior agreement
between sender and receiver. In a standards-based world that agreement
is articulated in standards, although perfectly useful communication
happens when agreement is reached in other ways.

If you insist on talking about applications, I might say that the
application is the Web, but I don't think that would be useful.

> The sender *cannot* support all possible applications.

Straw man.

> In
> other words, there isn't *a* job to be done -- there are *many*
> different jobs to be done.  The architecture needs to support the
> ability for all those jobs to be done -- not the ability for *one*
> sender's data to be understood by *all* receiver applications.

Straw man.  If the meaning of a document is determined by the SVG
specification, then senders and receivers have all the agreement they
need to do a wide variety of things (many jobs) with SVG documents.
The meaning comes from the spec.

In the 'referential use' memo, agents agreeing to S2 can do many things
having made that agreement, and similarly D2. That's the whole point
of writing these things down.

>> If not enough information is sent for some job, send more. If
>> information isn't needed, you don't have to send it. If the receiver
>> doesn't understand what the sender is saying, you need coordination,
>> e.g. prior discussion, or a standard, or something.
>>
>> Roughly speaking, ambiguity is the reciprocal of information.
>> Obviously there is no such thing as "completely unambiguous" - that
>> would be the same as "total information" which is ridiculous.
>
> I disagree.  That would be true if we were talking about some universal
> quality of ambiguity/unambiguity.

Are you saying there are many ambiguity qualities? I agree that a
message might be considered unambiguous one day, and ambiguous the
next, as a result of the receiving agent's need for information
increasing from one day to the next. Given agreement on meaning,
ambiguity is relative to the receiver's needs.

> But if we ask about a particular
> application class, then it certainly *is* possible to be "completely
> unambiguous".  And by necessity we *must* restrict our scope to
> particular application classes, because it is not possible for a URI
> owner to consistently and unambiguously address all applications.
> Ambiguity/unambiguity are *relative* to an application class.

Ambiguity is a function of the message, its meaning (i.e. the spec for
the language it's written in), and your need for information. Two
agents trying to extract the same information from a message that
doesn't provide it are both going to experience the message as
ambiguous, assuming they've understood it in the same way (authorized
by spec presumably). So I agree, it's relative, but it's not relative
to "application class".
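
As a rough illustration of that three-way dependence, here is a minimal
Python sketch. It assumes a toy model in which the agreed spec maps each
message to the set of facts it conveys, and a receiver's need is just a
set of facts. The names TOY_SPEC, MESSAGE, missing_information, and
is_ambiguous are hypothetical and do not come from the memo.

```python
# Toy model (hypothetical): the agreed spec maps each message to the set
# of facts it conveys; a receiver's need is the set of facts it requires.

def missing_information(spec, message, needed):
    """Return the needed facts that the agreed meaning of `message` does not convey."""
    conveyed = spec.get(message, frozenset())
    return set(needed) - set(conveyed)

def is_ambiguous(spec, message, needed):
    """In this sketch, ambiguous means: some needed fact is not conveyed."""
    return bool(missing_information(spec, message, needed))

# One message, one agreed meaning (both invented for illustration).
MESSAGE = "M"
TOY_SPEC = {MESSAGE: {"address"}}

# Same message and same agreed meaning, but different needs.
print(is_ambiguous(TOY_SPEC, MESSAGE, {"address"}))          # False
print(is_ambiguous(TOY_SPEC, MESSAGE, {"address", "age"}))   # True
```

Nothing in the function refers to which application is asking: any two
receivers with the same need get the same answer.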

>> Any
>> message has the potential to be incomplete, or not understood, in some
>> context. That is, a communication protocol might work perfectly fine
>> for a while, and then two communicating parties might say: Hey, this
>> won't do, we need more clarity / information about this. This is just an
>> inadequacy that should be fixed somehow. We don't say: Oh no, totally
>> unambiguous communication is impossible, so we have to just give up
>> all hope of communicating, live with what we have, drop the idea of
>> correctness and accountability, and so on.
>
> I'm not saying we should give up on communicating -- quite the opposite
> -- but we *do* need to drop the idea of *universal* correctness, since
> correctness is *relative* to the application class.

Straw man. I'm not talking about universal correctness, I'm talking
about the kind of correctness an engineer means when saying that an
agent is using the SVG spec correctly. Correctness is relative to a
specification, or some other kind of standard or norm.

>  An example that I
> have often used is map data that models the world as flat.  Such data is
> fine for applications such as street navigation -- in fact it is
> *better* than 3D data for those apps, because it is cheaper to implement
> and process -- but it would be totally inadequate for other applications
> that care about altitudes and the curvature of the earth.  The notion of
> universal data correctness is a red herring.  It is better to think of
> it in terms of data *usefulness* to applications.

Meaning comes from specs (or other kinds of precoordination), not
applications. An application can do whatever it likes, so long as it
lives up to the agreements into which it (or rather its ag-ee or
creator) has entered, such as spec conformance.

There's no way to test whether a spec has modeled the world in some
way except by its actions. 'Wrong' models (and they all are, that's
what makes them models) used internally are nobody's business - they
are fine so long as they don't lead to wrong answers as observed
externally.

> My point is that if we do not frame the problem properly, we are apt to
> draw the wrong conclusions about how it should be solved.

Right, that's why I have taken great care to frame the problem properly.

>>
>> All format and protocol specs evolve in the direction of increasing
>> the richness and success of communication. (Well ideally at least...)
>> That's all we're talking about here. We can decide that any given
>> communication problem, such as metadata, should be solved by someone
>> else; that doesn't make the problem go away.
>>
>> A large class of metadata expression problems (such as the licensing
>> one) have a fairly simple solution that Tim has been advocating for
>> almost twenty years. We have a solution to a problem. It has some
>> warts and some opposition. If we decide the TAG shouldn't be involved
>> in fixing it, that's fine. It will just get pushed out to a different
>> forum and solved by them in some way of their choosing.
>>
>> Of course there are special cases where D2 and S2 are not mutually
>> exclusive.
>
> Actually I think it is the other way around: there are special cases
> where D2 and S2 cause harmful ambiguity -- such as the CC license case
> -- but, as Ian Davis and other LOD protagonists of S2 point out, in the
> vast majority of cases the ambiguity is harmless.

The CC license case has little to do with ambiguity. It's about
whether the license is applied to the correct resource. In the Flickr
case, if the standard or prior agreement is S2 then the receiver gets
the answer intended by the sender. If the agreement is D2 then Flickr
has made a mistake by sending the wrong message. If there is
disagreement on what is 'right', then there is no way to tell who is
'right' and who is 'wrong' since there is no standard (i.e. agreement)
to judge against.

If the sender and receiver agree that the message is ambiguous (i.e.
does not provide the information needed in this situation), then in
this case, since the assignment matters, the receiver would have to
ignore the message and use a different channel (or language) for
obtaining the information they need - that is their only correct
response. But that is not the situation I was talking about.
Disagreement, or mistakes concerning what is understood, is not at all
the same as ambiguity, which is simply an understanding that there is
information that is not conveyed.

>> If John Smith Jr. and John Smith Sr. live at 5 Ambiguity
>> Lane, then there is no problem if I say that John Smith lives at 5
>> Ambiguity Lane, since it's true under any interpretation of "John
>> Smith". But for some other situations it won't work for me to just say
>> "John Smith" - like if I say that John Smith is 44 years old. Neither
>> particular situation is more "general" than the other, but a
>> communication system that lets you coordinate more meanings and make
>> finer distinctions is generally more "general" or useful than one that
>> does not.
>
> The architecture needs to support the ability to convey distinctions
> along *any* axis, and there is a virtually infinite number of potential
> axes.  But this doesn't mean that a system with built-in support for one
> particular axis (e.g. the web page vs. its primary subject) is more
> general than a system that also *allows* distinctions on that axis, but
> does not give special recognition to that axis over any other axis.

Straw man. Specifications can help make all sorts of distinctions, in
any direction they like. If a spec says that a message means X (e.g. X
makes a certain distinction), and the communicating parties have
agreed to that spec, then the message means X, in their communication.
Similarly, a spec can intentionally make a message not mean X (which
is not the same as meaning not X). In that case neither party can
justifiably conclude X from the message. If they do then they are
mistaken.

>> The "John Smith" situation holds for any "identifier" system
>> - that was the point of the Hayes/Halpin paper. All it says is that
>> you need to engineer your communication system so that what has to be
>> said, is said, and is understood.
>
> Yes . . . *within* the intended class of applications (which may be
> quite broad, but are *not* universal).

The spec defines the class of agents that are conformant to the spec,
not the other way around.

>>
>> So I stand by what I say, and I continue to think it's obvious. D2 and
>> S2 are incompatible because, as general methods, in most cases they
>> give mutually inconsistent answers.
>
> If "in most cases they give mutually inconsistent answers" means "in
> most cases there exists *some* application for which they give mutually
> inconsistent answers", then I would agree.  But if it means "in most
> existing applications they give mutually inconsistent answers" then I do
> not think that is true at all.  As the LOD community demonstrates, most
> of their applications work fine in spite of the potential ambiguity that
> S2 creates when used with D2.

Specs for the meaning of a set of messages are incompatible if there
is any message for which the meanings given by the two are
incompatible. That is what I meant by incompatible specs. You can't
conform to both at the same time without the potential for mistake.

You can use incompatible language specs at the same time if you are
careful to stay away from conflicting messages. As you say, you might
even get quite a bit of useful work done this way. That does not mean
the specs are compatible, just that they're partially compatible.
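
The definition can be sketched in a few lines of Python. This assumes a
toy model in which a spec is a mapping from messages to meanings, and in
which "incompatible meanings" is simplified to "different meanings". The
specs TOY_A and TOY_B, the messages m1..m3, and the helper names are all
hypothetical; they are not the S2 and D2 of the memo.

```python
# Toy model (hypothetical): a spec maps each message to the meaning it assigns.

def conflicting_messages(spec_a, spec_b):
    """Messages that both specs define but to which they assign different meanings."""
    shared = spec_a.keys() & spec_b.keys()
    return sorted(m for m in shared if spec_a[m] != spec_b[m])

def compatible(spec_a, spec_b):
    """Compatible only if no shared message has conflicting meanings."""
    return not conflicting_messages(spec_a, spec_b)

def safe_messages(spec_a, spec_b):
    """Shared messages on which the two specs agree."""
    shared = spec_a.keys() & spec_b.keys()
    return sorted(m for m in shared if spec_a[m] == spec_b[m])

TOY_A = {"m1": "about the page", "m2": "X", "m3": "Y"}
TOY_B = {"m1": "about what the page describes", "m2": "X", "m3": "Y"}

print(compatible(TOY_A, TOY_B))            # False: one conflict is enough
print(conflicting_messages(TOY_A, TOY_B))  # ['m1']
print(safe_messages(TOY_A, TOY_B))         # ['m2', 'm3']
```

An agent that sticks to the safe messages can follow both specs and get
work done, while compatible() still returns False for the pair.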

>> The fact that occasionally they
>> don't, or equivalently that the incompatibilities happen to not matter
>> to some particular sender or receiver, doesn't affect the truth of the
>> statement that they're incompatible as general methods.
>
> Perhaps we need to get some quantitative data on how often this
> ambiguity matters to an application, because my perception is that the
> dominant case is the other way around: usually this ambiguity does *not*
> matter, but occasionally it does -- as nicely illustrated by the CC
> licensing use case.

Quantitative deployment studies would have no bearing on an assessment
of the correctness of what I have said, since what I said is not
sensitive to properties of the installed base.

The document is neutral regarding any possible TAG decision to invest
in building consensus on any particular proposal. It just says D2
senders are incompatible with S2 receivers for some messages, and vice
versa (and then has a bit of discussion). I think it's very important
that we all get on the same page regarding the nature of the problem.
Any process for determining next steps by comparing or synthesizing
the two approaches is future work and is pointless without some
neutral baseline understanding. In the future the deployed base might be
interesting to talk about, but that's not what the document is about.

It would be weird, but conceivable, to agree that some message "ZJW"
means "it is raining and it is not raining". In that case it is
unlikely that conformant agents would ever use this message.

Jonathan

Received on Tuesday, 25 October 2011 21:53:51 UTC