RE: 200 response as conclusive evidence of an information resource

Hi Jonathan,

> From: Jonathan Rees [mailto:jar@creativecommons.org]
>
> On Tue, Dec 2, 2008 at 5:40 PM, Booth, David (HP Software - Boston)
> <dbooth@hp.com> wrote:
> >> From: Jonathan Rees [mailto:jar@creativecommons.org]
> >
> > If you preferentially view the URI owner's other
> declarations about the resource as being intrinsically more
> reliable than the server's response code, then I can see how
> you would interpret the httpRange-14 finding this way.
>
> I do.
>
> > But I don't see a justification for treating the response
> code as less reliable than the owner's other statements, and
> as I pointed out in
> > http://lists.w3.org/Archives/Public/public-awwsw/2008Dec/0000.html
>
> I explained this in my message. Just because you want to interpret 200
> this way doesn't mean either the sender of the request or the receiver
> of the response wants to be, or can be, held to such an
> interpretation.

I'm not sure what you mean here.  Do you mean:

(a) Some may not consider an HTTP response code to be a form of speech, presumably made by the URI owner?  If someone transmits state secrets based on HTTP response codes, I am quite confident that the courts would consider it a form of speech.  It is certainly possible that the actual response code emitted may be different than what the URI owner intended, through accident or someone else's malice, but that is conceptually no different than what may happen in any communication medium.  So if this is what you meant, again I don't see the justification.

Or do you mean:

(b) The HTTP response code *is* a form of speech, but the particular inference "200 => IR" is not justified because the thing that yielded the 200 response does not match the current AWWW definition of IR, and the URI owner did not agree to that inference rule (and may have no knowledge of it).

If you meant (b), then the difference in how we're looking at this is due to your adherence to the current AWWW definition of IR.  If you relax (or adjust) the definition of IR to be "things that can yield AWWW:Representations", and if you agree that a 200 response *is* an AWWW:Representation, then there is no way that a reasonable person could disagree with the "200 => IR" rule: it is a tautology.

The definition of IR that I have been proposing is simply a more mathematical way to capture the idea that an IR is something that can yield an AWWW:Representation, by modeling it as a function from (Time x Requests) to AWWW:Representations.  It is not the only way to describe or model this concept, but it does have the advantage of being relatively crisp.
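
To make the function view concrete, here is a minimal sketch in Python.  The names (Request, Representation, clock_page, yields_representation) are hypothetical and chosen only for illustration; they come from no spec.  It shows a toy resource whose output depends on both the request and the time, and why, under this relaxed definition, "it yielded a representation" is all that is needed to call it an IR.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Request:
    method: str
    accept: str = "text/plain"  # assumed default, for illustration only


@dataclass(frozen=True)
class Representation:
    media_type: str
    body: str


# The proposed model: an IR is a function from (Time x Requests)
# to Representations.
InformationResource = Callable[[float, Request], Representation]


def clock_page(time: float, request: Request) -> Representation:
    """A toy IR: the output depends on both the request and the time."""
    text = f"seconds since epoch: {int(time)}"
    if request.accept == "text/html":
        return Representation("text/html", f"<p>{text}</p>")
    return Representation("text/plain", text)


def yields_representation(resource: InformationResource,
                          time: float, request: Request) -> bool:
    """Under the relaxed definition, this is the only test for being an IR."""
    return isinstance(resource(time, request), Representation)
```

Calling clock_page at two different times, or with two different Accept values, gives different Representations from the same resource, which is the behavior the (Time x Requests) signature is meant to capture.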

>
> > I think there are good, practical reasons for treating the
> 2xx response as irrefutable evidence of an AWWW information resource.
>
> I am not convinced. We should take this up separately.
>
> > I would characterize this part of the httpRange-14 finding as:
> > "Please don't deliver a 2xx response, because a 2xx
> > response means that the resource *is* an AWWW information
> > resource.  Hence, delivering a 2xx response while also
> > claiming that the resource is something other than an AWWW
> > information resource would cause a URI collision
> > http://www.w3.org/TR/webarch/#URI-collision
> > which should be avoided."
>
> I don't see how you can extract this from the resolution. As I said it
> requires a long inference chain and some semantic sleight of hand.  This
> is what you want it to say, but not what it says or implies.

Well, the httpRange-14 decision specifically says:
http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039
[[
If an "http" resource responds to a GET request with a
      2xx response, then the resource identified by that URI
      is an information resource;
]]
so deconstructing the paragraph that I wrote above, the only significant part I can see that is not *directly* supported by the httpRange-14 decision is the statement:

    "Hence, delivering a 2xx response while also claiming that the resource
    is something other than an AWWW information resource would cause a
    URI collision http://www.w3.org/TR/webarch/#URI-collision which should
    be avoided."

And you're right that that is not extracted from the httpRange-14 decision, it is merely a reminder of existing guidance in the AWWW.  The inference chain that makes it relevant is:

 - If a resource has yielded a 200 response, then by the httpRange-14 decision, it *is* an "information resource".

 - The resource is an IR, but the URI owner claims (for example) that it is a dog, and uses the URI to denote the dog.

 - Per the AWWW, dogs are not IRs.

 - Therefore, the same URI is being used to denote both an IR and a dog.

 - Therefore AWWW section 2.2.1 on URI collision applies.

The first step in this chain is extracted *directly* from the httpRange-14 decision, and the others don't seem like "semantic sleights of hand" to me, nor do I see this as a particularly long or tenuous inference chain.  Where is the discrepancy?
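
The chain above is short enough to write down mechanically.  The following is only a sketch under stated assumptions: the function names, the "Dog" label, and the example.org URI are all hypothetical, and the disjointness of dogs and IRs is taken from the AWWW as described in the steps above.

```python
def denotations(uri, observed_statuses, owner_claims):
    """Collect the kinds of thing a URI is taken to denote."""
    kinds = set()
    # Step 1 (httpRange-14): a 2xx response to a GET => information resource.
    if any(200 <= status < 300 for status in observed_statuses.get(uri, [])):
        kinds.add("InformationResource")
    # Step 2: whatever the URI owner declares the URI to denote.
    if uri in owner_claims:
        kinds.add(owner_claims[uri])
    return kinds


# Step 3: per the AWWW, dogs are not information resources.
DISJOINT_FROM_IR = {"Dog"}


def has_collision(kinds):
    """Steps 4-5: one URI denoting both an IR and a non-IR is a collision."""
    return "InformationResource" in kinds and bool(kinds & DISJOINT_FROM_IR)


# Hypothetical example data.
statuses = {"http://example.org/fido": [200]}
claims = {"http://example.org/fido": "Dog"}
collision = has_collision(
    denotations("http://example.org/fido", statuses, claims))
```

With a 303 in place of the 200, step 1 no longer fires and no collision is reported, which is the behavior the httpRange-14 decision was aiming for.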

>
> >> and nothing could
> >> justify imputing semantics to their 200 response.
> >
> > Sure.  We would be justified in imputing the semantics of
> the httpRange-14 rule: that their 200 response implies an
> AWWW IR, even though they claim the URI denotes a concept.
>
> a property, actually. concepts are a different kind of thing.
>
> Again, we disagree about whether someone not aware of the httpRange-14
> rule would agree with you. You're holding them to a contract they
> didn't sign.
>
> > Both are correct: their current server configuration causes
> > a URI collision, so they should fix their server to comply
> > with the AWWW's advice not to cause URI collisions.  I.e.,
> > what such a Dublin Core URI denotes is ambiguous: it denotes
> > both an AWWW information resource *and* it denotes the
> > concept that the Dublin Core spec says it denotes.
>
> Right, this is just the ambiguity mentioned in the rule (one binding
> in one interpretation, another binding in another).
>
> > BTW, I discussed this in "Splitting Identities in Semantic
> > Web Architecture":
> > http://dbooth.org/2007/splitting/#collision
> > [[
> > Multiple URI declarations and URI collision
> > The Architecture of the World Wide Web defines URI
> collision as "Using the same URI to directly identify
> different resources".  URI collision may occur if a URI has
> more than one URI declaration.  However, different
> declarations of a URI do not necessarily cause URI collision,
> because the constraints they express could be equivalent even
> though they are written differently.
> >
> > How should multiple URI declarations for a URI be
> interpreted?  If one has a way to preferentially select one
> over another -- perhaps one is more recent (thus implicitly
> obsoleting others), or perhaps the evidence of the act of
> declaration is more compelling for one than another, or
> perhaps one can determine which URI declaration was intended
> when a statement author made a statement using the URI (see
> slide 2 at http://dbooth.org/2008/irsw/slides.ppt ) -- then
> it probably makes the most sense to use that URI declaration
> to interpret the meaning of the URI in an RDF statement.
> Otherwise, one could think of the complete URI declaration
> for the URI as consisting of the disjunction of the
> individual URI declarations.
> > ]]
>
> I don't think you need to invent a new theory to account for URI
> binding conflicts. The RDF semantics and Pat's other writings give a
> satisfactory treatment. Each model does, indeed, assign only a single
> resource to each defined URI. Different models are OK and inevitable
> since there is nothing you can say that unambiguously nails down a
> referent. If you have two definitions that are inconsistent with one
> another (a collision), however, then you're in trouble because no
> model satisfies both.  So if you have a theory with one definition (set
> of axioms for the URI), and I have a theory with an inconsistent one,
> we are going to fight about what's true and what's not - and unless we
> look carefully we may never even know why we are disagreeing.

No, you're *not* in trouble just because you have a URI collision (a/k/a inconsistent definitions, a/k/a inconsistent URI declarations, a/k/a ambiguity).

The whole point of my discussion of owl:sameAs on slides 15-18 of
http://dbooth.org/2008/irsw/slides.ppt
at last year's ESWC workshop on identity was that such conflicts should *not* necessarily be viewed as "one party is right and the other is wrong": they can be viewed as addressing different application needs.  Such conflicts or ambiguities are inevitable and they are not necessarily bad.  There are ways that we can explicitly model them, talk about them and deal with them, and we need to get used to doing so.  AFAIK this is not covered in the existing AWWW or RDF semantics documents, but it is important to semantic web architecture.  Developing a clear vocabulary and semantics for this will also allow us to have ontologies and ontology versions that explicitly say "this term definition is broader than that term definition" without erroneously saying "this *person* is broader than that *person*".

I don't know if the class decl:UriDeclaration
http://dbooth.org/2007/uri-decl/20081126.htm#UriDeclaration
and the properties s:isBroaderThan and s:isNarrowerThan
http://dbooth.org/2007/splitting/#isBroaderThan
that I've described will turn out to be the most convenient way to represent these things in RDF -- that will be determined by experience -- but they are at least a starting point for discussion.

>
> A lot of the nonsense we're dealing with has to do with the mistaken
> notion that reference is objective. If you account for the
> interpretation step, as do RDF and OWL semantics, things get much
> better.

But don't throw the baby out with the bath.  Yes, what I've been calling step two of the mapping from a URI to a resource
(see slides 5-8 of http://dbooth.org/2008/irsw/slides.ppt or see
http://dbooth.org/2007/uri-decl/20081126.htm#two-step ) is subjective, but part of the point of viewing URI-resource denotation as a two-step mapping is that the first step should *not* be subjective: it should be clearly defined in semantic web architecture, though it will always have some subjective aspects.

>
> >> This is not a criticism of the rule - I think the rule is a good
> >> thing, for the reason stated (lowering the risk of
> >> misinterpretation),
> >> and would like people to follow it. But when I connect to
> >> a server and
> >> look at status codes, I can't count on the server (or those that
> >> control it) being aware of the rule. If you know ahead of
> >> time that a
> >> server has chosen to follow the rule, or does so by
> >> accident, then I
> >> agree that its 200s imply something about the nature of
> >> the resource.
> >> But lacking that I don't. Just because you can't count on everyone
> >> following it doesn't mean it's a bad rule.
> >
> > You seem to be assuming that a URI can only denote one resource.
>
> See above. This is true in any interpretation. If you mean it can
> denote different resources in different interpretations, that's fine,
> and is only a problem if there is a way to detect the difference and
> you want to combine the theories.
>
> > Clearly that is the *intent*: as the AWWW says,
> > http://www.w3.org/TR/webarch/#URI-collision
> > "By design, a URI identifies one resource".  But the whole
> > idea of URI collision is that this is *not* always the case:
> > sometimes a URI denotes *more* than one resource, i.e.,
> > sometimes the mapping from URI to resource is ambiguous,
> > either because a single URI declaration was ambiguous or
> > because there was more than one URI declaration for it.
>
> Hidden in all of this "x identifies y" rhetoric is the idea that we
> should strive for a common model by attempting to get as much
> consistency as we can (without sacrificing the ability to say what
> needs to be said). I think
> this is a good ideal, even if it is unscalable and unachievable. I
> don't object generally to "x identifies y" (except
> that it should be "x names y") because I take it to be code for
> striving for consistency. The illusion of objectivity is a nice
> shorthand, but whenever you find an issue around what is bound to
> what, you've got to deconstruct and go back to something like RDF or
> OWL semantics.

Yes, except that RDF semantics does not say how to talk about URI collision or how to indicate that one document declared a URI one way and another declared it in a conflicting way and what to do about that when you want to use both documents together.

>
> >> A second problem is the dissonance between RFC 2616 and
> >> AWWW. You have
> >> to do serious semantic gymnastics to make these align. I
> >> would forgive
> >> anyone for saying (perhaps in RDF) that their network service, for
> >> which a 200 is legitimately delivered according to RFC
> >> 2616, is NOT an
> >> information resource in the AWWW sense.
> >
> > Well, if you are using the current (flawed) definition in AWWW,
>
> I'm sorry? How can I be heard as saying anything else? The AWWW I'm
> talking about is a static document.
>
> > then I would also, because the current definition does not
> > adequately cover the full variety of what can legitimately
> > yield a 200.  That's why I've been advocating a definition of
> > "information resource" as, essentially, a function from (Time
> > x Requests) to Representations.
>
> Our hypothetical RFC 2616 fossils have the same problem with
> these functions as
> they would with AWWW IRs. They just don't seem to qualify as
> "network data objects or services". A service is not a function.

Sure it is.  Modeling an IR as a function is just a mathematical abstraction of the notion of "network data objects or services": it gets inputs and it produces outputs, and the outputs may depend both on the input and the current time.

>
> > No, the problem is that that definition of "information
> > resource" is wrong.  Try using the definition of ftrr:IR that
> > I proposed and it should make more sense:
> > http://lists.w3.org/Archives/Public/public-awwsw/2008Apr/0046.html
>
> See above... I think that the only rescue is to make
> a superclass containing both "network data object or service"
> and AWWW IR, and make that be the domain of the
> has-representation relation.

That's essentially what ftrr:IR is:
http://lists.w3.org/Archives/Public/public-awwsw/2008Apr/0046.html

>
> >> Or maybe it's obvious that RFC 2616 should be
> >> ignored, or altered, where it is dissonant with AWWW (new
> >> covenant??)
> >> - maybe replace its definition of "resource" with a more modern one
> >> such as RDF's.
> >
> > No, one just needs to realize that, for historical reasons,
> > when RFC 2616 says "resource", it means what AWWW calls
> > "information resource" (except that the current AWWW
> > definition needs correction).
>
> This is absurd!  What evidence do you have for this?  Or are you
> saying that RFC 2616 means what *you* call an information resource?

Yes, that's what I meant.  That's why I wrote "(except that the current AWWW definition needs correction)".  Basically, I think the authors of AWWW were on the right track in defining IR, but the language they ended up with just didn't cover all of the cases.

> This is also unjustified - just something you would like to be true.

Well, there just happens to be a very strong resemblance between what RFC2616 calls "resource", what I called ftrr:IR
http://lists.w3.org/Archives/Public/public-awwsw/2008Apr/0046.html ,
what Roy Fielding called "resource"
http://www.ics.uci.edu/~fielding/pubs/webarch_icse2000.pdf
(which as I pointed out in
http://lists.w3.org/Archives/Public/public-awwsw/2008Apr/0047.html
is basically a curried version of what I called ftrr:IR)
and what AWWW calls "information resource" (*except* for some missing cases): they are all things that can have "representations".
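
For readers unfamiliar with the term "curried" above, a small sketch may help.  It assumes the time-first reading, in which fixing the time yields a mapping from requests to representations; the helper names are hypothetical, and this illustrates only the relationship between the two shapes, not either definition in full.

```python
def curry_ir(ir):
    """Turn ir(time, request) into a function of time that returns
    a function of request."""
    def at_time(time):
        def respond(request):
            return ir(time, request)
        return respond
    return at_time


def toy_ir(time, request):
    """A stand-in two-argument resource, for illustration only."""
    return f"{request} at t={time}"


curried = curry_ir(toy_ir)
snapshot = curried(5)  # the resource as it stands at time 5
```

Both forms carry the same information: curried(t)(req) always equals toy_ir(t, req).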

>
> The proof would be in the ontology, if we were ever to write one.
>
> >
> >> Or maybe AWWW can be put aside or altered: redefine
> >> "information resource" to mean not what the glossary says
> >> but what RFC
> >> 2616 means by "resource" (or some superclass of it).
> >
> > Yes!  The AWWW definition of IR needs to be corrected.  And
> > while we're at it, it would be good to point out that it
> > corresponds to what RFC2616 calls simply "resource".
>
> This would work except that I don't think Tim or the other authors of
> AWWW would go for it (being Talmudic myself now)... and since I
> think I understand how one comes to this view, I am not sure
> I would agree either. This is a definite point of non-consensus.
>
> (should we take a survey on some of these questions?)

Sure.  One question I suggest:

1. Is the current AWWW definition of "information resource" adequate to support the "200 => information resource" rule given in the httpRange-14 decision?

>
> > I agree that in modeling HTTP interactions, there is not
> > much need to talk about IRs.  However, this discussion about
> > whether a URI that yields a 200 response can denote a non-IR
> > does bring out some important issues of how semantic web
> > architecture should work, and in particular, how a URI
> > denotes a resource.
>
> URI ownership seems a good enough story to me - it says that what the
> URI owner says, goes.  Not that you should necessarily believe their
> RDF,  but you should listen and respect if you can. So this is my
> answer: you talk to the URI owner somehow to get RDF (or prose, as a
> substitute) that constrains permissible models, consider the
> consequences of believing it, consider whether they're being
> consistent with governing documents (such as RFCs and W3C recs), and
> believe it, if you are willing to risk it.

That's a little too vague to know whether we agree or disagree, but at the least it sounds like we have a lot of common ground.

> If you want to invent a
> protocol for making such communication systematic - effectively what
> you have done with URI declarations and what I'm doing with the Link
> header

They're different categories: the idea of URI declarations is not to invent any new protocol.  It is about clarifying how semantic web architecture should work.

>  - that's not a bad idea, but we don't yet have any such
> protocol described anyplace other than private accounts (where there
> are many) - no recommendation, no RFC, no finding. So I think this is
> a subroutine that is outside of our current task.


Thanks for your comments!



David Booth, Ph.D.
HP Software
+1 617 629 8881 office  |  dbooth@hp.com
http://www.hp.com/go/software

Statements made herein represent the views of the author and do not necessarily represent the official views of HP unless explicitly so stated.

Received on Thursday, 4 December 2008 04:34:57 UTC