Re: Requirements for Any Theory of "Information Resource" from Jonathan Rees on 2011-03-01 (public-awwsw@w3.org from March 2011)

From: Jonathan Rees <jar@creativecommons.org>
Date: Tue, 1 Mar 2011 08:40:13 -0500
To: David Booth <david@dbooth.org>
Cc: AWWSW TF <public-awwsw@w3.org>
Message-ID: <AANLkTikKzdv9=TafP0uGpc0qHjzRnqwhwGbXnhD7Y3EB@mail.gmail.com>

On Mon, Feb 28, 2011 at 2:36 PM, David Booth <david@dbooth.org> wrote:
> I thought I would take this opportunity to provide some feedback on
> Jonathan's draft: Requirements for Any Theory of "Information Resource".
> http://www.w3.org/2001/tag/awwsw/ir-axioms/20110225
>
> 1. Regarding:
> [[
> The challenges to explaining "information resource" are: (1) to make
> "information resources" logically independent of Web dereference (and in
> particular the HTTP protocol), while saying rigorously what it means for
> one of them to be "on the web" at a given URI
> ]]
> Perhaps define an abstract "AGET", which is implemented as GET in the
> HTTP protocol, and in other ways by other protocols.

Not sure what problem this solves or where to work it into the text.
Is there some confusion in the abstract that needs to be straightened
out?

> 2. Regarding:
> [[
> The main contribution of this note is to say that the properties of an
> information resource are those that are invariant over its
> "representations".
> ]]
> That sounds pretty good.  But isn't something like that true of *all*
> resources?  As in "the properties of a resource are those that are
> always true of that resource"?  BTW, those invariants are the assertions
> that should be included in the URI's declaration.

Um, no, the statement isn't trivial. It relates representation
properties to IR properties in a testable manner, and it only applies
to certain properties.

I took a risk that I could leave it as "the properties" here, but you
have shown me that this needs to be qualified. I will change the
wording; how about this?

  The main contribution of this note is to say that many important
  properties of an information resource are invariant over its
  "representations".  If one knows such a property of an information
  resource, one can make falsifiable predictions about its retrieved
  "representations", and knowing about the "representations", one can
  conclude properties of the information resource.


> 3. Regarding:
> [[
> 'Has reading' is not functional, because we want to admit
> interpretations where readings vary by media type, language, session,
> time, whim, etc.
> ]]
> But you could make 'has reading' functional: just include time and
> request as arguments, as in ftrr:InformationResource
> http://lists.w3.org/Archives/Public/public-awwsw/2008Apr/0046.html
> Or you could let the request be broken apart into multiple variables.

This would be just one particular interpretation. It is not a
requirement and I see no reason to try to foist that particular model
on everyone.

(I have said this to you dozens of times over the years and just don't
understand why we're not communicating.)
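
To illustrate the difference, here is a minimal sketch in which 'has
reading' is only a relation, next to one possible functional model of
it. All names and data (has_reading, readings_of, functional_reading,
the "doc" IR and its readings) are hypothetical and do not come from the
draft.

```python
# Hypothetical toy data; nothing here is taken from the draft.
# 'has reading' as a plain relation: a set of (IR, reading) pairs.
has_reading = {
    ("doc", "english-html"),
    ("doc", "french-html"),
    ("doc", "english-pdf"),
}


def readings_of(ir):
    """Return every reading related to ir (zero, one, or many)."""
    return {s for (r, s) in has_reading if r == ir}


# One particular interpretation: a reading is selected by request
# parameters. The 'when' argument is accepted but ignored in this toy.
_TABLE = {
    ("doc", "text/html", "en"): "english-html",
    ("doc", "text/html", "fr"): "french-html",
    ("doc", "application/pdf", "en"): "english-pdf",
}


def functional_reading(ir, when, request):
    """Return the single reading chosen for (ir, when, request)."""
    return _TABLE[(ir, request["accept"], request["lang"])]


chosen = functional_reading("doc", None, {"accept": "text/html", "lang": "fr"})
# The functional model is consistent with the relation...
assert chosen in readings_of("doc")
# ...but the relation alone says nothing about how a reading is chosen.
```

The functional model satisfies the relation, but the relation does not
require it; other models could satisfy it equally well.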

> If we consider a set of variables v1..vn under which 'has reading'
> becomes functional: f(v1, ... vn), then when you talk about metadata
> properties:
> [[
> Let P be a metadata property, let R be an 'information resource', and
> let x be a member of the range of P. Then P(R,x) if and only if P(S,x)
> holds for all readings S of R.
> ]]
> I wonder if there would be a way to
>
> I like the general direction you're going in, when you talk about
> metadata properties:
> [[
> Let P be a metadata property, let R be an 'information resource', and
> let x be a member of the range of P. Then P(R,x) if and only if P(S,x)
> holds for all readings S of R.
> ]]
> However, I'm not certain of the "only if" direction.  The "only if"
> direction seems to be saying that R cannot have any property that is not
> observable through some S.  Maybe that's correct -- I'm not sure.  It
> reminds me of the discussion of intensional versus extensional semantics
> in RDF
> http://www.w3.org/TR/rdf-mt/#glossIntensional

You need one direction in order to be able to write metadata. You need
the other in order to interpret it. The whole framework is vacuous,
inconsequential, if you remove either direction.
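
A minimal sketch of the biconditional, assuming a toy model in which
readings are dictionaries and a metadata property is a predicate over a
reading and a value. The names (holds_for_ir, falsified_by, creator,
language) and the sample data are hypothetical.

```python
def holds_for_ir(P, readings, x):
    """P(R, x) iff P(S, x) holds for every reading S of R."""
    return all(P(s, x) for s in readings)


def falsified_by(P, x, new_reading):
    """True if a newly retrieved reading contradicts the claim P(R, x)."""
    return not P(new_reading, x)


def creator(reading, x):
    return reading["creator"] == x


def language(reading, x):
    return reading["language"] == x


# Hypothetical readings of one IR: the same work in two languages.
readings = [
    {"creator": "Herman Melville", "language": "en"},
    {"creator": "Herman Melville", "language": "fr"},
]

# Invariant over all readings, so it can be asserted of the IR.
print(holds_for_ir(creator, readings, "Herman Melville"))  # True
# Varies between readings, so it is not a property of the IR.
print(holds_for_ir(language, readings, "en"))  # False
```

holds_for_ir covers the direction from readings to the IR; falsified_by
covers the direction from the IR to its readings, where a claim about
the IR predicts something about every future reading. Drop either one
and the claim can no longer be both derived and tested.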

> If you think of R as a function f of a set of n variables v1..vn:
> f(v1, ... vn), and c1 is a constant and f1 is a derived function of n-1
> variables v2..vn such that:
>
>  f1(v2, ... vn) = f(c1, v2 ... vn)
>
> then it makes sense to try to relate properties that hold of f to
> properties that hold of f1, but how?  Perhaps it can be done if we
> restrict those properties to ones whose truth is determined solely by
> the observable values of f(v1, ... vn):
>
>  MP is the set of properties with domain 'information resource'
>  such that for any P in MP and any information resource f, there
>  is a corresponding property ("little-p") p, such that: P(f) iff
>  p(f(v1, ... vn)) for all values of v1..vn.
>
> Maybe something like this is what you're getting at?

I'm not getting at that particular model, but that seems like a fine
way to interpret the theory, if it helps.
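
For what it is worth, that model can be sketched over small finite
domains. This assumes the variables are language and media type, which
is only an illustration; lift, f, f1 and the sample values are
hypothetical names.

```python
from itertools import product


def lift(p, domains):
    """Build P from little-p: P(f) iff p(f(*vs)) for all vs in the domains."""
    def P(f):
        return all(p(f(*vs)) for vs in product(*domains))
    return P


# Hypothetical IR modelled as a function of (language, media type).
def f(lang, media):
    return {"creator": "Melville", "language": lang, "media": media}


# f1 is derived from f by fixing the first variable to the constant "en".
def f1(media):
    return f("en", media)


domains = [("en", "fr"), ("text/html", "application/pdf")]

by_melville = lift(lambda s: s["creator"] == "Melville", domains)
in_english = lift(lambda s: s["language"] == "en", domains)
in_english_f1 = lift(lambda s: s["language"] == "en", domains[1:])

print(by_melville(f))     # True: invariant over all values
print(in_english(f))      # False: fails for lang == "fr"
print(in_english_f1(f1))  # True: holds once lang is fixed to "en"
```

In this model any lifted property that holds of f also holds of f1,
since the values of f1 are a subset of the values of f; the converse
can fail, as the last two lines show.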

If you're saying there's a missing axiom then that's something I'd like to know.

> 4. I like this idea:
> [[
> The IR must 'fess up' to things that all of its readings do. This rules
> out pathologies where something is P-related to all of a document's
> translations but not to the IR itself. The practical benefit is that it
> lets you 'gamble' on hypotheses of an IR formed by investigating a
> number of its readings. You're not guaranteed to be right, but you may
> be willing to act on the hypothesis.
> ]]

I think this is the answer to the question you asked about the two
directions of the 'iff'?

> 5. Regarding:
> [[
> (def) An 'information resource' is 'bound to' a URI iff every simple IR
> that is 'authorized for' the URI 'is a reading of' the information
> resource.
>
> This formalizes "on the web".
> ]]
>
> I think that's a pretty good definition of something, but I'm not sure
> it formalizes the notion of an information resource being "on the web",
> since that definition only talks about the binding between a URI and an
> information resource.  Usually when something is "on the web", there are
> representations available via GET.

Hmm. You're right, maybe I should strike the comment as misleading.
For now I just added a disclaimer:

  <p>This formalizes "on the web", at least in part.  (You might
  expect other things to hold as well.)</p>

> But maybe you're using the word "binding" different than I expect.  I
> think of a binding as an association between the URI and the information
> resource, just as an "interpretation" in RDF semantics maps URIs to
> resources.

Nope. It's just a statement about authorization. That induces a
certain relationship, but it has nothing to do with the model theory;
it's related to the problem domain.

Note that the axioms have no URIs in them. There is no attempt to
cross levels and relate URIs used in HTTP to URIs used in logical
propositions. The URIs reside squarely in the domain of discourse. The
meta-question just doesn't come up.  This is very much intentional.
Wash, rinse, repeat.
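
A minimal sketch of the 'bound to' definition quoted in item 5,
assuming simple IRs can be stood in for by strings and that the sets of
authorized simple IRs and of readings are given as data. The names
(is_bound_to, authorized, generic_ir, session_ir) are hypothetical. Note
that the URI is just a string key in the domain of discourse.

```python
def is_bound_to(readings_of_ir, authorized_for_uri):
    """An IR is 'bound to' a URI iff every simple IR 'authorized for'
    the URI 'is a reading of' the IR (a subset test)."""
    return authorized_for_uri <= readings_of_ir


# Hypothetical data: simple IRs authorized for one URI.
authorized = {"http://example/z": {"s1", "s2"}}

# Two hypothetical IRs, each given by its set of readings.
generic_ir = {"s1", "s2", "s3"}
session_ir = {"s1"}

print(is_bound_to(generic_ir, authorized["http://example/z"]))  # True
print(is_bound_to(session_ir, authorized["http://example/z"]))  # False
```

Two consequences of the definition as stated show up here: more than
one IR can be bound to the same URI, and if nothing is authorized for a
URI then every IR is vacuously bound to it.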

> 6. Regarding:
> [[
> (candidate) For any set {S1, S2, ...} of 'simple IRs' there exists an IR
> that has S1, S2, ..., as readings, and no others.
> ]]
> This seems pretty reasonable, if we interpret "there exists" to mean "in
> theory there could exist".

Good point, but 'in theory' doesn't belong in an axiom and is not what
I want to say.  There is probably a better way to say this; what I
want to express is that if you start doing GETs at a URI and a bunch
of 200s come back, then there exists an IR that has those readings...
but we can't say *only* those readings. This is probably better
formulated in terms of authorization: For any http: URI, the
authorization policy and choices that they make etc. etc.  Will work
on this.

Maybe strike entirely because the only reason you'd want it would be
to implement the letter of the httpRange-14 resolution, and I have yet
to see why this is forced.

> 7. Regarding:
> [[
> (def) An interpretation of an RDF graph 'respects IR bindings' if, for
> each dereferenceable URI occurring in the graph (outside of literals),
> the URI is interpreted to be an IR bound to that URI.
> ]]
>
> Nice!  I think the wording will have to be tightened up slightly, but I
> think the idea is very good.
>
> 8. Regarding:
> [[
> A satisfying interpretation 'respects IR bindings' if it is also
> satisfying for the graph formed by merging (a) the given graph, (b) a
> set of 'binding statements', one per dereferenceable URI occurring in
> the graph,
> ]]
>
> Yes!  Item (b) corresponds to step 1.a in the process for determining
> resource identity that I proposed at the Semantic Technology Conference
> last year
> http://dbooth.org/2010/ambiguity/paper.html#part3_1a
>
> 9. Regarding:
> [[
> and (c) an appropriate RDF axiom set derived from the above axioms. The
> 'binding statement' for a URI uuu is defined here to be the statement
> <uuu> :boundTo "uuu"^^xsd:anyURI.
> ]]
>
> Yes!  That is exactly what my proposed n3 rules for an HTTP 200 response
> do:
> http://www.w3.org/wiki/AwwswDboothsRules
> [[
> 150. # httpRange-14 rule: 200 response => InformationResource
> 151. # http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
> 152. {       ?r uri:hasURI ?u .
> 153.    ?u http:hasGetReply ?reply .
> 154.    ?reply http:hasStatusCode 200 .
> 155.    # ...
> 156. } => {
> 157.    ?r a awww:InformationResource .
> 158.    # ...
> 159.    } .
> ]]
>
> FYI, your :boundTo property corresponds to the log:uri property defined
> by TimBl
> http://www.w3.org/2000/10/swap/doc/Reach
> and the :hasUri property that I defined at line 95:
> http://www.w3.org/wiki/AwwswDboothsRules

I can't use log:uri because it doesn't mean what I want.  Here's the definition:

  <comment>This allows one to look at the actual string of the URI
  which identifies this.

Your :hasURI doesn't work for me either because it's defined like this:

  The subject resource is denoted by the object URI.  It is basically
  the same as log:uri, but has a range of xsd:anyURI, so that a simple
  assertion like {r hasURI u} will cause u to be recognized as type
  xsd:anyURI without having to assert it explicitly.  This property
  should be asserted explicitly -- it is NOT inferred.

The difference is that I give an objective criterion for deciding
whether the property holds.
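
For concreteness, the 'binding statement' construction quoted in item 9
can be sketched as follows. This only builds the statement text; it does
not check whether the property actually holds. Treating exactly the
http and https schemes as dereferenceable is an assumption of the sketch,
not something the draft says, and the function names are hypothetical.

```python
from urllib.parse import urlsplit


def is_dereferenceable(uri):
    # Assumption for this sketch only: http/https URIs count as
    # dereferenceable; the draft does not define it this way.
    return urlsplit(uri).scheme in ("http", "https")


def binding_statement(uri):
    """Return the statement <uuu> :boundTo "uuu"^^xsd:anyURI for a URI."""
    return f'<{uri}> :boundTo "{uri}"^^xsd:anyURI .'


def binding_statements(uris):
    """One binding statement per distinct dereferenceable URI."""
    return [binding_statement(u) for u in sorted(set(uris))
            if is_dereferenceable(u)]


for line in binding_statements(["http://example/z", "urn:isbn:0", "http://example/z"]):
    print(line)
# <http://example/z> :boundTo "http://example/z"^^xsd:anyURI .
```

The subject is the URI used as a term, while the object is a literal
carrying the bare URI string, so the URI itself sits in the domain of
discourse.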

> 10. In this statement:
>
>  [:isBoundTo "<http://example/z>"^^xsd:anyURI]
>
> Don't you have extra angle brackets around that URI?  Shouldn't it be:
>
>  [:isBoundTo "http://example/z"^^xsd:anyURI]

Yes, good catch.

> 11. One thing that isn't mentioned, and may not need to be addressed in
> this document, but needs to be addressed somehow/somewhere: In reality,
> URI-resource bindings change (slowly) over time, when domain names are
> sold or web sites are reorganized.

Yes, I need a section that talks about how you could have different
interpretations depending on time scope.  If you're using the theory
within a narrow time window or within a particular conversation there
may be more or different entailments than if you're using it across a
very long time span. E.g. if you're talking to your bank it might be
understood that the IR is one specific to your session, while someone
doing a review of the bank's web site might interpret the symbols in
the theory as associating the same URI with a different (more generic)
IR.

Thanks for the close reading.
Jonathan

> --
> David Booth, Ph.D.
> http://dbooth.org/
>
> Opinions expressed herein are those of the author and do not necessarily
> reflect those of his employer.
Received on Tuesday, 1 March 2011 13:42:00 UTC