Re: Requirements for Any Theory of "Information Resource"

More comments below . . . 

On Tue, 2011-03-01 at 08:40 -0500, Jonathan Rees wrote:
> On Mon, Feb 28, 2011 at 2:36 PM, David Booth <david@dbooth.org> wrote:
> > I thought I would take this opportunity to provide some feedback on
> > Jonathan's draft: Requirements for Any Theory of "Information Resource".
> > http://www.w3.org/2001/tag/awwsw/ir-axioms/20110225
> >
> > 1. Regarding:
> > [[
> > The challenges to explaining "information resource" are: (1) to make
> > "information resources" logically independent of Web dereference (and in
> > particular the HTTP protocol), while saying rigorously what it means for
> > one of them to be "on the web" at a given URI
> > ]]
> > Perhaps define an abstract "AGET", which is implemented as GET in the
> > HTTP protocol, and in other ways by other protocols.
> 
> Not sure what problem this solves or where to work it into the text.
> Is there some confusion in the abstract that needs to be straightened
> out?

Sorry, I wasn't proposing a change.  I was just noting how this is
usually handled in software engineering.  You can ignore my comment.

> 
> > 2. Regarding:
> > [[
> > The main contribution of this note is to say that the properties of an
> > information resource are those that are invariant over its
> > "representations".
> > ]]
> > That sounds pretty good.  But isn't something like that true of *all*
> > resources?  As in "the properties of a resource are those that are
> > always true of that resource"?  BTW, those invariants are the assertions
> > that should be included in the URI's declaration.
> 
> Um, no, the statement isn't trivial. It relates representation
> properties to IR properties in a testable manner, and it only applies
> to certain properties.
> 
> I took a risk that I could leave it as "the properties" here, but you
> have shown me that this needs to be qualified. Will change the
> wording, how about this?
> 
>   The main contribution of this note is to say that many
>   important properties of
>   an information resource are invariant over its
>   "representations".  If one knows such a property of an information
>   resource, one can make falsifiable predictions about its retrieved
>   "representations", and knowing about the "representations", one can
>   conclude properties of the information resource.

I actually think your original wording was good enough -- perhaps
better, because it is shorter.  Again, I did not mean to suggest that a
change to the text is needed; I was merely commenting.

> 
> 
> > 3. Regarding:
> > [[
> > 'Has reading' is not functional, because we want to admit
> > interpretations where readings vary by media type, language, session,
> > time, whim, etc.
> > ]]
> > But you could make 'has reading' functional: just include time and
> > request as arguments, as in ftrr:InformationResource
> > http://lists.w3.org/Archives/Public/public-awwsw/2008Apr/0046.html
> > Or you could let the request be broken apart into multiple variables.
> 
> This would be just one particular interpretation. It is not a
> requirement and I see no reason to try to foist that particular model
> on everyone.
> 
> (I have said this to you dozens of times over the years and just don't
> understand why we're not communicating.)

Uh . . . I don't know what pushed your button on that one, since this
was the first time I had even seen the 'has reading' relation, but:

1.  The sentence in your draft said "'Has reading' is not functional,
*because* . . . ." (my emphasis).  The word "because" suggests that 'has
reading' *must* not be functional, for the reason given.  I was pointing
out that readings could indeed "vary by media type, language, session,
time, whim, etc.", and 'has reading' could *still* be functional.  If
you don't want it to be functional for other reasons, then fine, but it
is wrong to imply that it is *not* functional *because* "readings vary
by media type, language, session, time, whim, etc.".

2. Regarding your comment about my attempt to "foist" a functional model
on everyone, you have introduced a *relation* 'has reading'.  A function
is a special kind of relation.  Generally when one is attempting to
model or explain something, it is best to choose the *most* specific
designator that is applicable -- "function" being more specific than
"relation".  Furthermore, being functional corresponds closely to Roy's
REST model, RFC 2616 and the way a web server works: it receives a
request at a particular time and returns a result that depends only on
the request and (conceptually) the time.  (The time parameter is a
catch-all to account for any other outside stimulus, such as current
weather conditions.)

So let me turn this around: if 'has reading' *can* be defined to be
functional, why *not* do so?  If it would add more complexity than would
be worthwhile, then fine, let's not do it.  But I *do* think there is
value in doing it if it doesn't add too much complexity, because it
would correspond more closely to the real world.
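
To make the relation-versus-function point concrete, here is a minimal
Python sketch.  The Request fields, the greeting strings, and the names
has_reading and induced_relation are all hypothetical and not taken from
the draft.  The only claim is that a function of (request, time) can
still yield readings that vary by language, time, and so on.

```python
from typing import NamedTuple


class Request(NamedTuple):
    # Hypothetical request fields; a real request would carry more.
    media_type: str
    language: str


def has_reading(request, time):
    """Functional 'has reading': exactly one reading per (request, time)."""
    greeting = "Bonjour" if request.language == "fr" else "Hello"
    # 'time' stands in for any outside stimulus; here it toggles a suffix.
    suffix = "!" if time % 2 else "."
    return greeting + suffix


def induced_relation(f, requests, times):
    """Recover the plain (non-functional) relation as the set of outputs."""
    return {f(r, t) for r in requests for t in times}


requests = [Request("text/plain", "en"), Request("text/plain", "fr")]
readings = induced_relation(has_reading, requests, [0, 1])
```

The relation can be recovered from the function but not the other way
around, which is the sense in which "function" is the more specific
designator.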

> 
> > If we consider a set of variables v1..vn under which 'has reading'
> > becomes functional: f(v1, ... vn), then when you talk about metadata
> > properties:
> > [[
> > Let P be a metadata property, let R be an 'information resource', and
> > let x be a member of the range of P. Then P(R,x) if and only if P(S,x)
> > holds for all readings S of R.
> > ]]
> > I wonder if there would be a way to
> >
> > I like the general direction you're going in, when you talk about
> > metadata properties:
> > [[
> > Let P be a metadata property, let R be an 'information resource', and
> > let x be a member of the range of P. Then P(R,x) if and only if P(S,x)
> > holds for all readings S of R.
> > ]]
> > However, I'm not certain of the "only if" direction.  The "only if"
> > direction seems to be saying that R cannot have any property that is not
> > observable through some S.  Maybe that's correct -- I'm not sure.  It
> > reminds me of the discussion of intensional versus extensional semantics
> > in RDF
> > http://www.w3.org/TR/rdf-mt/#glossIntensional
> 
> You need one direction in order to be able to write metadata. You need
> the other in order to interpret it. The whole framework is vacuous,
> inconsequential, if you remove either direction.

Yes, that sounds good.

> 
> > If you think of R as a function f of a set of n variables v1..vn:
> > f(v1, ... vn), and c1 is a constant and f1 is a derived function of n-1
> > variables v2..vn such that:
> >
> >  f1(v2, ... vn) = f(c1, v2 ... vn)
> >
> > then it makes sense to try to relate properties that hold of f to
> > properties that hold of f1, but how?  Perhaps it can be done if we
> > restrict those properties to ones whose truth is determined solely by
> > the observable values of f(v1, ... vn):
> >
> >  MP is the set of properties with domain 'information resource'
> >  such that for any P in MP and any information resource f, there
> >  is a corresponding property ("little-p") p, such that: P(f) iff
> >  p(f(v1, ... vn)) for all values of v1..vn.

BTW, I should have said there is a *particular* p ("little-p") for each
P -- let's call it lp(P) to avoid relying on uppercase/lowercase
distinctions.  So:

  P(f) iff lp(P)(f(v1, ... vn)) for all values of v1..vn
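
As a rough illustration of this brainstorm (not a proposed axiom), the
lifting from lp(P) to P can be sketched in Python, assuming each variable
ranges over a small finite domain that can be enumerated.  The names
lift, f, f1, languages and times are hypothetical.

```python
from functools import partial
from itertools import product


def lift(lp, domains):
    """Build P from lp(P): P(f) iff lp(f(v1..vn)) for all values in domains."""
    def P(f):
        return all(lp(f(*values)) for values in product(*domains))
    return P


# Hypothetical information resource modelled as a function of two variables.
def f(language, time):
    greeting = {"en": "Hello", "fr": "Bonjour"}[language]
    return "%s at t=%d" % (greeting, time)


languages = ["en", "fr"]
times = [0, 1, 2]

# Fixing the first variable to a constant gives the derived function f1:
# f1(time) = f("en", time)
f1 = partial(f, "en")

mentions_time = lift(lambda s: "t=" in s, [languages, times])
is_english = lift(lambda s: s.startswith("Hello"), [languages, times])
is_english_f1 = lift(lambda s: s.startswith("Hello"), [times])
```

In this sketch any lifted property that holds of f also holds of f1,
because f1 only ranges over a subset of f's values, but the converse
fails: is_english_f1(f1) is true while is_english(f) is false.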

> >
> > Maybe something like this is what you're getting at?
> 
> I'm not getting at that particular model, but that seems like a fine
> way to interpret the theory, if it helps.

I wasn't proposing an *interpretation* of the theory.  I was
brainstorming about other ways the theory might be defined -- ways that
could correspond more closely to existing practice.  Just brainstorming.

> 
> If you're saying there's a missing axiom then that's something I'd
> like to know.
> 
> > 4. I like this idea:
> > [[
> > The IR must 'fess up' to things that all of its readings do. This rules
> > out pathologies where something is P-related to all of a document's
> > translations but not to the IR itself. The practical benefit is that it
> > lets you 'gamble' on hypotheses of an IR formed by investigating a
> > number of its readings. You're not guaranteed to be right, but you may
> > be willing to act on the hypothesis.
> > ]]
> 
> I think this is the answer to the question you asked about the two
> directions of the 'iff'?

Yes.

> 
> > 5. Regarding:
> > [[
> > (def) An 'information resource' is 'bound to' a URI iff every simple IR
> > that is 'authorized for' the URI 'is a reading of' the information
> > resource.
> >
> > This formalizes "on the web".
> > ]]
> >
> > I think that's a pretty good definition of something, but I'm not sure
> > it formalizes the notion of an information resource being "on the web",
> > since that definition only talks about the binding between a URI and an
> > information resource.  Usually when something is "on the web", there are
> > representations available via GET.
> 
> Hmm. You're right, maybe I should strike the comment as misleading.
> For now I just added a disclaimer:
> 
>   <p>This formalizes "on the web", at least in part.  (You might
>   expect other things to hold as well.)</p>

Yes, that helps.  I suggest further adding: ", such as that
representations are available in response to requests, but for the
current purposes, this much is enough."
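
A small Python sketch shows why the definition only partly captures "on
the web".  It assumes that simple IRs can be modelled as plain strings
and that the readings of an IR and the simple IRs authorized for a URI
are both given as sets; the name is_bound_to is hypothetical.

```python
def is_bound_to(ir_readings, authorized_for_uri):
    """'Bound to' per the quoted definition: every simple IR authorized
    for the URI is a reading of the information resource."""
    return set(authorized_for_uri) <= set(ir_readings)


readings = {"<html>v1</html>", "<html>v2</html>"}

ok = is_bound_to(readings, {"<html>v1</html>"})
not_ok = is_bound_to(readings, {"<html>v1</html>", "<html>other</html>"})
# Nothing is authorized at all, yet the definition is satisfied vacuously.
vacuous = is_bound_to(readings, set())
```

The vacuous case is the gap: the IR counts as bound to the URI even
though no representation is available via GET.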

> 
> > But maybe you're using the word "binding" differently than I expect.
> > I think of a binding as an association between the URI and the
> > information resource, just as an "interpretation" in RDF semantics
> > maps URIs to resources.
> 
> Nope. It's just a statement about authorization. That induces a
> certain relationship, but it has nothing to do with the model theory,
> it's related to the problem domain.
> 
> Note that the axioms have no URIs in them. There is no attempt to
> cross levels and relate URIs used in HTTP to URIs used in logical
> propositions. The URIs reside squarely in the domain of discourse. The
> meta-question just doesn't come up.  This is very much intentional.
> Wash, rinse, repeat.

Okay, I see.

> 
> > 6. Regarding:
> > [[
> > (candidate) For any set {S1, S2, ...} of 'simple IRs' there exists an IR
> > that has S1, S2, ..., as readings, and no others.
> > ]]
> > This seems pretty reasonable, if we interpret "there exists" to mean "in
> > theory there could exist".
> 
> good point, but 'in theory' doesn't belong in an axiom and is not what
> I want to say.  There is probably a better way to say this; what I
> want to express is that if you start doing GETs at a URI and a bunch
> of 200s come back, then there exists an IR that has those readings...
> but we can't say *only* those readings. This is probably better
> formulated in terms of authorization: For any http: URI, the
> authorization policy and choices that they make etc. etc.  Will work
> on this.

Right.

> 
> Maybe strike entirely because the only reason you'd want it would be
> to implement the letter of the httpRange-14 resolution, and I have yet
> to see why this is forced.
> 
> > 7. Regarding:
> > [[
> > (def) An interpretation of an RDF graph 'respects IR bindings' if, for
> > each dereferenceable URI occurring in the graph (outside of literals),
> > the URI is interpreted to be an IR bound to that URI.
> > ]]
> >
> > Nice!  I think the wording will have to be tightened up slightly, but I
> > think the idea is very good.
> >
> > 8. Regarding:
> > [[
> > A satisfying interpretation 'respects IR bindings' if it is also
> > satisfying for the graph formed by merging (a) the given graph, (b) a
> > set of 'binding statements', one per dereferenceable URI occurring in
> > the graph,
> > ]]
> >
> > Yes!  Item (b) corresponds to step 1.a in the process for determining
> > resource identity that I proposed at the Semantic Technology Conference
> > last year
> > http://dbooth.org/2010/ambiguity/paper.html#part3_1a
> >
> > 9. Regarding:
> > [[
> > and (c) an appropriate RDF axiom set derived from the above axioms. The
> > 'binding statement' for a URI uuu is defined here to be the statement
> > <uuu> :boundTo "uuu"^^xsd:anyURI.
> > ]]
> >
> > Yes!  That is exactly what my proposed n3 rules for an HTTP 200 response
> > do:
> > http://www.w3.org/wiki/AwwswDboothsRules
> > [[
> > 150. # httpRange-14 rule: 200 response => InformationResource
> > 151. # http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
> > 152. {       ?r uri:hasURI ?u .
> > 153.    ?u http:hasGetReply ?reply .
> > 154.    ?reply http:hasStatusCode 200 .
> > 155.    # ...
> > 156. } => {
> > 157.    ?r a awww:InformationResource .
> > 158.    # ...
> > 159.    } .
> > ]]
> >
> > FYI, your :boundTo property corresponds to the log:uri property defined
> > by TimBl
> > http://www.w3.org/2000/10/swap/doc/Reach
> > and the :hasURI property that I defined at line 95:
> > http://www.w3.org/wiki/AwwswDboothsRules
> 
> I can't use log:uri because it doesn't mean what I want.  Here's the
> definition:
> 
>   <comment>This allows one to look at the actual string of the URI
> which identifies this.
> 
> Your :hasURI doesn't work for me either because it's defined like this:
> 
> The subject resource is denoted by the object URI.  It is basically
> the same as log:uri, but has a range of xsd:anyURI, so that a simple
> assertion like {r hasURI u} will cause u to be recognized as type
> xsd:anyURI without having to assert it explicitly.  This property
> should be asserted explicitly -- it is NOT inferred.
> 
> The difference is that I give an objective criterion for deciding
> whether the property holds.

Yes, you have given a better definition.  That doesn't mean we need to
mint a new term.  The existing definition of log:uri is pretty much
vacuous semantically, so it could be used in conjunction with the
criteria that you have set out.  I think it is helpful to use terms that
are already in use if possible.

OTOH, if you really want to mint a new term then I think we should at
least indicate (in a comment) its relationship to log:uri.

> 
> > 10. In this statement:
> >
> >  [:isBoundTo "<http://example/z>"^^xsd:anyURI]
> >
> > Don't you have extra angle brackets around that URI?  Shouldn't it be:
> >
> >  [:isBoundTo "http://example/z"^^xsd:anyURI]
> 
> Yes, good catch
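
For illustration, here is a minimal Python sketch that emits one binding
statement per URI in the form quoted in item 9, with the literal carrying
the bare URI as corrected above.  The function names, the trailing " ."
terminator, and the rejection of angle brackets are assumptions of the
sketch, not part of the draft; it also assumes the URIs need no escaping.

```python
def binding_statement(uri, prop=":boundTo"):
    """Return the binding statement for a URI as a Turtle-like string."""
    # The literal holds the bare URI; angle brackets belong only around
    # the subject term, so reject input that already includes them.
    if uri.startswith("<") or uri.endswith(">"):
        raise ValueError("pass the bare URI, without angle brackets")
    return '<%s> %s "%s"^^xsd:anyURI .' % (uri, prop, uri)


def binding_statements(uris):
    """One statement per distinct URI, in a stable (sorted) order."""
    return [binding_statement(u) for u in sorted(set(uris))]


statements = binding_statements(["http://example/z", "http://example/a",
                                 "http://example/z"])
```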
> 
> > 11. One thing that isn't mentioned, and may not need to be addressed in
> > this document, but needs to be addressed somehow/somewhere: In reality,
> > URI-resource bindings change (slowly) over time, when domain names are
> > sold or web sites are reorganized.
> 
> Yes, I need a section that talks about how you could have different
> interpretations depending on time scope.  If you're using the theory
> within a narrow time window or within a particular conversation there
> may be more or different entailments than if you're using it across a
> very long time span. E.g. if you're talking to your bank it might be
> understood that the IR is one specific to your session, while someone
> doing a review of the bank's web site might interpret the symbols in
> the theory as associating the same URI with a different (more generic)
> IR.

Anyway, great progress overall!



-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.

Received on Friday, 4 March 2011 18:53:56 UTC