RE: Example for consideration: Resource versus Representation from Williams, Stuart (HP Labs, Bristol) on 2008-02-04 (public-awwsw@w3.org from February 2008)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Mon, 4 Feb 2008 13:24:08 +0000
To: Jonathan Rees <jar@creativecommons.org>
CC: Pat Hayes <phayes@ihmc.us>, Alan Ruttenberg <alanruttenberg@gmail.com>, "public-awwsw@w3.org" <public-awwsw@w3.org>
Message-ID: <9674EA156DA93A4F855379AABDA4A5C6119505CC3C@G5W0277.americas.hpqcorp.net>
> -----Original Message-----
> From: Jonathan Rees [mailto:jar@creativecommons.org]
> Sent: 01 February 2008 23:53
> To: Williams, Stuart (HP Labs, Bristol)
> Cc: Pat Hayes; Alan Ruttenberg; public-awwsw@w3.org
> Subject: Re: Example for consideration: Resource versus Representation
>
>
> On Jan 25, 2008, at 10:11 AM, Williams, Stuart (HP Labs,
> Bristol) wrote:
> >
> >> Assigning a URI incurs a sort of moral obligation to resolve it
> >> somehow, but lack of resolution doesn't make the assignment invalid.
> >> (We all agree on this, right?)
> >
> > Yes, if we are speaking of http scheme URI.
> >
> > For URNs (i.e. URN namespaces) the intentions are not clear. ...
>
> This is an interesting discussion, but let's put it off, if
> that's OK, as it doesn't really belong to AWWSW, which is
> only about HTTP.

Hmmm.... I guess I'm willing to set that to one side for now. However, in the long run I don't believe that AWWSW should be so narrowly scoped - if the first WW is to have any meaning... plus the core technologies place no restrictions on the 'spellings' of URIs beyond those defined in "Uniform Resource Identifier (URI): Generic Syntax" [1] and "Internationalized Resource Identifiers (IRIs)" [2].

[1] http://www.ietf.org/rfc/rfc3986.txt
[2] http://www.ietf.org/rfc/rfc3987.txt

> I'll make a note of it and maybe we can come back to it
> later. When I spoke of "moral obligation" I was referring to
> the AWWW "Available representation" principle [1], which
> implicitly says you shouldn't use any URI scheme that lacks
> resolution.
>
> Oddly, the principle is stated to apply to all resources,
> while elsewhere it is intimated that only IRs have
> representations. So there is no way to obey this principle
> for non-IRs.

Bear in mind that such ink as dried in AWWW dried prior to the TAG's resolution of httpRange-14 - i.e. it was left open whether or not representations could be retrieved for things other than IRs.

> >> In order to write meaningful RDF, you have to have subjects and
> >> objects, and verbs (= predicates = properties). A fundamental
> >> assumption - speak up now if you don't believe this - is that to be
> >> clear and useful a property [a terrible word but we're stuck with it]
> >> must have a specified domain and range -- classes to which the
> >> subject and object must belong in order for statements using the
> >> property to be acceptable in discourse.
> >
> > Hmmm...(speaking up) I think that we need to think about that. In some
> > of the communities in which I work the practice seems increasingly to
> > leave property domains as open as possible to encourage their re-use.
>
> There are clearly many different ways to use RDF, and we may
> have a cultural clash here, as until now, in my provincialism
> I have not encountered anyone who argues against domain and
> range assertions. I would love to hear experience with other
> engineering approaches.
> Again it's an interesting conversation that's a bit wide of
> the AWWSW project, so I'll make a note of it. Let's try to
> hedge the issue for now and if you think we're running into
> trouble as a result please speak up.

I guess I beg to differ inasmuch as *if* we are to have something as grandly titled as an AWWSW, then it should cover the common ground across a wide spectrum of use. Something more narrowly scoped may indeed be an architecture... but it is less clear to me that it could claim to be the AWWSW.

<snip>shared OWL Composite key wish list</snip>

> > Ok... though I think that there is a premise in that which is perhaps
> > again not universally held. Roughly, one accumulates
> > statements/assertions about things of interest by retrieval operations
> > over the web. Individual representations may say quite contradictory
> > things - eg. in the http://sw-app.org/mic.xhtml example that we
> > considered on the call: IMO the contradiction is quite evident from an
> > understanding of the representation's media type and its content -
> > rather than from any fine detail of the HTTP interaction - and their
> > aggregation is quite another thing.
>
> Agreed.

Ok... then I guess that we need a better concrete example to explore in order to understand the kind of inferences that we would like to be able to justify.

> But there are other sources of statements than
> representations. Agents make assertions about what they
> observe or infer or conjecture all the time, then render
> their wisdom as RDF that finds its way into HTTP responses.

Ok... does that amount to some 'provenance' information that gives an account of the derivation of some collection of RDF statements?

> Tabulator, for example, observes the HTTP interactions that
> it initiates, records the basic facts of the matter as RDF,
> makes a few simple inferences, and renders what it knows as
> RDF. This may be odd, but it's not extraordinary and in fact
> is exactly the kind of thing RDF is for.

I don't disagree.

> > It seems evident to me that Pat, in messages such as [4] (that one in
> > particular I greatly appreciate) and other related messages, urges us
> > to make as few inferences as possible - perhaps ideally, well none -
> > from what I might call the 'fine detail' of an http interaction.
>
> The RDF constructed by Tabulator is mostly simple
> observation, not inference, so I can agree with you. Saying
> that the entity "represents" the resource (should we choose
> to do so) would be a much bigger step, and we need to drill
> down on this deeper kind of relationship. See below.

<snip/>


> > All that said, I may also have misunderstood Pat's advocacy - but I
> > think that it's close to, infer nothing from the response codes.
>
> Certainly, conservatively one should infer nothing.  But
> conservatism would be a choice. If I can pretend to be Tim,
> the point is not to conservatively say "we can't trust it so
> we can't infer anything" but rather to say "what should the
> architecture be so that among agents adhering to it we start
> to have interesting conversations".
>
> > In part that's why I have been seeking to have the inferences driven
> > from the other end - what inferences do you want to be able to justify
> > - which ought to lead us to 'proof-steps' that depend on inferences
> > that can only be made on the basis of interaction detail
> > - or perhaps not - maybe we will find that the content of
> > representations is in general sufficient.
>
> I'll just mention information that I (on behalf of Science
> Commons) care about, since it's difficult for me to get past
> it right now.
> Most of it is beyond anything represented in the HTTP
> interaction and therefore probably beyond what the AWWSW task
> can do, but I'll state it as a sort of pie-in-sky.

> Who wrote this resource?

Author, creator, owner, maintainer... I guess that there are a few agents that may have a relation with the resource that you would be interested in. Representations may carry some self-describing information with respect to some of those. I'm not aware of any HTTP headers that would carry such information... that's not to say there aren't or couldn't be any, just that at present I'm not aware of any.

> How stable is the state of the resource - can
> I depend on it remaining the same for a while?

Cache-control headers may be helpful in that they can convey an expiry date for cacheable representations. I guess that you could set them with seconds to years of stability - and obviously, you're at least saying it's ok to use a representation up to its expiry date. I don't think you're necessarily committing that the available representation(s) or the resource state is invariant over that period - though that might be a reasonable claim. I read HTTP caching as trading speed of response for currency of representation. ETags are probably of interest as well.
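
As a rough illustration of reading an expiry date out of the caching headers, here is a minimal Python sketch. It assumes the response headers are available as a plain dict with exactly these key spellings, and the function name freshness_deadline is hypothetical. It ignores directives such as no-cache and no-store, and the Age header, so it is not a full account of HTTP caching.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def freshness_deadline(headers, received_at):
    """Return the time until which a cache may reuse the response, or None.

    `headers` is assumed to be a plain dict of header name -> value.
    """
    cache_control = headers.get("Cache-Control", "")
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip('"')
        # max-age is a lifetime in seconds, counted from when we got the response.
        if name.lower() == "max-age" and value.isdigit():
            return received_at + timedelta(seconds=int(value))
    expires = headers.get("Expires")
    if expires:
        try:
            # Expires carries an absolute HTTP date.
            return parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return None
    return None


received = datetime(2008, 2, 4, 13, 24, 8, tzinfo=timezone.utc)
deadline = freshness_deadline({"Cache-Control": "public, max-age=3600"}, received)
```

Note that the deadline only says how long a cached representation may be reused; it makes no claim that the resource state is unchanged over that period.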

> To refer to what I see now, can I link to this URI or do I have to copy the content?

IMO... in general it is not possible to link to "what I see now". The link is a reference to a resource, not its representation. It would be possible to create resources whose sole purpose is to provide an enduring snapshot of a related resource, e.g. documents on the W3C TR page use a convention that achieves something like this - but that is a site-specific convention.

> What are the available representations?

I don't know of any way of reliably determining that. There are probably some heuristics that may work in certain circumstances - but I suspect they would rely on repeated trial and error, varying the acceptable content types in a request's Accept header (if present).
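
The trial-and-error heuristic might look something like the following Python sketch. The fetch callable is an assumption: any function that takes a URI and a dict of request headers and returns a (status, content_type) pair. The fake_fetch stub and the example.org URI are made up so that the sketch runs without a network. The result is only a lower bound, since a server may ignore Accept or offer types that were never asked for.

```python
def probe_representations(fetch, uri, candidate_types):
    """Request each media type in turn and collect the Content-Types served."""
    served = set()
    for media_type in candidate_types:
        status, content_type = fetch(uri, {"Accept": media_type})
        if status == 200 and content_type:
            # Drop parameters such as "; charset=utf-8" before recording.
            served.add(content_type.split(";")[0].strip().lower())
    return served


def fake_fetch(uri, headers):
    """Hypothetical stand-in for an HTTP client; answers 406 when it has no match."""
    available = {
        "text/html": "text/html; charset=utf-8",
        "application/rdf+xml": "application/rdf+xml",
    }
    content_type = available.get(headers["Accept"])
    return (200, content_type) if content_type else (406, None)


found = probe_representations(
    fake_fetch,
    "http://example.org/doc",
    ["text/html", "application/rdf+xml", "text/turtle"],
)
```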

> If an archival copy exists, where is it?

I don't think you could determine that from HTTP headers. It may be regarded as self-descriptive information in some representations in the sense of declaring a relation with some other resource.

Some of the work going on in the library and bibliographic communities is probably relevant - though it may stray into the domain of info:, doi: and URNs more generally.

> Is a mirror, or copy, that I create as good as the resource I'm trying to mirror?

As above for archive copies.

> And for non-IRs:
> Where can I find descriptions of the thing?

Well, for # URIs, the first port of call is at least straightforward.

For non-# URIs, the TAG's 303 advice provides a roughly equivalent mechanism.

The expectation is that folks deploying URIs for such non-IRs will *want* you to be able to find out about them (i.e. find some form of description) and it is in their best interest to deploy something useful by either of these means. Of course, as things stand, there are no guarantees with either approach that a retrieved representation will in fact have anything to say about the resource you were initially interested in.
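
The two mechanisms can be sketched in Python as below. The function name description_location and the example.org URIs are hypothetical, and the status code and Location value are assumed to come from a GET already made by the caller. Either way, the result is only a place to look, not a guarantee of finding a description there.

```python
from urllib.parse import urldefrag, urljoin


def description_location(uri, status=None, location=None):
    """Suggest where a description of the thing named by `uri` might be found."""
    base, fragment = urldefrag(uri)
    if fragment:
        # For a # URI, retrieve the URI with the fragment stripped.
        return base
    if status == 303 and location:
        # For a non-# URI, a 303 See Other points at another resource;
        # the Location value may be relative to the request URI.
        return urljoin(uri, location)
    # Neither mechanism applies to this URI and response.
    return None


hash_case = description_location("http://example.org/people#alice")
redirect_case = description_location("http://example.org/id/alice", 303, "/doc/alice")
```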

> How is the URI intended to be used?

Is that the same as "What is the URI intended to denote?" E.g. that a URI denotes an rdf:Property 'probably' indicates that the URI is intended to be used in the 'predicate' position of (most) RDF triples in which it occurs.
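
To make the '(most)' concrete, here is a small Python sketch that counts the positions in which a declared rdf:Property actually occurs. It assumes triples are plain (subject, predicate, object) tuples of URI strings rather than objects from any RDF library, and the example.org names are made up.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"


def positions_of_declared_properties(triples):
    """Count subject/predicate/object occurrences of each declared rdf:Property."""
    declared = {s for s, p, o in triples if p == RDF_TYPE and o == RDF_PROPERTY}
    usage = {uri: {"subject": 0, "predicate": 0, "object": 0} for uri in declared}
    for s, p, o in triples:
        for position, term in (("subject", s), ("predicate", p), ("object", o)):
            if term in usage:
                usage[term][position] += 1
    return usage


AUTHOR = "http://example.org/vocab#author"  # hypothetical property URI
triples = [
    (AUTHOR, RDF_TYPE, RDF_PROPERTY),
    ("http://example.org/doc", AUTHOR, "http://example.org/people#alice"),
]
usage = positions_of_declared_properties(triples)
```

The declaring triple itself puts the property URI in the subject position, which is one reason the predicate position holds for most triples rather than all of them.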

> Is the description accidental and time- or hypothesis- bound, or is it essential to what it means for
> the URI to denote what it does (i.e. for one to be playing its language-game)?

I think that this is far removed from what you can expect to be able to conclude from an HTTP interaction.
I think that this at least requires a richer vocabulary for annotating RDF properties and maybe classes
along the lines mentioned earlier (and <snip/>'ed) about being able to state what the distinguishing properties of an individual of given class are.

> The ability to communicate this kind of information is as important as the ability to discover it.
>
> I'm hoping that we can lay a rudimentary basis for further work on issues like these.
>
> Best
> Jonathan
>
> [1]
> http://www.w3.org/TR/webarch/Overview.html#representation-management

Regards

Stuart
--
Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Monday, 4 February 2008 13:26:29 UTC