RE: Example for consideration: Resource versus Representation

> -----Original Message-----
> From: Jonathan Rees [mailto:jar@creativecommons.org]
> Sent: 04 February 2008 17:59
>
> I appreciate your careful reply.
>
> On Feb 4, 2008, at 8:24 AM, Williams, Stuart (HP Labs, Bristol) wrote:
>
> >> -----Original Message-----
> >> From: Jonathan Rees [mailto:jar@creativecommons.org]
> >> Sent: 01 February 2008 23:53
> >> To: Williams, Stuart (HP Labs, Bristol)
> >>
> >> On Jan 25, 2008, at 10:11 AM, Williams, Stuart (HP Labs,
> >> Bristol) wrote:

<snip>Issues of scope, non-HTTP URI and name (AWWSW)</snip>

> Let me tell you what I have in mind regarding deciding what
> to focus on. I vaguely remember that we have an informal
> 6-month charter from the TAG, although I can find no evidence
> for this. We've used 3 of the 6. In any case, I'm trying to
> start small, and to this end I am trying to push away or
> postpone as many potential issues as I can.

Understood.

> >
> >>> Ok... though I think that there is a premise in that which is
> >>> perhaps again not universally held. Roughly, one accumulates
> >>> statements/assertions about things of interest by retrieval
> >>> operations over the web. Individual representations may say quite
> >>> contradictory things - eg. in the http://sw-app.org/mic.xhtml
> >>> example that we considered on the call: IMO the contradiction is
> >>> quite evident from an understanding of the representation's
> >>> media-type and its content - rather than from any fine detail
> >>> of the HTTP interaction - and their aggregation is quite
> >>> another thing.
> >>
> >> Agreed.
> >
> > Ok... then I guess that we need a better concrete example to explore
> > in order to understand the kind of inferences that we would like to be
> > able to justify.
>
> The first step, I think, is creating a sufficient vocabulary for
> *expressing* the kinds of assertions we'd like to be able to infer.

I'm not against doing that... indeed recording simple facts from an HTTP interaction is ok:

[]      a       http-vocab:Transaction;
        http-vocab:requestURI   "<a request URI>"^^xsd:anyURI;
        http-vocab:requestHeaders ( "h1" "h2" "h3" );
        http-vocab:requestBody    ...;
        http-vocab:requestTime    "<YYYMMMDDDZHH:MM:SS>";
        http-vocab:response       ...;

and so forth, possibly with more structure around requests and responses (or as subclasses of an httpMessage or some such).
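
As a rough illustration of recording such facts mechanically, here is a minimal Python sketch that renders one observed request as a blank-node Transaction in Turtle. The namespace URI for http-vocab is an assumption (no namespace is given for it above), as is the choice of xsd:dateTime for the request time; the function names are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical namespace: no namespace URI is given for http-vocab.
HTTP_VOCAB = "http://example.org/http-vocab#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def turtle_string(value: str) -> str:
    """Quote a Python string as a Turtle string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def transaction_to_turtle(request_uri, request_headers, request_time):
    """Render one observed request as a blank-node http-vocab:Transaction.

    request_time should be a timezone-aware datetime; it is written in UTC.
    """
    headers = " ".join(turtle_string(h) for h in request_headers)
    stamp = request_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [
        f"@prefix http-vocab: <{HTTP_VOCAB}> .",
        f"@prefix xsd: <{XSD}> .",
        "",
        "[] a http-vocab:Transaction ;",
        f"    http-vocab:requestURI {turtle_string(request_uri)}^^xsd:anyURI ;",
        f"    http-vocab:requestHeaders ( {headers} ) ;",
        f'    http-vocab:requestTime "{stamp}"^^xsd:dateTime .',
    ]
    return "\n".join(lines)
```

This only covers the request side; the response and body properties are left out because their shape is still open.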

But... I'd still be guided to some extent by the inferences that we want to make arising from those facts... and I keep asking for a concrete example... I'm particularly interested in what is required over and above what is asserted by the content of a (semantic web) resource representation (ie. RDF triples explicitly conveyed by the message content).

> This requires verbs, and I think we're starting to talk about this.
> Once we know how to say what we'd like to say, we can start
> talking about the circumstances under which those things
> *should* be said.

Yes... I think that means starting a little higher up than simply recording HTTP facts: with examples of the assertions we'd like to be able to infer, so that we can look for the ground facts on which to base such inferences.

> >> But there are other sources of statements than representations.
> >> Agents make assertions about what they observe or infer or conjecture
> >> all the time, then render their wisdom as RDF that finds its way into
> >> HTTP responses.
> >
> > Ok... does that amount to some 'provenance' information that gives an
> > account of the derivation of some collection of RDF statements?
>
> OK, sorry, ignore what I wrote. I was not talking about
> provenance, just content. I was probably misunderstanding
> what you wrote.
>
> <snip/>
>
> >> Who wrote this resource?
> >
> > Author, creator, owner, maintainer... I guess that there are a few
> > agents that may have a relation with the resource that you would be
> > interested in. Representations may carry some self-describing
> > information wrt some of those. I'm not aware of any HTTP headers
> > that would carry such information... that's not to say there aren't or
> > couldn't be any, just that at present I'm not aware of any.
>
> There is no good way for cooperating agents to communicate
> this information; consider the case of a spreadsheet rendered
> as text/plain. There's just no place in the representation
> to put any metadata. Sure, you can choose a different
> representation, but I wouldn't call that a "good" way to
> communicate. The issue of out-of-band metadata - in the GET
> response, or linked from the response, or as the response to
> a different request - has been discussed recently on
> semantic-web and/or www-tag, I think. I would say this is an
> architectural deficiency, and if it's not up to the TAG to
> fix it, it should be up to some other pro-HTTP group, as this
> deficiency (I believe) has been a factor in pushing many
> communities away from HTTP.

Ok... finding authoritative resource/representation metadata seems like a good TAG issue - it's probably relevant under httpRedirections-57. Maybe there is a finding to draft there.

At a practical level, I'd guess that appropriate definition of a couple of response headers, each carrying a URI for the relevant metadata resource, could suffice, e.g.:

        Resource-Description: <anyURI>
        Representation-Metadata: <anyURI>


Or there may be existing headers with near equivalent intent. New headers would probably mean IETF review and process.
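
To make the client side of that concrete, here is a minimal Python sketch of how an agent might pick such URIs out of a response. Both header names are only the proposals made above, not registered HTTP headers, and the function name is hypothetical.

```python
# Header names proposed in this message; neither is a registered HTTP header.
METADATA_HEADERS = ("Resource-Description", "Representation-Metadata")


def metadata_links(headers):
    """Map each proposed header that is present to the URI it carries."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    links = {}
    for name in METADATA_HEADERS:
        value = lowered.get(name.lower())
        if value:
            # Tolerate surrounding whitespace and optional <...> delimiters.
            links[name] = value.strip().strip("<>")
    return links
```

An agent would then dereference whatever URIs come back to fetch the metadata itself.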

> So this is not an AWWSW thing, and I didn't mean to say it
> was... but if AWWSW lays a good foundation then some other
> effort will be on a better footing.
>
> >> How stable is the state of the resource - can I depend on it
> >> remaining the same for a while?
> >
> > Cache-control headers may be helpful in that they can convey an expiry
> > date for cachable representations. I guess that you could set them
> > with seconds to years of stability - and obviously, you're at least
> > saying it's ok to use a representation up to its expiry date, I don't
> > think you're necessarily committing that the available
> > representation(s) or the resource state is invariant over that period
> > - though that might be a reasonable claim. I read HTTP caching as
> > trading speed of response for currency of representation. Etags are
> > probably of interest as well.
>
> While technically this is correct I'm not sure it's practical
> (how often are publishers in a position to control the cache-control
> headers?) or that it carries the correct intent (it talks
> about server behavior, not the nature of the resource - if we
> can say these coincide then we've made progress, but I know
> of nothing so far that would imply that they do). But I'd be
> happy to explore cache control headers as one way to
> communicate this kind of information.

Ok... seems like at least one avenue of exploration.

Wrt arguments based on the practicality of controlling headers on a per-resource basis... I suspect that is a potential problem wrt virtually any header-based 'solution'.

> >> To refer to what I see now, can I link to this URI or do I have to
> >> copy the content?
> >
> > IMO... in general it is not possible to link to "what I see now".
> > The link is a reference to a resource not its representation. It would
> > be possible to create resources whose sole purpose is to provide an
> > enduring snapshot of a related resource. eg. documents on the W3C TR
> > page use a convention that achieves something like this - but that is a
> > site specific convention.
>
> Sorry, let me rephrase:
> 1. Will the representation I retrieve now also be a
> representation of the resource tomorrow (even if it's not a
> representation the server still serves)?  (If we can say why
> that's an ill-formed parenthetical, we will have made
> progress.)

I think, even more so, that the Expires and Cache-Control headers may yield an answer to that question.

Of course there are questions of variance between spec intent and established custom and practice.

> 2. Will the representation I retrieve now also be
> the representation that the same request will retrieve tomorrow?

Ditto... modulo a 'type'/'token' distinction around whether by representation we mean a particular ephemeral message on a 'wire' or a bit/byte sequence conveyed by some set of identical messages (modulo some variation in header content - like timestamping and expiry and... i.e. the body may be identical while the headers differ).
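
One way to make the 'type' reading operational is to compare responses by their body bytes alone. The following Python sketch assumes a response is modelled as a (headers, body) pair and that SHA-256 is an acceptable digest; both are illustrative choices.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Identify a representation 'type' by a digest of its body bytes."""
    return hashlib.sha256(body).hexdigest()


def same_representation_type(response_a, response_b) -> bool:
    """Treat two responses as the same representation if their bodies match.

    Each response is a (headers, body) pair. Headers are deliberately
    ignored, so differing Date or Expires values do not matter.
    """
    _, body_a = response_a
    _, body_b = response_b
    return body_digest(body_a) == body_digest(body_b)
```

Under the 'token' reading, by contrast, no two responses are ever the same representation, so question 2 would always be answered "no".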

> Depending on the application, a "no" answer to one or the
> other might mean that the application will want to save a
> copy (instead of just saving a link).

Again I think Expires and Cache-Control are likely to hold the key.
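
As a sketch of what a client could read off those headers, the Python below computes a freshness lifetime, preferring max-age over Expires as the HTTP caching rules do. It is a simplification: directives such as no-cache and no-store are ignored, and the function name is hypothetical. The result says how long a cache may reuse the response, not that the resource state is invariant over that period.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return how long a response may be reused, or None if nothing is stated."""
    lowered = {k.lower(): v for k, v in headers.items()}
    # Cache-Control: max-age takes precedence when present.
    for directive in lowered.get("cache-control", "").split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip().strip('"')
        if name.strip().lower() == "max-age" and value.isdigit():
            return timedelta(seconds=int(value))
    # Otherwise fall back to Expires minus Date.
    if "expires" in lowered and "date" in lowered:
        try:
            expires = parsedate_to_datetime(lowered["expires"])
            served = parsedate_to_datetime(lowered["date"])
            return max(expires - served, timedelta(0))
        except (TypeError, ValueError):
            # Unparseable or incomparable dates: treat as no information.
            return None
    return None
```

A lifetime of zero or None would suggest saving a copy rather than just a link.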

> Site specific conventions are wonderful, and W3C's are a
> valuable example. How can they be communicated so that
> automated clients can exploit them?

At present, AFAIK, they are not... and over time the patterns have changed, and I'd be hard pressed to find even a human readable account of what they currently are, never mind a historical account.

>
> >> What are the available representations?
> >
> > I don't know of any way of reliably determining that. There are
> > probably some heuristics that may work in certain circumstances - but
> > I suspect it would rely on repeated trial and error varying acceptable
> > content types in a request's Accept header (if present).
>
> Suppose a server wanted to communicate the answer. Wouldn't
> it be wonderful if that could be done using RDF?

Sure... it's not so much the answer that is the problem, but how to pose the question.

Again, a Resource-Description: header with a URI reference might well serve a representation with such an enumeration.
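
For comparison, the trial-and-error heuristic mentioned earlier could be sketched in Python as below. The `fetch` callable is hypothetical and supplied by the caller (it stands in for whatever HTTP client is used), and the result is only a lower bound: a server may have representations that none of the probed media types elicit.

```python
def probe_representations(uri, media_types, fetch):
    """Request each media type in turn and collect the Content-Types served.

    `fetch(uri, accept)` is a caller-supplied function (hypothetical here)
    that sends a GET with the given Accept value and returns
    (status_code, headers_dict).
    """
    seen = set()
    for media_type in media_types:
        status, headers = fetch(uri, media_type)
        lowered = {k.lower(): v for k, v in headers.items()}
        if status == 200 and "content-type" in lowered:
            # Drop parameters such as "; charset=utf-8".
            seen.add(lowered["content-type"].split(";")[0].strip().lower())
    return seen
```

This is exactly the kind of guesswork that a server-supplied enumeration would make unnecessary.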

> >> If an archival copy exists, where is it?
> >
> > I don't think you could determine that from HTTP headers. It may be
> > regarded as self-descriptive information in some representations in
> > the sense of declaring a relation with some other resource.
>
> Again, I agree that we don't now have protocols that help with this.
> Wouldn't it be nice if we did - at the very least, a
> vocabulary that allowed us to talk about properties of
> servers? Maybe an AWWSW vocabulary would form some subset of
> such a vocabulary.

ok.

> Well, really at the very least would be a standard place to
> put information like this, even if we didn't standardize on
> the vocabulary.
>
> > Some of the work going on in the library and bibliographic communities
> > is probably relevant - though may stray into the domain of info: doi:
> > and URNs more generally.
>
> Wouldn't it be nice if the library community could layer
> their resources on top of the web, instead of going off and
> building a pile of incompatible formats, languages, naming
> schemes, and protocols?

yes.

> The goals of the two communities are very similar. If we want to say
> http: is broadly applicable, can't we make a case that it's
> good enough for libraries? info: and DOIs are a failure of
> web architecture (not sure whether technical or marketing), I
> think, but it may not be too late to repair this failure.

I think that there may also be other motivations at work here too...

> >> And for non-IRs:
> >> Where can I find descriptions of the thing?
> >
> > Well, for # URIs, the first port of call is at least straightforward.
> > For non-# URI, then the TAG's 303 advice provides a roughly equivalent
> > mechanism.
>
> > The expectation is that folks deploying URI for such non-IRs will
> > *want* you to be able to find out about them (ie. find some form of
> > description) and it is in their best interest to deploy something
> > useful by either of these means. Of course, as things stand, there are
> > no guarantees with either approach that a retrieved representation
> > will in fact have anything to say about the resource you were
> > initially interested in.
>
> Exactly. I think that if we articulated the conditions under
> which the follow-your-nose heuristics are not heuristics -
> even if only to give a name or phrase to such conditions -
> that would be of great value.
>
> >> How is the URI intended to be used?
> >
> > Is that the same as "What the URI is intended to denote?" eg. that a
> > URI denotes an rdf:Property 'probably' indicates that the URI is
> > intended to be used in the 'predicate' position of (most) RDF triples
> > in which it occurs.
>
> Well, I tend to say "how x is used" instead of "what x
> denotes" in order to admit more use cases for RDF (sorry, I'm
> poking fun, please don't be offended) and to talk about
> aspects of use other than denotation, such as expiration date
> or examples... but that doesn't matter, assume what I mean is
> "what x denotes".
> I think this is related to the question of stability. One
> might like to use a URI in a persistent context - e.g. repeat
> something one has learned about the referent in an hour or a
> month. Some of the statements you learn about it may be true
> in a month, while others may not be. E.g. if now we know that
> U rdf:type Thermometer and U foo:has-temperature-Celsius
> "22", will the URI U still denote a thermometer one hour from
> now? Obviously it won't read the same temperature - but how
> did we know we weren't supposed to cache the temperature
> (cache control maybe)? If there were a notion of
> distinguishing definition from use, both of which currently
> occur inside the same descriptive document, we'd be in better shape.
>
> I'm referring here to the issue David Booth has raised in the
> form of "URI declarations": what is so true of the thing that
> if it weren't you'd have a different thing, as opposed to
> accidentally true, so that if it weren't you'd think you'd
> made an error of fact?

Ok... I think that's related to the OWL'ish wish list stuff I mentioned earlier, in the sense of: for a given Class of 'thing', what (taken together) are its distinguishing properties? A description of that kind for an individual would contain assertions of at least such properties and maybe a few incidentals besides.

> I don't expect AWWSW or the TAG to solve this, but right now
> this is a hopelessly confused subject. Local solutions are
> easy, but the semantic web isn't supposed to be local, so I
> think some standards body ought to take up these issues.

:-)

>
> Best
>
> Jonathan
>

Regards

Stuart
--
Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England

Received on Tuesday, 5 February 2008 12:05:10 UTC