Back to HTTP semantics

OK, I think it would be useful to review where we are, before getting
too carried away again with comparison of conflicting models.

I know each of us had a different question coming into this. I will
mention here only the two I think we (or at least I) have been working
on:

1. When an agent responds to an HTTP request, what is it saying (what
can it be "held to"), and how might we capture that in RDF?
2. When someone (in RDF) *talks about* HTTP behavior, and wants to say
more about it than it says on its own, what sort of vocabulary would
be generally useful?

Re #1:

The answer to #1 clearly depends on prior agreement between requester
and responder. Usually this is just RFC 2616 and nothing else. It
might in particular cases be a more consequential contract (e.g.
webarch, httpRange-14, IRW, genont, Boothianism, Hayesianism), but we
cannot assume these in general and there are no generally agreed
markers that a responder is following any of these other conventions.

My reading of RFC 2616 is that the response says so little about the
resource that there is nothing we can usefully capture about it in
RDF. The entity "corresponds to" the resource (or the resource in its
current state), but that is so weak as to be useless. The response
does say quite a bit about the *entity* in the response (the
wa-representation), and we could attempt to capture that. It is
certainly also useful to record the uninterpreted fact of a particular
HTTP interaction, such as when it happened and what the request and
response were, in case that can be used in testing, as evidence, or in
hypothesis generation, though this is more the job of HTTP-in-RDF than
of AWWSW. Lacking further assumptions or information, however, the
wa-representations say almost nothing about the resource. If the URI
owner has even thought about what the URI names (unlikely), they might
adhere to just about any ontological or pragmatic stance, and still be
within the broad confines of RFC 2616.
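
To make "recording the uninterpreted fact of an interaction" concrete,
here is a minimal sketch in Python. Everything in it is an assumption
for illustration: the example.org namespace and its property names are
made up (they are not the HTTP-in-RDF vocabulary), and the status,
media type, and timestamp are invented sample values. The point is only
that every statement is about the interaction (a blank node), and the
URI appears as a plain string, so nothing is asserted about whatever
the URI names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical namespace, for illustration only; NOT the HTTP-in-RDF vocabulary.
LOG = "http://example.org/http-log#"


@dataclass(frozen=True)
class Interaction:
    uri: str
    method: str
    status: int
    content_type: str
    observed_at: datetime


def literal(value):
    """Render a value as a plain N-Triples string literal."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(node_id, interaction):
    """Describe the interaction itself; make no claim about the resource."""
    subject = f"_:{node_id}"
    rows = [
        ("requestURI", literal(interaction.uri)),  # the URI as a string, uninterpreted
        ("method", literal(interaction.method)),
        ("status", literal(interaction.status)),
        ("contentType", literal(interaction.content_type)),
        ("observedAt", literal(interaction.observed_at.isoformat())),
    ]
    return [f"{subject} <{LOG}{prop}> {obj} ." for prop, obj in rows]


# Invented sample values; no request is actually made here.
example = Interaction(
    uri="http://www.hindawi.com/86874642.html",
    method="GET",
    status=200,
    content_type="text/html",
    observed_at=datetime(2009, 6, 10, 13, 0, tzinfo=timezone.utc),
)
lines = to_ntriples("i1", example)
print("\n".join(lines))
```

Keeping the request URI as a literal rather than as an RDF node is a
deliberate choice in this sketch: it records what was sent and
received without taking a position on what the URI denotes.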

So to the question "what is the responder saying about a resource?",
my answer is: lacking other information or assumptions beyond RFC
2616, nothing. We are therefore finished with this particular part of
the exercise.

Re #2:

As for ontologies that would help us talk about web resources, and
communicate assumptions, observations, and promises - from outside
HTTP, as it were - I think we can make some progress. The world
already has Dublin Core and FOAF widely deployed, so we ought to
analyze how they are being used. I think we should continue to review
IRW and HTTP-in-RDF to make sure they will help steer metadata
generation and dialog around architecture in a good direction. And I
still think we should aim to publish an ontology of some kind,
although it's not clear what should be in it.

It is important to distinguish between two cases: One where the URI
owner is providing the metadata, in which case it can be considered
constraining or "authoritative", and another where someone else is
providing the metadata based on what is observed in HTTP responses, in
which case it might be merely speculative. As an example of the
second, it is reasonable to say
    <http://dx.doi.org/10.1155/1995/10717> dc:creator _:1.
    <http://www.hindawi.com/86874642.html> foaf:primaryTopic _:1.
by simply looking at HTTP responses - even though the URI owners (who
are probably only dimly aware of RDF and have probably never heard of
web architecture) haven't said anything licensing these URIs as names
for resources that have these properties. This is spontaneous "folk
RDF" of the kind that I think was envisioned when RDF was first
developed, and although logically unsound, it works (is useful) for
any of a variety of reasons:
  - the "URI ownership" idea doesn't apply - the RDF author is
    defining the URIs in a way that it finds useful (this is "squatting")
  - the authors are relying on common sense instead of
    specificational rigor
  - the formulation has low overhead and high utility, and the
    cost of being wrong is low.

(I'm not sure I want to endorse the above practice; just to admit that
it happens and has some merit.)

I think one way to explain the httpRange-14 restriction is as an
attempt to forestall conflicts between this kind of squatting and what
the URI owner might have to say.  For example, if I wrote the above
RDF, and was happy, and then the owner of
http://www.hindawi.com/86874642.html came along later and said
  <http://www.hindawi.com/86874642.html> rdf:type foaf:Person.
I'd be in a bit of a pickle. The URI ownership principle would say I
was wrong, and I'd be forced to either fix my content (which might be
hard) or thumb my nose at webarch. If Hindawi follows the httpRange-14
rule, this situation won't arise.
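
The pickle can be made mechanical with a small sketch. It rests on two
assumptions that I am adding for illustration and that should be
checked against the FOAF spec before being relied on: that the subject
of foaf:primaryTopic is a document, and that nothing is both a
document and a person. The owner's statement is the hypothetical one
from the example above, and the function names are made up.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
PAGE = "http://www.hindawi.com/86874642.html"

# Third-party ("squatting") statement, as (subject, predicate, object).
observer = {(PAGE, FOAF + "primaryTopic", "_:1")}

# Hypothetical later statement by the URI owner.
owner = {(PAGE, RDF_TYPE, FOAF + "Person")}

# ASSUMPTIONS for illustration, not quoted from any spec:
# the subject of foaf:primaryTopic is a document, and documents
# and persons are disjoint.
IMPLIED_SUBJECT_TYPE = {FOAF + "primaryTopic": FOAF + "Document"}
DISJOINT = {frozenset({FOAF + "Document", FOAF + "Person"})}


def implied_types(triples):
    """Collect the stated or implied types of each subject."""
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
        elif p in IMPLIED_SUBJECT_TYPE:
            types.setdefault(s, set()).add(IMPLIED_SUBJECT_TYPE[p])
    return types


def conflicts(observer_triples, owner_triples):
    """List (uri, observer_type, owner_type) pairs that are disjoint."""
    mine = implied_types(observer_triples)
    theirs = implied_types(owner_triples)
    found = []
    for uri in sorted(mine.keys() & theirs.keys()):
        for a in sorted(mine[uri]):
            for b in sorted(theirs[uri]):
                if frozenset({a, b}) in DISJOINT:
                    found.append((uri, a, b))
    return found


result = conflicts(observer, owner)
print(result)
```

With no owner statement the list is empty, which is the situation the
httpRange-14 rule is meant to preserve for URIs that behave like the
one above.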

By publishing an ontology we (AWWSW) would in effect be making a new
recommendation - new vocabulary that we suggest the community take up
for certain purposes. Some examples off the top of my head:
  - we could try to come up with a way to express useful contracts,
    such as a promise of unchanging (time-invariant) content, in a form
    that would apply to any web-resource ontology, not just to genont.
  - we could take one of the FRBR or IAO classes, and come up with new
types or relations that would relate the nature of the named resource
to HTTP behavior (e.g. by saying that responses need to carry
information related in a definite way to the resource) - thus
explaining how/why these ontologies might apply to web resources.
  - we could explain how to take the above dc:creator example to a
higher degree of rigor - how to respect the specs and avoid making
statements that aren't well justified.
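
As a rough illustration of the first idea, here is a sketch of how an
observer might test a promise of time-invariant content. It assumes one
particular reading of "unchanging", namely that every retrieved body is
byte-identical; an actual contract might well define invariance
differently (for example, allowing several negotiated variants). The
function names are hypothetical.

```python
import hashlib


def digest(body: bytes) -> str:
    """Fingerprint a response body so observations can be compared later."""
    return hashlib.sha256(body).hexdigest()


def consistent_with_invariance(bodies) -> bool:
    """True if all observed bodies are byte-identical (compared by digest).

    Observation can only refute the promise, never establish it.
    """
    return len({digest(b) for b in bodies}) <= 1


# Invented sample bodies standing in for responses retrieved at different times.
unchanged = consistent_with_invariance([b"<html>v1</html>", b"<html>v1</html>"])
changed = consistent_with_invariance([b"<html>v1</html>", b"<html>v2</html>"])
print(unchanged, changed)
```

A check like this is only useful alongside a stated promise: without
one, a change in content violates nothing.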

Re other work:

I haven't forgotten about the rest of the agenda - which I might
characterize as modeling aspects of AWWW and semweb architecture - but
unless someone convinces me that this is a prerequisite to #2, or that
#2 is not our best next task, I think we should continue to put it off
for a while.

Jonathan

Received on Wednesday, 10 June 2009 13:27:36 UTC