Fwd: [Moderator Action] Some preliminary notes towards an ontology review

---------- Forwarded message ----------
From:  <hhalpin@w3.org>
Date: Tue, Nov 17, 2009 at 7:19 AM
Subject: [Moderator Action] Some preliminary notes towards an ontology review
To: public-awwsw@w3.org


First, here are some preliminary notes on [1]:

What is needed is to align this with the IRW ontology I think. The
latest version is in an appendix of my thesis [2], but I'm having
trouble grabbing Valentina and Aldo so we can do an official
alignment. Here's a quick informal one. The primary difference is we
built upon AWWSW and RFC 3986, rather than RFC 2616. One fairly safe
way we can define these classes, I think, is to *directly* quote the RFC.

Summary: Overall, the main interesting thing done by JAR's ontology is
to add the notion of correspondence and time-intervals where a
representation should be added. However, I do have the distinct
feeling that this ontology is done at such a high level of abstraction
that it's unclear how it would be used, and that we need some easier
notions (e.g. simply http:redirects would be useful) and that some
notions that would be useful for the Semantic Web and Linked Data seem
left out (I know they're controversial, but people do seem to be working
with the notions of information resource).

HttpResource (was Rfc2616Resource) -> irw:Resource

   "Resource" in the sense of RFC 2616; something that's a suitable
candidate for some HTTP operation such as POST or GET. We can be
agnostic as to whether to interpret this narrowly ("network data
object or service") or broadly (as in RFC 3986 or RDF).

       NOTE: How do we attach this notion to a string that has an
actual "name" of the URI? In IRW, we introduced the hasURIString
property. Note that the broad definition is "Anything that might be
identified by a URI".

NetworkDataObject -> irw:WebResource

   One of two kinds of HttpResources listed in RFC 2616.

NetworkService ->  irw:WebResource

   The other of two kinds of HttpResources listed in RFC 2616.

NOTE: Is distinguishing these two really necessary?

HttpRepresentation -> irw:WebRepresentation

   Octet sequence + content-type + language. (My reading of
"representation" is the bag-of-bits sense, not the "on the wire"
sense; I feel the former is better supported by what RFC 2616 says.
The difference lies in whether the same representation can be
transmitted multiple times.) (The term "representation" derives from
the REST architecture in which one has a representation (or
representative) of the state of some REST resource. Here it is a term
of art. When the resource is a NetworkService the "representation" is
merely information issuing from the service, not bits that "represent"
anything in particular.) (Open question: whether differences in
content-encoding necessarily induce distinctions between
HttpRepresentations.) (When later we relate to AWWW this class becomes
a subclass of webarch representation (WaRepresentation).)
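
As a rough illustration of the bag-of-bits reading, here is a minimal
Python sketch. The class and field names (octets, content_type,
language) are illustrative assumptions, not terms taken from RFC 2616
or from [1]. Two values with the same octets and metadata compare
equal, which is the sense in which the same representation can be
transmitted multiple times.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HttpRepresentation:
    """Hypothetical value type: octet sequence + content-type + language."""
    octets: bytes
    content_type: str
    language: Optional[str] = None


# Two transmissions of the same bits and metadata yield equal values,
# so they count as one representation, not two.
first = HttpRepresentation(b"<p>hi</p>", "text/html", "en")
second = HttpRepresentation(b"<p>hi</p>", "text/html", "en")
as_text = HttpRepresentation(b"<p>hi</p>", "text/plain", "en")
```

Under the "on the wire" reading one would instead give each
transmission its own identity, so first and second would differ.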

NOTE: RFC 3986's definition "A sequence of octets, along with
representation metadata describing those octets, that constitutes a
record of the state of the resource at the time when the
representation is generated" seems pretty equivalent to me, but maybe
it's a sub-class relationship.

Correspondence -> Kind of close to irw:accesses (and some of its
properties, irw:response), but in a time-delimited sense, and reified
as a class rather than a simple property.

   The association of an HttpRepresentation to an HttpResource over
some continuous period of time. That is, the fact of an
HttpRepresentation "corresponding to" the HttpResource starting at
some time, and ceasing to "correspond to" it at some later time. At
any given time many Correspondences may hold for a given HttpResource
(i.e. an HttpResource can simultaneously correspond to many
HttpRepresentations), and many Correspondences may hold for a given
HttpRepresentation (i.e. many HttpResources may correspond to a single
HttpRepresentation). (Alan requests motivation for the continuity
requirement: if you can infer correspondence at intermediate times,
you can cache. So "holes" want to be modeled by having multiple
Correspondences with the same resource and representation.)
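
To make the interval idea concrete, here is a minimal Python sketch of
a reified Correspondence. The field names (resource, representation,
start, end), the helper functions, and the example.org URI are all
assumptions for illustration; [1] does not prescribe any of them. The
interval is treated as half-open, and a "hole" is modeled as two
Correspondences with the same resource and representation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Correspondence:
    resource: str          # URI string standing in for the HttpResource
    representation: bytes  # octets standing in for the HttpRepresentation
    start: datetime        # time the correspondence begins to hold
    end: Optional[datetime] = None  # None: no known end yet

    def holds_at(self, t: datetime) -> bool:
        """True if t falls in the half-open interval [start, end)."""
        return self.start <= t and (self.end is None or t < self.end)


def holding(correspondences: List[Correspondence], resource: str, t: datetime):
    """All Correspondences for the resource that hold at time t."""
    return [c for c in correspondences if c.resource == resource and c.holds_at(t)]


def utc(hour: int) -> datetime:
    """Hypothetical helper: a fixed example day at the given UTC hour."""
    return datetime(2009, 11, 17, hour, tzinfo=timezone.utc)


# The first two entries share a resource and representation but leave a
# hole between 10:00 and 12:00; the third overlaps both of them.
facts = [
    Correspondence("http://example.org/doc", b"v1", utc(8), utc(10)),
    Correspondence("http://example.org/doc", b"v1", utc(12), utc(14)),
    Correspondence("http://example.org/doc", b"v1-fr", utc(8), utc(14)),
]
```

Querying holding(facts, "http://example.org/doc", utc(11)) returns only
the third entry, since the other two do not cover the hole.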

NOTE: Thinking we might want a simple version that lets people not use
heldAt and willHoldUntil unless they need to.

ofRepresentation -> kind of close to irw:accesses
   Relates a Correspondence to the HttpRepresentation that corresponds.

toResource -> kind of close to irw:accesses
   Relates a Correspondence to the HttpResource to which the
Correspondence's HttpRepresentation corresponds.

heldAt -> not in IRW

   Relates a Correspondence to a time at which the correspondence
holds. The start time of the Correspondence is no later than this
time. This relation might be inferred, for example, from the Date: or
Last-Modified: header of a 200 response to a GET.

willHoldUntil -> not in IRW

   Relates a Correspondence to a time just before which the
correspondence holds. The end time of the Correspondence is no earlier
than this time. This relation might be inferred, for example, from the
Expires: header of a 200 response to a GET.
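
The inference described for heldAt and willHoldUntil could be sketched
roughly as follows. The function name infer_times and the plain-dict
header format (exact-case keys) are assumptions for illustration; only
the Date: and Expires: headers are consulted here, not Last-Modified:.
The parsing uses parsedate_to_datetime from Python's standard
email.utils module.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def infer_times(headers):
    """Return (held_at, will_hold_until) for a 200 response to a GET.

    Either value is None when the corresponding header is absent.
    """
    date = headers.get("Date")
    expires = headers.get("Expires")
    held_at = parsedate_to_datetime(date) if date else None
    will_hold_until = parsedate_to_datetime(expires) if expires else None
    return held_at, will_hold_until


# Example headers, made up for illustration.
example_headers = {
    "Date": "Tue, 17 Nov 2009 14:28:08 GMT",
    "Expires": "Tue, 17 Nov 2009 15:28:08 GMT",
}
held_at, will_hold_until = infer_times(example_headers)
```

Note that these are only bounds: the start of the Correspondence is no
later than held_at, and its end is no earlier than will_hold_until.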

Coresidence -> Interesting notion, but I'd be a bit more
straightforward and connect to the notion of redirection. Aren't you
aiming for a set of all redirections that eventually get you to a
resource?

   ... work in progress ... according to the definitions of 302 and
307, a resource can "reside at" a specified URI. If R1 is the resource
named by the request-URI (the one that resides elsewhere), and R2 is
the resource named by the redirection target, then let's say that
there is a Coresidence of R1 and R2, with properties similar to those
of a Correspondence. We want to be abstract enough here to encompass
two distinct interpretations: (1) (Jonathan's theory) R1 != R2 but
requests about the first can be satisfied by responses about the
second for the duration of the Coresidences, (2) (Stuart's theory) R1
= R2 but the server and client may change their minds about how to
interpret the target URI (it names R1/R2 now, but might name another
resource later).
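
One way to read the question about "a set of all redirections that
eventually get you to a resource" is as a walk over a redirect table.
The sketch below is a minimal illustration under that assumption; the
function name redirect_chain, the dict-based table, and the example.org
URIs are hypothetical, and it takes no position on whether R1 = R2.

```python
def redirect_chain(redirects, start):
    """Follow 302/307-style redirects from start.

    redirects maps a request-URI to its redirection target. Returns the
    URIs visited in order, stopping at the first URI with no redirect or
    when a loop is detected.
    """
    chain = [start]
    seen = {start}
    while chain[-1] in redirects:
        target = redirects[chain[-1]]
        if target in seen:  # loop: stop rather than spin forever
            break
        chain.append(target)
        seen.add(target)
    return chain


# Made-up redirect table for illustration.
table = {
    "http://example.org/old": "http://example.org/newer",
    "http://example.org/newer": "http://example.org/newest",
}
chain = redirect_chain(table, "http://example.org/old")
```

Each adjacent pair in the resulting chain would then be a candidate
Coresidence, valid for however long that redirect stays in place.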

hasLanguage: Here I think we should just import all the work from the
Tabulator about having a language *and* media-type, but the same property
exists in IRW.

   ... work in progress ... different senses possible: the content is
in the given language (wrong); the content is to be interpreted as
being in the given language (plausible and analogous to Content-type);
or maybe just hands-off, the value of the Language: property is some
string (the latter would put this out of scope for this note)

-- Why note

hasContentType -> irw:hasMediaType

   ... work in progress ... content is to be interpreted according to
the given media type (the one denoted by the value of the
Content-type: header)

----------------------------

Things missing:

Low level: Entity vs Entity Body, URI, redirects, the notion of
requests and responses (and so Agents).

High-level and probably out of scope: the notion of information
resource (which I think IRW does finally hit on the head correctly)
and associated descriptions.

These could be in different modules, but they would make the ontology
much more useful for a number of use-cases.

               cheers,
                       harry

[1] http://www.w3.org/2001/tag/awwsw/http-semantics-report.html
[2] http://www.ibiblio.org/hhalpin/homepage/thesis/index.html#SECTION001130000000000000000

Received on Tuesday, 17 November 2009 14:28:08 UTC