AWWSW status (F2F prep) from Jonathan Rees on 2010-03-17 (www-tag@w3.org from March 2010)

From: Jonathan Rees <jar@creativecommons.org>
Date: Wed, 17 Mar 2010 14:15:01 -0400
To: www-tag@w3.org
Message-ID: <760bcb2a1003171115o2b128838n1e7bf0b20d7e448f@mail.gmail.com>

AWWSW status (F2F prep)

The AWWSW group was started because Alan Ruttenberg and I were doing
quite a bit of ontology design and ontology advising and didn't
understand the resource/representation relationship (and the
"information resource" idea, which is intimately bound up with it)
well enough to do our work or guide others. The question comes up
when you have things that you want to give a URI to, and you want to
use 200 responses (non-# non-303 URI), but want to be protected
against someone coming along later and saying "hey, that's not an
information resource," or "but you said it's an IR, and that implies
xxx" where you don't mean to say xxx, or "that's an IR, but not the one
you want it to be".

This is dual (equivalent) to the question: Suppose you get 200
responses, is it OK to then decide that the named resource
is some particular thing or has certain properties? E.g., if I am
the owner of dx.doi.org, can I say that the URI
http://dx.doi.org/10.1093/bib/bbn051 names the journal article
that's indicated in the representation (so that I can license others to use
the URI when recording metadata)? (Note that this is a subtle
example. The httpRange-14 rule by itself is not adequate to rule this in or
out. In particular the representation might fail to be "of" the
journal article even if we decide the journal article is an IR. Also
there is redirection involved, which complicates things further.)
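
To make the parenthetical concrete, here is a minimal Python sketch of what the httpRange-14 rule, taken by itself, lets one conclude from a single GET response. The function name and the returned strings are invented for illustration. Only the three cases (2xx, 303, 4xx) come from the TAG's httpRange-14 resolution; everything else falls through as "not covered".

```python
from urllib.parse import urldefrag


def httprange14_conclusion(uri, status):
    """What the httpRange-14 rule alone says about the resource named by `uri`.

    `status` is the status code of the response to a GET on `uri`.
    The returned strings are illustrative labels, not standard terms.
    """
    _, fragment = urldefrag(uri)
    if fragment:
        # The fragment is never sent to the server, so the response is
        # about the URI without the '#...' part, not about this URI.
        return "not covered (hash URI)"
    if 200 <= status < 300:
        return "information resource"
    if status == 303:
        return "could be any resource"
    if 400 <= status < 500:
        return "nature unknown"
    # Other redirects (301, 302, 307, ...) and 5xx are not addressed.
    return "not covered"


if __name__ == "__main__":
    # Hypothetical URIs and status codes, chosen only to exercise each branch.
    print(httprange14_conclusion("http://example.org/doc", 200))
    print(httprange14_conclusion("http://example.org/thing", 303))
    print(httprange14_conclusion("http://example.org/other", 302))
```

Note what the sketch cannot do: even when it answers "information resource", it says nothing about which information resource is named, or whether the representation is "of" the thing the URI owner has in mind. That is the gap the DOI example falls into.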

Alan and I approached the TAG, which said essentially "you figure it out."
(Shortly thereafter I discovered that I was on the TAG.)

Some ontologies where this is an issue include FRBR, Dublin Core,
Bibo, SWAN, CiTO, IAO, and IRW, but as the practice of metadata
deployment, document and media annotation, etc. increases
(perhaps with the help of the Link: header?), I expect there to be
many more.

A broader motivation, which I share with TimBL, is that if we had a
logical framework (perhaps expressible in RDF or OWL), we'd have a
tool that we could use to help clear up a number of web
architecture muddles. httpRange-14 is just an example; another
recent one on www-tag was "are HTML elements information
resources?"

A third motivation is that an RDF vocabulary for webarch could be
useful in a number of application domains, e.g. testing and
validation, or recording change logs (e.g. Memento), or "HTTP over
SPARQL", or further developing Tim's generic resources ontology
(genont).

Additional concerns have been raised in the group about how
URIs might become bound to things, but I have not pursued
this (yet). My current theory is that URI binding is a personal matter
subject to your belief set, and how you arrive at those beliefs is
your own business. You may choose to let what happens on the
Web influence your beliefs, and there may be a recommended,
elective way to allow this to happen; perhaps a future outcome
of this project might be such a way.

I can't say we've made a lot of visible progress, but I think I do
understand the problem better now than I did before.

First, Roy Fielding is right: We're not just talking about HTTP
semantics, but rather the semantics of that part of web architecture
that is expressible in HTTP. This includes the
resource/representation relationship, the various redirects (including
303), and possibly existence (creation and deletion). I think webarch
as deployed might include REST as a subset, but certainly there are
resources deployed using GET+200 that do not obey REST discipline, and
we need to account for these somehow.

Second, TimBL has provided more information about his view of what is
and isn't an information resource, and what he thinks they're like. I
have been unable so far (my inadequacy) to combine these use cases with
other constraints (such as grandfathering all possible web pages) into
an actionable definition that makes sense to me, but I continue to work
at it.

Third, "authoritative" per the updated http: URI scheme in HTTPbis is,
I think, orthogonal to the R/R problem. The "authoritative" responses
do not determine the resource uniquely, they only say that it belongs
to a class of resources that participate in the R/R relationships
communicated by the responses. A contradiction between an
"authoritative" response and other information believed about the
resource might lead you to discount the "authoritative" response (as
recommended by the GBIF persistent identifiers report) or to
stop using that URI to name the resource, just as easily as it might
lead you to doubt what you thought you knew about the resource.
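
A toy Python sketch of the "class, not unique resource" point, under heavy assumptions: the candidate names, the representation labels, and the idea of modelling beliefs as a dictionary of sets are all made up for illustration and are not part of HTTPbis or of any proposed ontology.

```python
# Hypothetical belief set: for each candidate resource, the representations
# we believe it can have.
beliefs = {
    "moby-dick-the-novel": {"text-of-moby-dick"},
    "one-particular-edition": {"text-of-moby-dick", "publisher-preface"},
    "a-weather-report": {"todays-forecast"},
}


def consistent_candidates(observed, beliefs):
    """Candidates that could have every representation seen so far.

    `observed` is the set of representations returned in "authoritative"
    responses for one URI. The result is a class of resources, which may
    contain several members or none.
    """
    return {name for name, reps in beliefs.items() if observed <= reps}


if __name__ == "__main__":
    # One response narrows the class but does not pick a single resource.
    print(sorted(consistent_candidates({"text-of-moby-dick"}, beliefs)))
    # A response that contradicts every belief leaves the class empty;
    # the model does not say which side to discount.
    print(sorted(consistent_candidates({"something-unexpected"}, beliefs)))
```

An empty result is the contradiction case described above: one can discount the response, stop using the URI for that resource, or revise the beliefs, and nothing in the responses themselves decides among the three.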

Of course, the ability of an agent to speak HTTP-authoritatively about
a resource may be due to the agent's ability to control the resource
and therefore its "representations". For these particular resources,
the R/R relationship holds because the agent says so. For others
(such as Moby Dick) it might hold in spite of what the agent says.

I am concentrating on the resource/representation relationship. My
ambition is that if we have a story about when this holds and doesn't
hold - in particular how to falsify it - then answering the
question "what is an information resource" will fall out as a side
effect: an IR is simply something which happens to be able to
participate in this relationship.

The best lead I've encountered so far in understanding the
relationship is ABLP logic, as is being pursued by Dan Connolly. It
may be that ABLP can't be used directly, as convincing someone that a
web page is a principal, or that "principal" has any ontological
consequence, might be a tough sell. Or it may be that this, too, is
an ontological wild goose chase, or that ABLP is about
the URI/resource relationship instead of the resource/representation
relationship. But it's worth pursuing.

Open issues on which these considerations impinge:
ISSUE-50 URNs and registries - persistence vs. trust in "authority"
ISSUE-57 HTTP redirections - consequences of 30x
ISSUE-63 metadata architecture - metadata for http:-named resources
ISSUE-53 generic resources (appears to be closeable)

Next step (for me): Look in more detail at the kinds of metadata,
including class memberships, one might want to write using the
abovementioned ontologies for some sample resources,
and attempt to generalize from there.

I'll try to have slideware ready in time for the F2F.

Thanks to Michael Hausenblas and David Booth for their help.
This email is in the first person because they haven't
seen it to agree with it or not, but I am happy to expand
"I" to "we" for anything they want to take credit for above.
Thanks also to many others including Alan, Tim, Harry Halpin,
Stuart Williams, and Noah for their contributions.

Jonathan

too pressed for time to look up URIs for all the things cited. here
are the obscurest ones:
memento: http://www.readwriteweb.com/archives/memento_protocol-based_time_travel_for_the_web.php
gbif: http://www2.gbif.org/Persistent-Identifiers.pdf
iao: http://code.google.com/p/information-artifact-ontology/
genont: www.w3.org/DesignIssues/Generic.html
the others you should be able to get from google or tracker.

Received on Wednesday, 17 March 2010 18:15:35 UTC