[httpRange-14] What is an Information Resource?

Xiaoshu Wang recently posted a list of things and asked which ones are
information resources, but didn't receive a reply. I'm going to take
this opportunity to answer that question, and to summarise and
elucidate what I've learned from last week's threads.

We can start by discarding Webarch's faulty definition. Ian Davis
asked Tim on IRC whether an RDF Graph is an information resource [1],
and Tim replied "no". This is consistent with how CWM models the
web, relating an IR to its Graph using the log:semantics property. But
it's not consistent with Webarch's definition that "all of their
essential characteristics can be conveyed in a message", because all
of a Graph's essential characteristics can be conveyed in a message
and yet it is not an IR.

In other words, being able to convey all of a thing's essential
characteristics in a message is necessary but not sufficient for it
to be an IR; so Webarch's definition is faulty. Correct me if I'm
wrong.

I think I may be starting to understand, however, what an information
resource is:

* An information resource is an abstraction of a set of
representations, strongly tied to HTTP.
* The concept of an information resource therefore did not exist before the web.
* The essential characteristics of an information resource can *only*
be conveyed in HTTP messages.

The only evidence against this interpretation that I've found so far
is Tim's assertion that a book is an information resource. But I don't
think he really means a book in the same way that we colloquially talk
of a book. A book served with an HTTP 200 response is a book in the
same way that Project Gutenberg is a library!

So now I can answer Xiaoshu's question, of whether the following
things are information resources or not:

   1. A book - Maybe.
   2. A clock - No.
   3. The clock on the wall of my bedroom - No.
   4. A gene - No.
   5. The sequence of a gene - No.
   6. A piece of software - No.
   7. A service - Maybe.
   8. A namespace - Yes.
   9. An ontology - Yes.
  10. A language - No.
  11. A number - No.
  12. A concept, such as Dublin Core's creator - No.

How was I able to answer "Yes" to 8 and 9? Because a namespace has no
characteristic other than being a subclass of information resource.
Likewise, an ontology is defined as being a particular subclass of
information resource.

In other words, ask yourself what a namespace is. It's a mechanism for
disambiguating names in XML using a URI and a local name, a pair that
a QName abbreviates with a prefix. What does the URI denote? It
doesn't matter, except that it's a good idea to serve some
documentation about the namespace, so in Webarch the TAG said we're
defining a namespace as a subclass of information resource. Job done.
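
To make the namespace mechanism concrete, here is a minimal sketch of how
a prefixed name resolves to a (URI, local name) pair. The function name
expand_qname and the PREFIXES table are my own illustrations, not part of
any specification or library; the Dublin Core URI is the usual one for
the creator element mentioned above. Treating the empty prefix as a
default namespace is a simplifying assumption.

```python
from typing import Dict, Tuple


def expand_qname(qname: str, prefixes: Dict[str, str]) -> Tuple[str, str]:
    """Resolve a prefixed name to a (namespace URI, local name) pair.

    Hypothetical helper for illustration only. An unprefixed name is
    looked up under the empty-string key (an assumed default namespace).
    """
    prefix, sep, local = qname.partition(":")
    if not sep:
        # No colon: the whole string is the local name.
        return prefixes.get("", ""), qname
    # Raises KeyError if the prefix was never declared.
    return prefixes[prefix], local


# Illustrative prefix table; only the URI disambiguates the name.
PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}

if __name__ == "__main__":
    print(expand_qname("dc:creator", PREFIXES))
```

Note that nothing here dereferences the URI. The name is disambiguated
purely by string comparison, which is why what the URI denotes only
starts to matter once you serve documentation from it.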

What's an ontology? In the OWL sense, it's an information resource
that you can get the log:semantics of, which you then interpret by
using the OWL specification and the RDF specifications. Information
resource is being used as a carrier, because it's handy. It's what
made the web cool.

It's like when RoyF explained HTTP's verbs to us, and called the whole
thing REST, and people were scratching their heads for a while trying
to work it out. Now TimBL is explaining HTTP's nouns to us, and we're
scratching our heads again.

A book may be an IR if you mean it in the 21st-century sense of a
book. A service may be an IR if it's a service like the W3C's List
Archives search form, but not if it's an electric company.

The other things are not information resources because they have
characteristics incompatible with the kind of thing that HTTP gives
you representations for. In other words, information resource is a
novel class. You can use it for novel things, like namespaces and OWL
ontologies, or HTML forms for searching archives, that make use of the
web. But you can't use it for things like cats and dogs (or "dogs and
cats" as Snoopy says).

[1] http://chatlogs.planetrdf.com/swig/2007-12-05.html#T16-08-53

-- 
Sean B. Palmer, http://inamidst.com/sbp/

Received on Saturday, 15 December 2007 17:41:10 UTC