metadata subjects + 200 - a poll from Jonathan Rees on 2010-03-29 (public-awwsw@w3.org from March 2010)

From: Jonathan Rees <jar@creativecommons.org>
Date: Mon, 29 Mar 2010 12:56:14 -0400
To: AWWSW TF <public-awwsw@w3.org>
Message-ID: <760bcb2a1003290956x440069d6wcc157b6ea0185a4f@mail.gmail.com>
The below is similar to the question I sent recently(?) to Larry
Masinter cc: www-tag and, I hope, sharpens some remarks I've made
recently here. This story has nothing particularly to do with FRBR
[2]; I'm only using it as an example. All the same questions arise
with any information-related ontology.

Alice and Bob have a nice conversation about some particular FRBR
expression, e.g. Moby Dick the original novel. They agree that they're
talking about the same entity, they agree on its properties, and so
on.

Then they go their separate ways. Alice mints URI http://example.org/a
meaning to use it as a name for the entity, and Bob mints URI
http://example.org/b as a name for the same entity. If they knew about
one another's actions, they would agree with the statement
<http://example.org/a> owl:sameAs <http://example.org/b>.

Now Alice arranges for some 200 responses to her URI, and Bob arranges
some for his. Alice's responses have the property that they are very
faithful to the entity. Each representation retrieved by GET
http://example.org/a is a FRBR manifestation of the expression. The
expression and the Alice representations always have a lot in common -
they say the same things, are about the same things, have the same
paragraph and chapter structure, and so on. A representation has a
passage on whale lice if and only if the expression does, and so on.

Bob, meanwhile, arranges for 200 responses to http://example.org/b.
These representations have a different character. Some of them are
choice pages that link to online versions of the expression. Others
are offers to sell copies (items) of the expression. Others are
abridgements or adaptations (*different* expressions). Many of them
have ads or "related resources" links on them - such as summaries of
treatises on whale lice. Another is a table of contents that points to
one file per chapter. So these representations do *not* have as much
in common with the expression as the ones that Alice gives out.  They
can be about things that the expression is not about (maybe whale
lice), and they can fail to be about things that the expression is
about (the topics discussed in text removed by abridgment).

There is a very useful distinction to be made between the
Alice-representations and the Bob-representations - e.g. for text
mining.  As the two URIs supposedly name the same resource, the
distinction is not captured as a difference in the properties of the
named resource - one resource can't have different properties
depending on how it's named.
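
To make that last step concrete, here is a minimal Python sketch of the
substitution that owl:sameAs licenses. It is a toy, not a real OWL reasoner,
and the property name ex:hasOnlyFaithfulRepresentations is made up for
illustration; nothing like it is defined by FRBR or AWWSW.

```python
# Toy triple store: each statement is a (subject, predicate, object) tuple.
SAME_AS = "owl:sameAs"
A = "http://example.org/a"
B = "http://example.org/b"

# Hypothetical property, invented for this sketch only.
FAITHFUL = "ex:hasOnlyFaithfulRepresentations"


def infer_with_same_as(triples):
    """Copy each statement about x onto y whenever x owl:sameAs y.

    One substitution pass over subjects only. That is enough for two
    names; full owl:sameAs reasoning would also need transitivity and
    substitution in object position.
    """
    triples = set(triples)
    aliases = {(s, o) for (s, p, o) in triples if p == SAME_AS}
    aliases |= {(o, s) for (s, o) in aliases}  # owl:sameAs is symmetric
    inferred = set(triples)
    for x, y in aliases:
        for (s, p, o) in triples:
            if s == x and p != SAME_AS:
                inferred.add((y, p, o))
    return inferred


facts = {
    (A, SAME_AS, B),
    # Try to record Alice's good behaviour as a property of what <a> names.
    (A, FAITHFUL, "true"),
}
closed = infer_with_same_as(facts)

# The same claim now holds of <b>, which is false of Bob's deployment.
print((B, FAITHFUL, "true") in closed)
```

Whatever is asserted of the thing named by <http://example.org/a> is
thereby asserted of the thing named by <http://example.org/b>, so the
Alice/Bob difference has to be recorded somewhere else.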

Possible resolutions:

1. The idea that the URI names the FRBR expression is silly. Get real,
these are just web pages, not weird metaphysical constructions. So the
special relationship of the nice Alice representations to
<http://example.org/a> is easy to express - it's just a property of
<http://example.org/a> that <http://example.org/b> doesn't have. If a
URI is needed to refer to the FRBR expression, that URI has to be a #
or 303 or something else.

2. Bob is doing something wrong ("bad practice"). If the served
representations are so different from the thing named, either the
representations [extra credit: which ones?] are wrong, or his URI
isn't going to be taken to name what he thinks it names.

3. No problem, the distinction between Alice's deployment and Bob's
can be expressed as properties of the URIs (or rather of what the
server(s) involved do with the two URIs), not the thing they name. We
might postulate an entity what-an-HTTP-origin-does-with-a-URI that
could be the possessor of the relevant properties (similar to Pat
Hayes's 'computational doppelganger' [1], I think):
  [rdf:type awwsw:What-an-HTTP-origin-does-with-a-URI;
   awwsw:origin-host-name "example.org";
   awwsw:target-URI "http://example.org/a"^^xsd:anyURI ]
  rdf:type awwsw:FRBR-well-behaved.

4. We entertain distinct modes of discourse in which the URIs are
interpreted differently. In one mode we interpret the two URIs to be
the single FRBR expression. In another mode we interpret the URIs to
be distinct entities. (David's FTRR position?)

5. Suggest another  ______________________
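
As a rough illustration of resolution 3 above, the following Python sketch
keeps the "well-behaved" flag on an (origin, URI) record rather than on the
named entity. The class OriginBehavior stands in for
awwsw:What-an-HTTP-origin-does-with-a-URI, and the referent label
"moby-dick-expression" is a placeholder; both are assumptions of the sketch,
not an agreed vocabulary.

```python
from dataclasses import dataclass

A = "http://example.org/a"
B = "http://example.org/b"


@dataclass(frozen=True)
class OriginBehavior:
    """Stand-in for awwsw:What-an-HTTP-origin-does-with-a-URI."""
    origin_host_name: str
    target_uri: str


# Both URIs are taken to name the same FRBR expression
# (placeholder label, hypothetical).
referent = {A: "moby-dick-expression", B: "moby-dick-expression"}

# The flag is attached to what the origin does with a URI string,
# so it can differ even though the referent is shared.
frbr_well_behaved = {OriginBehavior("example.org", A)}


def is_frbr_well_behaved(host, uri):
    """True if the (host, uri) deployment is recorded as well-behaved."""
    return OriginBehavior(host, uri) in frbr_well_behaved


print(referent[A] == referent[B])                # same named entity
print(is_frbr_well_behaved("example.org", A))    # Alice's deployment
print(is_frbr_well_behaved("example.org", B))    # Bob's deployment
```

Because the record is keyed by the URI string and not by what the URI
names, owl:sameAs between the two names does not merge the two records.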

So here's the poll: Which of these do you like best?

Just so you understand the consequences of your choice:
1. is a rejection of "information resource" and "URI ownership" i.e.
of most of AWWW and parts of RFCs 2616 and 3986
2. is such a strong requirement that # URIs (or 303 or blank nodes)
may be forced for many principled metadata subjects (similar to Alan
Ruttenberg's position and maybe the ABLP theory)
3. forces tedious explanation of the relation of a resource to 200
responses in any situation where someone might care; and is a
rejection of the idea that a corresponds-to assertion is falsifiable,
which is a premise of the httpRange-14 decision
4. rejects the 'semantic web' ideal that URI reference ought to be
context-insensitive and that you ought to be able to integrate
information coming from different sources in a straightforward manner

Jonathan

[1] http://lists.w3.org/Archives/Public/public-semweb-lifesci/2007Oct/0059.html
[2] http://archive.ifla.org/VII/s13/frbr/frbr.htm

(p.s. I'm probably being sloppy in my application of FRBR; after all a
FRBR manifestation of Moby Dick can have advertising in it, and thus
be "about" whale lice even if the corresponding expression isn't. But
hoping you get the idea.)

I have no stake in any particular outcome, I just want a story that
makes sense, and nothing does right now. Sorry if I'm being dense or
fickle and thanks for staying with me. I hope I'm still bringing new
material.
Received on Monday, 29 March 2010 16:56:49 UTC