Re: New draft TAG Finding on The Self-Describing Web

On 5/30/07, Dan Brickley <danbri@danbri.org> wrote:

> I guess the idea is of document-like-things that are (in principle at
> least) losslessly (or 'essentially losslessly', whatever that means)
> representable as a stream of 0s and 1s.

This is a good definition because it's fairly clear - it rules out
cats and quarks, and a large number of other things that I can figure
out.  But it sounds like you're talking about strings or numbers or
"representations" - not things that vary along multiple dimensions
such as time and language (I'm referring to webarch and other dogma
here).

>
> I have long been extremely uncomfortable with the idea of "Information
> Resources" being baked into webarch at any deep level

I agree...

> > Anyhow, Google is the URI owner and gets to decide what the URI
> > denotes; so who are we to be talking about the essences of Google's
> > resources?
>
> ..ooOO("When I use a word," Humpty Dumpty said, in a rather scornful
> tone, "it means just what I choose it to mean - neither more nor less." :)

Not according to my reading of webarch 2.2.2.1.  That says that a word
means what its owner says it means. If the behavior of a web server is
to be interpreted as providing implicit statements about the referent
of a URI, well... as I've said, how exactly are we to interpret this
behavior? I don't think it's spelled out anywhere, and doing so is
difficult because servers can be wrong, whimsical, inconsistent,
devious, etc. Any inference we draw from behavior has to be either a
Popperian hypothesis based on evidence and/or theory, or
contract-based - the server promises something and we choose for
whatever reason to believe the promise. In neither case can we make
any general statements about "information resources" - only particular
statements based on evidence or belief that's limited to a particular
situation.

Given the number of deployed web servers it's hard to believe we'll be
able to come up with any useful general inference schemes based on what
they do.

> > If we know independently what a URI denotes, and have an
> > objective definition of "information resource", then we can take
> > stands on the information-resourceness of the denoted resource.
> > Otherwise it's an exercise in futility, and instead we should just be
> > talking empirically about URI's and HTTP experiences.
>
> Yes please.

I certainly sympathize with the urge to keep feet on the ground, but I
think there is a need for a type (or class), or several of them, that
can be used for nonsense checking.  If I'm writing RDF, and make an
assertion that can only make sense for something that's
information-resource-like (maybe because you have to be able to GET
it, or copy it, or say what language it's written in, or something),
it would be nice to be able to verify the type-correctness of that
assertion to some extent. That is, if I assert that R is 52 octets
long, or that R's author is Charles Darwin, I want to be told that
there's a problem if there's also reason to believe that R is a cat,
or R is Socrates (that is, that R belongs to a class disjoint from the
information-resource-like one).
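
To make the kind of check I have in mind concrete, here is a minimal
sketch in plain Python, with no RDF library. Everything in the "ex:"
vocabulary is hypothetical (it is not from webarch, FOAF, or OWL), and
the domain and disjointness tables are assumptions made up for
illustration. It only shows the shape of the check, not how any
existing reasoner does it.

```python
# Hypothetical vocabulary, invented for this sketch only.
DOCUMENT_LIKE = "ex:DocumentLike"
CAT = "ex:Cat"
PERSON = "ex:Person"

# Assumed domain constraints: using one of these properties implies
# that the subject is document-like.
DOMAIN = {
    "ex:lengthInOctets": DOCUMENT_LIKE,
    "ex:author": DOCUMENT_LIKE,
}

# Assumed disjointness: each pair of classes shares no members.
DISJOINT = {
    frozenset((DOCUMENT_LIKE, CAT)),
    frozenset((DOCUMENT_LIKE, PERSON)),
}


def find_type_clashes(triples):
    """Return (subject, implied_type, asserted_type) for every clash."""
    # Collect the explicitly asserted types of each subject.
    asserted = {}
    for s, p, o in triples:
        if p == "rdf:type":
            asserted.setdefault(s, set()).add(o)

    clashes = []
    for s, p, o in triples:
        implied = DOMAIN.get(p)
        if implied is None:
            continue  # no domain constraint known for this property
        for t in sorted(asserted.get(s, ())):
            if frozenset((implied, t)) in DISJOINT:
                clashes.append((s, implied, t))
    return clashes


triples = [
    ("ex:R", "ex:lengthInOctets", 52),        # R is 52 octets long
    ("ex:R", "rdf:type", CAT),                # ...but R is also a cat
    ("ex:S", "ex:author", "Charles Darwin"),  # no conflicting type for S
]
print(find_type_clashes(triples))
```

With these made-up tables the check flags ex:R and stays silent about
ex:S. It can only ever be as good as the tables, which is exactly why
the choice of type(s) matters.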

This may not be a single type.  Maybe gettable (accessible) is
different from copyable or lengthable, etc. But it would be
unfortunate if there were too many such types.

Currently of course I'm using foaf:Document, but it always makes me
squirm. The statement

    <http://news.google.com/> a foaf:Document .

just seems weird to me. I don't know how to translate it into sensible
English, and I still don't really know whether it's supposed to be true
or not.

Perhaps the concept (or concepts) is not really at home in a technical
document about web architecture, but it might make sense in some
well-reasoned theory of the web & the semantic web (whether or not
they're different). If we had such a theory we might be able to write
useful, falsifiable RDF (and prose) that uses it.

Jonathan

Received on Wednesday, 30 May 2007 17:16:48 UTC