Re: [httpRange-14]: New Draft Finding "Dereferencing HTTP URIs" from Jonathan Rees on 2007-05-31 (www-tag@w3.org from May 2007)

From: Jonathan Rees <jonathan.rees@gmail.com>
Date: Thu, 31 May 2007 11:22:08 -0400
To: "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>
Cc: www-tag@w3.org, "Rhys Lewis" <rhys@volantis.com>
Message-ID: <3cff5e070705310822q775141cwf5e9dd441573b40@mail.gmail.com>

I'm sorry I started a discussion of this document on the wrong thread
("New draft TAG Finding on The Self-Describing Web"). I'll reiterate
some of what's been said there, and make specific comments on this
document. Maybe others will jump ship from the other thread and free
it up for what it was meant for. Until then I hope you'll take a look
at what's been said there.

"The World Wide Web (WWW, or simply Web) is an information space in
which the items of interest, referred to as resources, are identified
by global identifiers called Uniform Resource Identifiers (URI)."
-- I know you copied this from AWWW, but it's misleading. The intent,
as reflected in AWWW 2.2, is only that resources *might be* so
identified, not that they *are*. In particular, a model of the
"information space" that has an uncountable number of resources would
be perfectly consistent, even though there are only countably many
URI's.
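
The cardinality point can be spelled out as follows. This is only a sketch, assuming URIs are finite strings over some finite alphabet; the symbols $\Sigma$ (the alphabet), $U$ (the set of URIs), $R$ (the set of resources in a model) and $d$ (the denotation map) are introduced here for illustration and do not come from AWWW.

```latex
% Assumption: every URI is a finite string over a finite alphabet \Sigma.
\[
  U \subseteq \Sigma^{*} = \bigcup_{n \ge 0} \Sigma^{n}
  \quad\Longrightarrow\quad
  |U| \le \aleph_0 ,
\]
% since a countable union of finite sets is countable.
% If a model has uncountably many resources, |R| > \aleph_0, then for any
% denotation map d : U \to R,
\[
  |d(U)| \le |U| \le \aleph_0 < |R| ,
\]
% so d is not surjective: some resources have no URI, and the model is
% still consistent with "resources might be identified by URIs".
```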

I think it's also confusing to mix up the web, the semantic web, and
their respective namespaces and domains of discourse. The semweb's
domain of discourse includes stars; the semweb is supposed to be part
of the web; you say the web's an information space; does that mean
stars are information? You could say that the web and semweb are
information spaces (spaces filled up with information), but their
"items of interest" (web pages, ontologies, messages,
"representations", RDF triples, etc) do not coincide with things
identified by URI's (stars, people, web pages, etc.); the two sets
overlap, but neither is contained in the other.

I know you need an introduction but you don't want to say anything
you'll have to retract later.

"Information resources" - the definition in terms of "essential
characteristics" is useless, since it is neither objective, accurate,
nor precise. We don't need a rigorous definition, just one that helps
us to distinguish IR's from non-IR's most of the time, and perhaps to
answer the question of when distinct URI's denote the same IR. Several
alternative definitions have been proposed, such as John Cowan's "a
resource that we are willing to identify with its representations" and
David Booth's "a networked source of representations". (See
http://wiki.neurocommons.org/InformationResource .) The requirement
for a definition should be admitted first; the actual definition
itself is less important.

"Representation" should similarly be defined, and representation and
information resource should be defined noncircularly. Pat Hayes points
out that the term is used in AWWW at variance with common parlance (a
photo can be a representation of a dog, but in AWWW only IR's have
representations?). If we must have a confusing technical definition,
so be it, as long as it's precise enough to be useful for something.

It would be nice if there were agreed-upon RDF types for IR's and
representations. Maybe foaf:Document is the same as IR, although I
understand there was some distinction made between the two.

"Information resources make up the vast majority of the Web today." --
this sentence doesn't make sense. What's something that helps make up
the Web that's not an information resource? Depending on how you
define the Web, I would think the answer should be either nothing (Web
= networked information resources?) or almost everything (Web = a vast
interconnection of people and services communicating using common
protocols).

About 303's - I think this is the place to correct the error in the
httpRange-14 finding that implies that a 200 response *determines*
that a resource is an IR.  This is ridiculous; it is only an
*assertion* that it is. We all know that Pat Hayes's URI doesn't
denote an information resource; he told us so himself, and according
to AWWW he has the authority to do so. His server is simply
misconfigured.
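
One way a client might encode the "assertion, not determination" reading is sketched below. It assumes the status-code cases of the httpRange-14 resolution (2xx, 303, 4xx); the names `Claim`, `interpret_status`, `is_information_resource` and the `owner_says_ir` parameter are hypothetical, and treating every other status as "unknown" is a simplification made here.

```python
from enum import Enum
from typing import Optional


class Claim(Enum):
    ASSERTED_IR = "server asserts the resource is an information resource"
    NO_CLAIM = "resource could be anything"
    UNKNOWN = "nothing learned about the resource"


def interpret_status(status: int) -> Claim:
    """Map the status of a GET on a URI to the claim the response supports."""
    if 200 <= status < 300:
        return Claim.ASSERTED_IR   # an assertion by the server, not a proof
    if status == 303:
        return Claim.NO_CLAIM      # 303 says nothing about the resource's nature
    return Claim.UNKNOWN           # 4xx and, by assumption here, everything else


def is_information_resource(status: int,
                            owner_says_ir: Optional[bool] = None) -> Optional[bool]:
    """Return True/False if there is a basis to decide, else None.

    Assumption: an explicit statement from the URI owner outranks whatever
    the server happens to return, so a misconfigured 200 can be overridden.
    """
    if owner_says_ir is not None:
        return owner_says_ir
    if interpret_status(status) is Claim.ASSERTED_IR:
        return True                # provisional: only as good as the server config
    return None
```

With this shape, a 200 from a server whose owner has said the URI denotes a person yields `False` rather than `True`, which is the behaviour argued for above.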

FYI I use 303 redirects for some resources that either are information
resources or might be. The ones that are IR's are things that I simply
haven't gotten around to implementing (such as records extracted from
databases), or things that might or might not be IR's depending on
whose definition you accept. The 303 redirects to a document that
documents the situation and gives you enough information to go and
find the thing yourself.
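
A server-side arrangement of this kind could look roughly like the following. It is a minimal sketch, not the actual Science Commons setup: the `DESCRIPTIONS` table, the example.org URLs and the `respond` function are all hypothetical, and the function only computes a status line and headers rather than running a real server.

```python
from typing import Dict, List, Tuple

# Hypothetical table: path of the named resource -> document describing it
# (what it is, and how to go and find it yourself).
DESCRIPTIONS: Dict[str, str] = {
    "/record/42": "http://example.org/doc/record-42",
}


def respond(path: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Return (status line, headers) for a GET on `path`."""
    target = DESCRIPTIONS.get(path)
    if target is None:
        return "404 Not Found", []
    # 303 makes no claim about whether the named resource is an IR;
    # it only points at a document about it.
    return "303 See Other", [("Location", target)]
```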

Also note that for program (as opposed to human) use, a 303 is pretty
much useless if you don't know anything about the relationship between
the named resource and the referenced resource, or about the type of
the referenced resource. Unless this is fixed, semantic web
applications will have to come up with their own techniques for
exploiting 303's - assumptions based on the host name or form of the
URI, or using an external database recording such relationships, or
other prior knowledge of the resource and/or reference. This is quite
un-webby, but so it goes - semweb is still a second class citizen.
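
The "external database" workaround might be sketched as below, keyed on host name. Everything in it is an assumption for illustration: the `RELATION_BY_HOST` table, the host `purl.example.org`, the relation name `isDescribedBy` and the function `interpret_303` are invented here, and no existing vocabulary or service is implied.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical prior knowledge: for URIs on these hosts, a 303 target is
# known to stand in this relationship to the named resource.
RELATION_BY_HOST = {
    "purl.example.org": "isDescribedBy",
}


def interpret_303(named_uri: str, location: str) -> Optional[Tuple[str, str, str]]:
    """Turn a 303 redirect into a (subject, relation, object) statement.

    Returns None when nothing is known about the host, in which case the
    redirect tells a program nothing it can act on.
    """
    host = urlsplit(named_uri).hostname
    relation = RELATION_BY_HOST.get(host)
    if relation is None:
        return None
    return (named_uri, relation, location)
```

The table lives outside the HTTP exchange, so the client learns nothing from the response itself; that is the un-webby part.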

Jonathan Rees
Science Commons

On 5/24/07, Williams, Stuart (HP Labs, Bristol) <skw@hp.com> wrote:
>
> A new draft TAG finding, "Dereferencing HTTP URIs" is available for
> review at:
>
>
> http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14
>
> The intention has been to develop a TAG finding based around the TAG's
> resolution[1,2] of httpRange-14[3].
>
> Please send comments to www-tag@w3.org.
>
> Regards,
>
> Stuart Williams
> co-Chair W3C TAG
>
> [1] http://www.w3.org/2001/tag/2005/06/14-16-minutes#item023
> [2] http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html
> [3] http://www.w3.org/2001/tag/issues#httpRange-14
> --
> Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks
> RG12 1HN
> Registered No: 690597 England
>
>
>
Received on Thursday, 31 May 2007 15:22:29 UTC