- From: Pat Hayes <phayes@ihmc.us>
- Date: Fri, 7 May 2004 18:30:43 -0500
- To: Dan Connolly <connolly@w3.org>
- Cc: public-webarch-comments@w3.org, w3c-rdfcore-wg@w3.org
- Message-Id: <p06001f14bcc009f7e3b8@[10.0.100.76]>
>On Wed, 2004-05-05 at 15:00, Pat Hayes wrote:
>> > On Wed, 2004-03-17 at 16:38, Pat Hayes wrote:
>[...]
>> > > Which could be paraphrased as "A resource can be anything, and
>> > > everything is a resource".
>> >
>> > yes, quite.
>>
>>
>> Well, then, it is hard to resist asking the question, why did y'all
>> feel obliged to (mis)-use a word when there already were perfectly
>> good words you could have used, such as "entity" or even the plainer
>> "thing" ?
>
>Fair question...
>
>The term 'resource' was chosen as the universal set in web developer
>discussions so old that I can't even find the archives.
>The usage goes back to at least 1994, with
>discussion of URLs, URIs, and even URCs (a precursor to RDF).
> http://lists.w3.org/Archives/Public/uri/1994Dec/
>
>The XLink spec followed suit, to some extent:
>
>[[
>The notion of resources is universal to the World Wide Web. [Definition:
>As discussed in [IETF RFC 2396], a resource is any addressable unit of
>information or service.] Examples include files, images, documents,
>programs, and query results.
>]]
> -- http://www.w3.org/TR/xlink/#N789
>
>I suppose that just re-raises the question of whether 'resource'
>is the universal set or something smaller.
Seems pretty clear to me that it is much smaller, by that term
'addressable unit'. Most things in heaven and earth are not
addressable units of anything.
That definition makes sense. It is clearly the C sense in my C/D
contrast, and it fits with the now venerable discussions of
hyperlinking in Engelbart's stuff, cited in the architecture
document. So why not just stick with that? Then it all makes perfect
sense as an architecture document. All you need to do is elide or
modify the few places where the text goes off on some kind of riff
about resources being anything and everything being a resource. None
of these have any bearing on the technical content of the document in
any case: everybody knows you can't send a book or a galaxy (or the
weather in Oaxaca) over a network, in fact, so who are y'all trying
to kid?
>I suppose we could choose 'thing' as the universal set, but that
>would be a pretty major educational undertaking in at least some
>communities. Maybe those communities are smaller than the ones
>that use 'resource' as the universal set.
Well, re-education is better than confusion. I don't think, frankly,
that "resource" was EVER supposed to mean the universal set. I'm not
sure now where the historical roots of that idea come from, but they
can't be found in the older material anywhere Ive looked. For
example, with the understanding of "resource" outlined in the Xlink
spec and the 1994 discussions you cite, which notably seem to think
of the Web as a kind of world-wide *library* , that choice of word
makes perfect sense: the things in a library, even an extended
world-wide hyperlinked electronic library, can indeed be reasonably
thought of as resources, things that are accessible for use, things
that provide functionality. The mistake arises when the library is
identified with the entire universe.
But OK, if you really want "resource" to mean the universal set, now,
then please say so very clearly and explicitly, but then read over
the document with that meaning in mind very critically, because with
that reading, much of what it says about resources really does not
make sense. You can't send weather over a network; grains of sand
don't have (or need) URIs, most resources have no owners, etc..
[later]
Maybe I have found the source.
A bit more reading found this,
http://ftp.ics.uci.edu/pub/ietf/uri/rfc1737.txt written by Sollins &
Masinter in Dec 1994, defining the URN idea and possibly one source
of this confusion that I am trying to root out. It has some nice
prose in it, let me number the sections for later comment:
1.
The requirements for Uniform Resource Names fit within the overall
architecture of Uniform Resource Identification. In order to build
applications in the most general case, the user must be able to
discover and identify the information, objects, or what we will call
in this architecture resources, on which the application is to
operate. Beyond this statement, the URI architecture does not define
"resource."
2.
A URN identifies a resource or
unit of information. It may identify, for example, intellectual
content, a particular presentation of intellectual content, or
whatever a name assignment authority determines is a distinctly
namable entity. A URL identifies the location or a container for an
instance of a resource identified by a URN. The resource identified
by a URN may reside in one or more locations at any given time, may
move, or may not be available at all. Of course, not all resources
will move during their lifetimes, and not all resources, although
identifiable and identified by a URN will be instantiated at any
given time. As such a URL is identifying a place where a resource
may reside, or a container, as distinct from the resource itself
identified by the URN.
3.
Requirements for functional capabilities
These are the requirements for URNs' functional capabilities:
o Global scope: A URN is a name with global scope which does not
imply a location. It has the same meaning everywhere.
o Global uniqueness: The same URN will never be assigned to two
different resources.
o Persistence: It is intended that the lifetime of a URN be
permanent. That is, the URN will be globally unique forever, and
may well be used as a reference to a resource well beyond the
lifetime of the resource it identifies or of any naming authority
involved in the assignment of its name.
o Scalability: URNs can be assigned to any resource that might
conceivably be available on the network, for hundreds of years.
....
4.
One of the ways in which naming authorities, the assigners of names,
may choose to make themselves distinctive is by the algorithms by
which they distinguish or do not distinguish resources from each
other. For example, a publisher may choose to distinguish among
multiple printings of a book, in which minor spelling and
typographical mistakes have been made, but a library may prefer not
to make that distinction. Furthermore, no one algorithm for testing
for equality is likely to applicable to all sorts of information.
For example, an algorithm based on testing the equality of two books
is unlikely to be useful when testing the equality of two
spreadsheets. Thus, although this document requires that any
particular naming authority use one algorithm for determining whether
two resources it is comparing are the same or different, each naming
authority can use a different such algorithm and a naming authority
may restrict the set of resources it chooses to identify in any way
at all.
Now, let me chew on this for a while. I claim that (1) makes sense
only under the C reading, since it refers to resources as things on
which *applications* can *operate*. You cannot operate on Santa
Clause or unborn children or remote galaxies or sodium atoms, or
indeed on the weather in Oaxaca, unless maybe you have a fleet of
cloud-seeding aircraft. So "object" here must be understood as
referring to a computational sense of "object", rather than entities
in general: things that (have an identity, but) can be stored in RAM,
sent over network connections, manipulated by software, tested for
equality under various criteria defined by methods inherited from
their defining class, etc.. These may well be objects in the sense of
"object" used in OOP, for example, but this is still a very small
(and highly special) subset of the universe of possible objects in
the broader, non-computational sense.
(That last sentence, BTW, seems to be an early example of the almost
obsessive stonewalling that the TAG group insists on doing when asked
to define its terminology. Do y'all think that there is some actual
intellectual merit in refusing to say what you mean in this way?
Maybe it makes it all feel more like pure mathematics, or something?)
Let me use 'computational object' , or CO, for what this paragraph
seems to be talking about, and 'object' more generally for, well,
anything in or under heaven.
The wording of (2) starts to get weird. We could take it either way.
It would be fine for URNs to be able to name anything, not just COs,
of course: sure, names can name anything. But for most nameable
things, the idea of their having an address, aka container, is
strange, to put it mildly: in many cases it isn't even coherent. (I
take it that this means 'address' as in location in memory or file
system, not like 'street address': but its a damned odd thing to say
in either sense about most things in the universe.) So this focus on
things changing their address or being moved from place to place
seems to be an implicit restriction of the topic to COs. And then
there is that curious remark about 'instances' of resources. What can
that possibly mean? Many of things we talk about are already concrete
particulars, which have no instances: what would be an 'instance' of,
say, you or me, or of the great nebula in Orion? But for COs it does
have a way of being interpreted in many cases, since COs are often
thought of as instances of a computational definition or
specification (the basic intuition underlying the OOP model in many
cases, where identity is determined by the class which provides the
'methods' of manipulation.) So again, if this means anything then it
seems to mean something in the context of COs rather than objects
more generally.
Turning to (3), this could at first be read in either sense, though
read in the wider, D, sense it seems almost insanely ambitious (there
shall be only One true Name for any object for Ever and Ever...
sounds like something from the Wizard of Earthsea) until you get to
the fourth condition, which refers to resources being "available on
the network". I can make sense of this only if "resource" is
understood in the computational C sense: things like galaxies are
not, in my experience, usually available on networks. Certainly the
things that my OWL quantifiers range over are not all on a network,
in any sense of 'on' and 'network' that I can understand: for
example, they may never (ever) have a identifier. But I don't think
that this is what the authors intended.
Section (4), from later in the document, seems to me to be muddled.
Is it talking about criteria for identity or about *algorithms* for
determining identity? The text uses the latter terminology but the
examples only make sense in the former. In the usual sense of
equality used outside computer science, the idea of an algorithm for
testing for equality is incoherent: equality simply means being the
same thing. (What kind of algorithm could test me for equality with
myself? What would need to be tested? Here I am, and there is one of
me: what else needs to be said or done?) This whole discussion makes
sense only if one thinks of the actual world as being a kind of
algebraic object, like a giant datastructure, whose pieces look
different depending on which category you think of them as belonging
to. This is a fatal muddling of reality with description, a confusing
of the map with the territory, of the model with the modelled.
I think I see what the authors are getting at: the concept of 'book'
can be interpreted as meaning an abstraction of the actual world at
different levels: physical objects on a shelf, print runs, editions,
manuscripts, etc., all might be called "a book" and might have
systematic identifiers assigned to them by various naming
authorities: yes, indeed. But this only translates into criteria for
*identity* if one thinks of these abstractions as *being* the
reality: and that is a fatal slip, like identifying mathematical
abstractions with the reality that they might model, or the Platonic
world of ideals with the actual world of concrete referents, or (as I
think in this case) the computational descriptions of something with
the things described. And once that mistaken identification is made,
it seems trivial to identify reference with linking, and naming with
access. The whole world now seems to be made of the stuff that is
sent over network links and processed by applications running on
computers.
>As to 'entity'... the web specs (MIME, HTTP) use "entity" to mean
>a generalization of the RFC822 header/body structure.
So they do. Sigh. OK, so just say 'thing'. Plain language is
usually best in any case.
>"entity
> The information transferred as the payload of a request or
> response."
> -- http://www.w3.org/Protocols/rfc2616/rfc2616-sec1.html#sec1
>
>Hmm... that spec seems to use 'resource' for something smaller
>than the universal set...
>
>"resource
> A network data object or service that can be identified by a
> URI, as defined in section 3.2."
You omitted the next sentence, which is even more telling:
"Resources may be available in multiple representations (e.g.
multiple languages, data formats, size, and resolutions) or vary in
other ways."
>then in 3.2:
>
>"As far as HTTP is concerned, Uniform Resource Identifiers are simply
>formatted strings which identify--via name, location, or any other
>characteristic--a resource."
>
This is the familiar circular recursion that one finds in so many of
the W3C glossaries. Foodles are whatever are identified by bazzies,
and bazzies identify foodles. Thanks, that is SO helpful :-)
Pat
>[... much elided ...]
>
>--
>Dan Connolly, W3C http://www.w3.org/People/Connolly/
>see you at the WWW2004 in NY 17-22 May?
--
---------------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973 home
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32501 (850)291 0667 cell
phayes@ihmc.us http://www.ihmc.us/users/phayes
Received on Friday, 7 May 2004 19:30:54 UTC