Re: Terminology Question concerning Web Architecture and Linked Data from John Black on 2007-07-27 (semantic-web@w3.org from July 2007)

From: John Black <JohnBlack@kashori.com>
Date: Fri, 27 Jul 2007 00:03:13 -0400
To: "Sandro Hawke" <sandro@w3.org>
Cc: "'Linking Open Data'" <linking-open-data@simile.mit.edu>, "SW-forum" <semantic-web@w3.org>, <www-tag@w3.org>
Message-ID: <06ad01c7d003$0eb4fcd0$6601a8c0@KASHORI001>
Sandro Hawke wrote:
> "John Black" <JohnBlack@kashori.com> writes:
>> Sandro Hawke wrote:
>> >
>> > "Hans Teijgeler" <hans.teijgeler@quicknet.nl> writes:
>> >> To me the distinction between information and non-information
>> >> resources is non-existing, because what you call a non-information
>> >> resource actually contains information as well
>> >
>> > But it doesn't contain *only* information.  Information Resources are
>> > things which can be entirely and completely encoded as bits and then
>> > transmitted over a network.  They can be copied, perfectly.  They can
>> > be serialized.  They are pure information.  (Another name I suggested for
>> > this class was "Digital Artifact", but the TAG went with "Information
>> > Resource" instead.)
>> >
>> > That, it seems to me, is a fairly crisp and useful class to define when
>> > talking about computer systems like the Web.
>> >
>> >    -- Sandro
>>
>> This is what I used to think. But, apparently, this is not the case. What
>> you are talking about are representations of the resource. Each
>> representation of the resource is "pure information" that can be
>> perfectly copied. But remember that the representation is not the
>> resource. The resource is the *source* of the representations. So while
>> any one representation may be copied perfectly, or even any stream of
>> representations over some time interval may be copied perfectly, yet
>> these are not the resource being represented. The resource, that which
>> is the source of the representations, cannot be copied. How could it?
>> You would have to know all future representations that resource will
>> produce, and in what sequence.
>
> True.  I was oversimplifying and forgetting some details of this
> debate.  Specifically, the detail I was ignoring was mutability, or
> change-over-time and change-with-observer.

And that makes it far less crisp, I'm afraid.

> A better model is that an Information Resource is a location which can
> hold a digital artifact.  Which one it holds may change from one moment
> to the next.  This is the same as a computer file, or even a physical
> file (if we only consider the information placed into the file folder,
> not the physical artifacts used to hold that information).
>
> Of course, the "location" is handled by a computer, so not only can it
> change the information stored there continually, it can change it based on
> who is asking -- based on their credentials, their IP address, their
> cookies, etc.
>
> An information resource, then, is like an area of a wall, where there
> might be some writing, or some pictures, etc.  In many cases it's
> painted, and probably won't ever change.  The wall may eventually go
> away.  In some cases, it's an animated display, and changes
> moment-to-moment.  In other cases, it uses sophisticated magic to look
> different to each person looking at it.  (Of course it's also touch
> sensitive, or something, allowing each person to give it input, too.
> The physical intuition breaks down here.  Maybe it's a kiosk which
> magically grows a new screen for each person looking at it.)

But in any case, magic notwithstanding, these "areas" or "locations", be 
they walls or computer folders, cannot "...be entirely and completely 
encoded as bits and then transmitted over a network..." nor can they be 
"...copied, perfectly." nor can they be serialized. And they are certainly 
not "pure information". As a matter of fact, their essence surely includes 
the intentionality behind the ever-changing content emitted. After all, to 
use your example of a wall, writing or pictures don't just appear out of 
nowhere, they are put there, probably deliberately, for a purpose. Or as Roy 
Fielding said the other day, "...what matters is what the link author 
intends by the reference, not what the link resolves to at any given moment 
in time."[1] These are things that cannot be transmitted over a computer 
network any more than cars, people, animals, or abstract ideas. And if 
"information resources" are thus just like "non-information resources" in 
that they cannot be transmitted, then the whole distinction vanishes.
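
The distinction argued over above, between a representation and the resource that produces it, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Request`, `Resource`, and `wall` are hypothetical and come from neither this thread nor any specification. It treats the resource as a rule that answers each request, and a representation as the bytes returned at one moment to one requester.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical names throughout; nothing here is taken from a spec.
Representation = bytes


@dataclass(frozen=True)
class Request:
    time: int        # abstract clock tick
    requester: str   # stands in for credentials, IP address, cookies


@dataclass
class Resource:
    # The "location" or "area of a wall": a rule that yields a
    # representation for each request, not a fixed sequence of bits.
    respond: Callable[[Request], Representation]


def wall(req: Request) -> Representation:
    # What is shown depends on when it is asked for and who is asking.
    return f"hello {req.requester} at t={req.time}".encode()


page = Resource(respond=wall)

first = page.respond(Request(time=1, requester="alice"))
copy_of_first = bytes(first)   # one representation copies perfectly
later = page.respond(Request(time=2, requester="bob"))
```

Here `copy_of_first` is byte-for-byte equal to `first`, yet holding it says nothing about what `page` will return to the next request; copying the bytes does not copy the rule, let alone the intent behind it.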

> A non-information resource is anything that's not like a sign on a
> wall.   It's people, animals, toys, food, bricks, abstract ideas (the
> number one, the idea of revolution), etc.   A book you hold in your hand
> is not an information resource -- that same text could be printed on one
> of these walls (aka a web page) and then it (the text) would be an IR.

Sorry, but I still find this incomprehensible, that the text of a book is 
not an information resource, but if that same text is encoded in a computer 
file on a web server, it now somehow becomes an information resource. Or 
what I found just as surprising was when Tim said, "...a literal string is 
not an information resource." [2] about a text string in a static web page 
served by a web server, here http://kashori.com/ontology/MyURI. Honestly, 
I'm not debating here, I just don't get it.

> So, in this metaphor, a URI is something you hand a guide, and the guide
> will show you the relevant spot on a wall.   If you give a URI for a
> non-IR, then the best the guide can do is show you a spot on the wall
> which talks about that non-IR.  (That is, it can do a 303.)

When used by an agent in the context of the semantic web, that URI is a 
name, used to refer to a resource. Either I know what that agent is 
referring to by that URI or I don't. If I know what the agent denotes by 
that name (URI), then I don't need to be taken to the wall at all. Or if I 
don't know what resource the agent refers to by that URI, then it won't help 
to be put in front of a stream of representations, because the 
representations are not the resource, and unless you know the nature of the 
resource, you can't know whether the received representations reveal that 
nature or not. And this is true both in the case of information or 
non-information resources, because the essence of neither can be transmitted 
over a network, as I argue above. In either case, information or 
non-information, what I really want to hear is a definite description of the 
resource and to be told that other agents do associate that description with 
the name (URI) used.
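
For readers unfamiliar with the "303" in the guide metaphor, here is a minimal sketch of that behaviour, assuming a toy lookup instead of a real web server. The paths `/id/alice` and `/doc/alice` and the function `guide` are hypothetical, chosen only for illustration. A URI for a page is answered with 200 and a representation; a URI for a non-information resource is answered with 303 See Other, pointing at a page that talks about it.

```python
from typing import Dict, Tuple

# Hypothetical URI paths, chosen only for illustration.
# Pages ("spots on the wall") that can be shown directly.
DOCUMENTS: Dict[str, bytes] = {
    "/doc/alice": b"A page that talks about Alice.",
}

# Names for things that are not pages, mapped to a page describing them.
DESCRIBED_BY: Dict[str, str] = {
    "/id/alice": "/doc/alice",
}


def guide(path: str) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET on `path`."""
    if path in DOCUMENTS:
        # The URI names a page: show the spot on the wall itself.
        return 200, {"Content-Type": "text/plain"}, DOCUMENTS[path]
    if path in DESCRIBED_BY:
        # The URI names a non-page: redirect to a page about it.
        return 303, {"Location": DESCRIBED_BY[path]}, b""
    return 404, {}, b""
```

Note that the sketch only shows where the guide leads; whether the page reached through the 303 amounts to the definite description asked for above is exactly the open question.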

By the way, in the current scheme, where am I supposed to go for a good 
description of, rather than the direct experience of, an information 
resource?

> Along this more sophisticated model, one of my preferred terms (instead
> of Information Resource) was "Response Point".     But this is all
> pretty darn fuzzy, and a hard subject on which to reach consensus.
>
>   *        *         *
>
> Really, I think we should probably just call them "web pages".   (I know
> some people have some ideas about Information Resources which are not
> Web Pages.  I'm not convinced.)
>
> So:
>
>        Information Resource == Web Page.
>        Non-Information-Resource == Anything that's not a Web Page.
>
> (And while we're at it, call them "Web Addresses" not "URIs".)
>
> So, one of the funky Semantic Web ideas is to give Web Addresses (or
> Pseudo-Web-Addresses) to things which are *not* Web Pages.  Huh?  This
> sounds a little weird, especially if you try to call them real Web
> Addresses, but via some tricks it kind of works.  It lets you talk about
> things in a way where the listener can find out more information if they
> want it.
>
> Humans are getting used to this with Google.  If I hear a term I don't
> understand, I can often Google it faster than I can ask the speaker to
> explain it.  Especially if it's in a written document.  (Of course,
> Google just makes it faster and easier -- it's always been possible to
> do research.)  Using URIs (pseudo-web-addresses) instead of search terms
> has some advantages and some disadvantages; I think it's a good plan,
> myself.
>
>    -- Sandro
>
1. http://lists.w3.org/Archives/Public/www-tag/2007Jul/0112.html
2. http://lists.w3.org/Archives/Public/semantic-web/2007Jun/0265.html

John
www.kashori.com
Received on Friday, 27 July 2007 04:03:37 UTC