Re: Terminology Question concerning Web Architecture and Linked Data from Sandro Hawke on 2007-07-26 (www-tag@w3.org from July 2007)

From: Sandro Hawke <sandro@w3.org>
Date: Thu, 26 Jul 2007 11:39:19 -0400
To: "John Black" <JohnBlack@kashori.com>
Cc: "'Linking Open Data'" <linking-open-data@simile.mit.edu>, "SW-forum" <semantic-web@w3.org>, www-tag@w3.org
Message-ID: <28419.1185464359@ubuhebe>
"John Black" <JohnBlack@kashori.com> writes:
> Sandro Hawke wrote:
> >
> > "Hans Teijgeler" <hans.teijgeler@quicknet.nl> writes:
> >> To me the distinction between information and non-information resources 
> >> is
> >> non-existing, because what you call a non-information resource actually
> >> contains information as well
> >
> > But it doesn't contain *only* information.  Information Resources are
> > things which can be entirely and completely encoded as bits and then
> > transmitted over a network.  They can be copied, perfectly.  They can be
> > serialized.  They are pure information.  (Another name I suggested for
> > this class was "Digital Artifact", but the TAG went with "Information
> > Resource" instead.)
> >
> > That, it seems to me, is a fairly crisp and useful class to define when
> > talking about computer systems like the Web.
> >
> >    -- Sandro
> 
> This is what I used to think. But, apparently, this is not the case. What 
> you are talking about are representations of the resource. Each 
> representation of the resource is "pure information" that can be perfectly 
> copied. But remember that the representation is not the resource. The 
> resource is the *source* of the representations. So while any one 
> representation may be copied perfectly, or even any stream of 
> representations over some time interval may be copied perfectly, yet these 
> are not the resource being represented. The resource, that which is the 
> source of the representations, cannot be copied. How could it? You would 
> have to know all future representations that resource will produce, and in 
> what sequence.

True.  I was oversimplifying and forgetting some details of this
debate.  Specifically, the detail I was ignoring was mutability, or
change-over-time and change-with-observer.

A better model is that an Information Resource is a location which can
hold a digital artifact.  Which one it holds may change from one moment
to the next.  This is the same as a computer file, or even a physical
file (if we only consider the information placed into the file folder,
not the physical artifacts used to hold that information).

Of course, the "location" is handled by a computer, so not only can it
change the information stored there continually, it can change it based on
who is asking -- based on their credentials, their IP address, their
cookies, etc.
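
As a rough sketch of that "location" model (the Location and Requester
names, and the choice to key only on credentials, are assumptions made
for illustration, not anything from the TAG's documents):

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Requester:
    """Who is asking (hypothetical fields for this sketch)."""
    ip_address: str
    credentials: Optional[str] = None


@dataclass
class Location:
    """A slot that holds one digital artifact (a byte string) at a time."""
    default: bytes
    per_credentials: Dict[str, bytes] = field(default_factory=dict)

    def store(self, artifact: bytes, credentials: Optional[str] = None) -> None:
        # Replace what the location holds, for everyone or for one requester.
        if credentials is None:
            self.default = artifact
        else:
            self.per_credentials[credentials] = artifact

    def fetch(self, requester: Requester) -> bytes:
        # The artifact handed back may depend on who is asking.
        if requester.credentials in self.per_credentials:
            return self.per_credentials[requester.credentials]
        return self.default


page = Location(default=b"Hello, world")
anonymous = Requester(ip_address="192.0.2.1")
alice = Requester(ip_address="192.0.2.2", credentials="alice")

first = page.fetch(anonymous)           # what the location holds now
page.store(b"Hello again")              # change over time
page.store(b"Hello, Alice", "alice")    # change with observer
```

Each value returned by fetch() can be copied perfectly; the Location
itself is just the place that keeps handing them out.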

An information resource, then, is like an area of a wall, where there
might be some writing, or some pictures, etc.  In many cases it's
painted, and probably won't ever change.  The wall may eventually go
away.  In some cases, it's an animated display, and changes
moment-to-moment.  In other cases, it uses sophisticated magic to look
different to each person looking at it.  (Of course it's also touch
sensitive, or something, allowing each person to give it input, too.
The physical intuition breaks down here.  Maybe it's a kiosk which
magically grows a new screen for each person looking at it.)

A non-information resource is anything that's not like a sign on a
wall.   It's people, animals, toys, food, bricks, abstract ideas (the
number one, the idea of revolution), etc.   A book you hold in your hand
is not an information resource -- that same text could be printed on one
of these walls (aka a web page) and then it (the text) would be an IR.

So, in this metaphor, a URI is something you hand a guide, and the guide
will show you the relevant spot on a wall.   If you give a URI for a
non-IR, then the best the guide can do is show you a spot on the wall
which talks about that non-IR.  (That is, it can do a 303.)
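
A minimal sketch of the guide, assuming two lookup tables and
example.org URIs that are made up for illustration (a real server
would do this in its HTTP layer):

```python
from typing import Dict, Optional, Tuple

# Spots on the wall: URIs that name web pages directly (hypothetical data).
WEB_PAGES: Dict[str, str] = {
    "http://example.org/doc/alice": "A page of text about Alice.",
}

# URIs for things that are not web pages, mapped to a page about them.
DESCRIBED_BY: Dict[str, str] = {
    "http://example.org/id/alice": "http://example.org/doc/alice",
}


def guide(uri: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, page content or redirect target)."""
    if uri in WEB_PAGES:
        # The URI names a spot on the wall: show it.
        return 200, WEB_PAGES[uri]
    if uri in DESCRIBED_BY:
        # The URI names a non-IR: point at a spot that talks about it.
        return 303, DESCRIBED_BY[uri]
    return 404, None


status, target = guide("http://example.org/id/alice")
```

Here the guide answers 303 with the address of the page about Alice,
and following that address gets a plain 200.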

Along this more sophisticated model, one of my preferred terms (instead
of Information Resource) was "Response Point".     But this is all
pretty darn fuzzy, and a hard subject on which to reach consensus.

   *        *         *

Really, I think we should probably just call them "web pages".   (I know
some people have some ideas about Information Resources which are not
Web Pages.  I'm not convinced.)

So:

        Information Resource == Web Page.   
        Non-Information-Resource == Anything that's not a Web Page.

(And while we're at it, call them "Web Addresses" not "URIs".)

So, one of the funky Semantic Web ideas is to give Web Addresses (or
Pseudo-Web-Addresses) to things which are *not* Web Pages.  Huh?  This
sounds a little weird, especially if you try to call them real Web
Addresses, but via some tricks it kind of works.  It lets you talk about
things in a way where the listener can find out more information if they
want it.

Humans are getting used to this with Google.  If I hear a term I don't
understand, I can often Google it faster than I can ask the speaker to
explain it.  Especially if it's in a written document.  (Of course,
Google just makes it faster and easier -- it's always been possible to
do research.)  Using URIs (pseudo-web-addresses) instead of search terms
has some advantages and some disadvantages; I think it's a good plan,
myself. 

    -- Sandro
Received on Thursday, 26 July 2007 15:40:47 UTC