Re: What is the Web?

From: Johan Hjelm (hjelm@w3.org)
Date: Fri, Mar 05 1999


Message-Id: <4.1.19990305172042.00c08630@127.0.0.1>
Date: Fri, 05 Mar 1999 17:34:17 +0100
To: "Lavoie,Brian" <lavoie@oclc.org>
From: Johan Hjelm <hjelm@w3.org>
Cc: "'www-wca@w3.org'" <www-wca@w3.org>
Subject: Re: What is the Web?

Actually, one obvious middle ground presents itself: the union of the two. However,
there are problems whatever we say. What about FTP access to HTML pages,
and what about XML? We could talk about all interlinked pages, but then we
run into problems there, too. Is it the web if you have it on your own
machine, and just use the file system to walk around in a hypertext space?
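
To make the edge cases concrete, here is a rough sketch of the two approaches
and their union as membership tests. It is only an illustration under stated
assumptions: the function names, the example.org URLs and the hand-made set of
link targets are all hypothetical, and treating "https" as part of the HTTP
approach is my own assumption rather than anything agreed on the list.

```python
from urllib.parse import urlsplit


def in_http_web(url):
    """HTTP approach: the resource is served over the HTTP protocol.

    Assumption: "https" is counted together with "http".
    """
    return urlsplit(url).scheme in ("http", "https")


def in_html_web(url, linked_urls):
    """HTML approach: the resource is the target of some hyperlink.

    linked_urls is a hypothetical, precomputed set of link targets.
    """
    return url in linked_urls


def in_union_web(url, linked_urls):
    """Union of the two approaches."""
    return in_http_web(url) or in_html_web(url, linked_urls)


def classify(urls, linked_urls):
    """Map each URL to a tuple (http_web, html_web, union_web)."""
    return {
        url: (
            in_http_web(url),
            in_html_web(url, linked_urls),
            in_union_web(url, linked_urls),
        )
        for url in urls
    }


# Hypothetical sample data: URLs that some page links to.
linked_urls = {
    "http://example.org/index.html",
    "ftp://example.org/pub/page.html",
}

candidates = [
    "http://example.org/index.html",    # served over HTTP and linked
    "http://example.org/unlinked.cgi",  # served over HTTP, nothing links to it
    "ftp://example.org/pub/page.html",  # HTML page fetched over FTP, but linked
    "file:///home/user/notes.html",     # local hypertext, not linked
]

results = classify(candidates, linked_urls)
for url, flags in results.items():
    print(url, flags)
```

In this toy data the FTP page is in the union only because something links to
it, and the local file falls outside all three definitions until some page
links to it, which is exactly the kind of boundary problem I mean.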

I tried to define the web in one of the earlier drafts of the paper on
automatic recharacterisation, and came up with something like: 

The World Wide Web architecture is described in a set of specifications
outlining the markup language, the operations in the client on this markup
language, a request-response protocol for the communication between client,
proxy and server, and the logging of transactions.

That is of course not good enough either, which is why I took it out. 

More pragmatically, our characterisation is based on log files, so we might
want to stay with HTTP servers. 

Johan

At 11:16 1999-03-05 -0500, Lavoie,Brian wrote:
>WCA members,
>
>In working on the latest draft of the WCA terminology sheet, I noticed that
>there is one glaring omission from the list: a definition of the Web.
>
>Defining the Web is extremely important, because the Web definition has
>implications for other terms that we use, and also for various
>characterization metrics. For example, what is the appropriate definition
>of the size of the Web? Number of HTTP servers? Number of Web sites?
>Terabytes of information accessed by Web clients?
>
>We need to know exactly what we're talking about when we refer to the Web:
>what we are including, and equally important, what we are excluding.
>
>The way I see it, there are two potential approaches to defining the Web:
>
>The HTTP approach: the Web is the universe of information that can be
>accessed via the HTTP protocol. This is a server-centric interpretation.
>
>The HTML approach: the Web is the universe of information that can be
>accessed via hyperlinks (defined by the HTML standard). This is a
>client-centric interpretation.
>
>Both approaches have advantages and disadvantages. The HTTP approach, I think,
>provides a clearer delineation of Web-accessible information, but clearly
>excludes a lot of information that Web clients can access. The HTML approach
>captures this additional information, but runs the risk of expanding the Web
>to include virtually the entire Internet. Is there a middle ground?
>
>I'd like to solicit as many opinions as possible on this issue before the
>release of the next terminology draft.
>
>Thanks,
>
>Brian Lavoie
>OCLC

************************************************************
                     Johan HJELM
       Ericsson Research, User Applications Group 
         Currently visiting engineer at the W3C
             The World Wide Web Consortium
                     hjelm@w3.org
   http://www.w3.org/People/W3Cpeople.html#Hjelm
    Fax +1-617-258 5999, Phone +1-617-263-9630
   MIT/LCS, 545 Tech. Sq. Cambridge MA 02139 USA 
        opinions are personal, always my own, 
  and not necessarily those of Ericsson or the W3C. 
============================================================