Re: What is the Web?
From: Jim Pitkow (pitkow@parc.xerox.com)
Date: Fri, Mar 05 1999
Message-Id: <4.1.19990305103021.0383e1e0@mailback.parc.xerox.com>
Date: Fri, 5 Mar 1999 10:39:02 PST
To: Johan Hjelm <hjelm@w3.org>, "Lavoie,Brian" <lavoie@oclc.org>
From: Jim Pitkow <pitkow@parc.xerox.com>
Cc: "'www-wca@w3.org'" <www-wca@w3.org>
Subject: Re: What is the Web?
It seems to me that for the purposes of WCA characterization/performance,
we're primarily concerned with all things HTTP. That is to say,
characterizing XML traffic over a Java RMI client connection seems a bit
off topic for us. So, while a definition of the Web may say all things
HTTP &/or HTML/XML &/or URIs, for our purposes, we currently restrict our
scope to the systems that contain all three elements. Too restrictive?
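
To make the difference between the broad definition and the narrower
working scope concrete, here is a minimal sketch. The Resource record, its
field names, the list of markup content types, and the example.org URIs are
all assumptions made up for illustration, not anything taken from the WCA
terminology sheet.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Resource:
    """Hypothetical description of one retrievable item (illustrative only)."""
    uri: str           # how the item is addressed
    protocol: str      # transport actually used, e.g. "http", "ftp", "rmi"
    content_type: str  # e.g. "text/html", "text/xml"


# Assumed set of content types that count as "HTML/XML" for this sketch.
MARKUP_TYPES = {"text/html", "text/xml", "application/xml"}


def uses_http(r: Resource) -> bool:
    return r.protocol.lower() == "http"


def is_markup(r: Resource) -> bool:
    return r.content_type.lower() in MARKUP_TYPES


def has_uri(r: Resource) -> bool:
    # Treat anything with an explicit scheme as addressed by a URI.
    return bool(urlparse(r.uri).scheme)


def in_broad_web(r: Resource) -> bool:
    """Broad reading: HTTP and/or HTML/XML and/or URIs."""
    return uses_http(r) or is_markup(r) or has_uri(r)


def in_wca_scope(r: Resource) -> bool:
    """Narrow working scope: all three elements must be present."""
    return uses_http(r) and is_markup(r) and has_uri(r)


examples = [
    Resource("http://example.org/index.html", "http", "text/html"),
    Resource("ftp://example.org/index.html", "ftp", "text/html"),
    Resource("rmi://example.org/feed", "rmi", "text/xml"),
]

for r in examples:
    print(r.uri, "broad:", in_broad_web(r), "scope:", in_wca_scope(r))
```

Under these assumptions, only the first example falls inside the narrow
scope, while the FTP and RMI cases are part of the Web only under the broad
reading.
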
At 08:34 AM 3/5/99 , Johan Hjelm wrote:
>Actually, one obvious one presents itself: The union of the two. However,
>there are problems whatever we say. What about FTP access to HTML pages,
>and what about XML? We could talk about all interlinked pages, but then we
>run into problems there, too. Is it the web if you have it on your own
>machine, and just use the file system to walk around in a hypertext space?
>
>I tried to define the web in one of the earlier drafts of the paper on
>automatic recharacterisation, and came up with something like:
>
>The World Wide Web architecture is described in a set of specifications
>outlining the markup language, the operations in the client on this markup
>language, a request-response protocol for the communication between client,
>proxy and server, and the logging of transactions.
>
>That is of course not good enough either, which is why I took it out.
>
>More pragmatically, our characterisation is based on log files, so we might
>want to stay with HTTP servers.
>
>Johan
>
>At 11:16 1999-03-05 -0500, Lavoie,Brian wrote:
>>WCA members,
>>
>>In working on the latest draft of the WCA terminology sheet, I noticed that
>>there is one glaring omission from the list: a definition of the Web.
>>
>>Defining the Web is extremely important, because the Web definition has
>>implications for other terms that we use, and also for various
>>characterization metrics. For example, what is the appropriate definition
>>of the size of the Web? Number of HTTP servers? Number of Web sites?
>>Terabytes of information accessed by Web clients?
>>
>>We need to know exactly what we're talking about when we refer to the Web:
>>what we are including, and equally important, what we are excluding.
>>
>>The way I see it, there are two potential approaches to defining the Web:
>>The HTTP approach: the Web is the universe of information that can be
>>accessed via the HTTP protocol. This is a server-centric interpretation.
>>
>>The HTML approach: the Web is the universe of information that can be
>>accessed via hyperlinks (defined by the HTML standard). This is a
>>client-centric interpretation.
>>
>>Both approaches have advantages and disadvantages. The HTTP approach, I think,
>>provides a clearer delineation of Web-accessible information, but clearly
>>excludes a lot of information that Web clients can access. The HTML approach
>>captures this additional information, but runs the risk of expanding the Web
>>to include virtually the entire Internet. Is there a middle ground?
>>
>>I'd like to solicit as many opinions as possible on this issue before the
>>release of the next terminology draft.
>>
>>Thanks,
>>
>>Brian Lavoie
>>OCLC
>
>************************************************************
> Johan HJELM
> Ericsson Research, User Applications Group
> Currently visiting engineer at the W3C
> The World Wide Web Consortium
> hjelm@w3.org
> http://www.w3.org/People/W3Cpeople.html#Hjelm
> Fax +1-617-258 5999, Phone +1-617-263-9630
> MIT/LCS, 545 Tech. Sq. Cambridge MA 02139 USA
> opinions are personal, always my own,
> and not necessarily those of Ericsson or the W3C.
>============================================================
>