Re: New terminology draft
From: Henrik Frystyk Nielsen (frystyk@w3.org)
Date: Tue, Mar 23 1999
Message-Id: <3.0.5.32.19990323165934.00a87800@localhost>
Date: Tue, 23 Mar 1999 16:59:34 -0500
To: "Lavoie,Brian" <lavoie@oclc.org>, "'www-wca@w3.org'" <www-wca@w3.org>
At 16:12 3/22/99 -0500, Lavoie,Brian wrote:
Hi Brian,
Thanks for taking the lead on this - here are my comments on your March 17,
1999 version:
> Primitive Elements
>
> File
> A collection of bytes, stored in a static medium, identified by a name
> and an extension (format).
I don't like using "file" as a unit at all. Resource is really more
flexible and we never really care about how that resource is stored
internally. There is no reason why an object stored in a database can't be
as (or more) static.
> Resource
> A network-accessible information unit, consisting of one or more
> files, which are collectively referenced by a Uniform Resource
> Identifier (URI).
A resource doesn't really consist of one or more files. Have you seen the
definition at
http://www.w3.org/WCA/1999/01/Terms.html#Resource
I think that is more generic.
> Resource Instantiation
> The state of a resource at a specific point in time and/or from a
> specific viewpoint.
> A conceptual mapping exists between a resource and a resource
> instantiation (or set of instantiations). A resource remains static
> even when its content i.e., the set of resource instantiations
> currently prevailing changes over time, provided that the conceptual
> mapping does not change.
> Example:
> A text file containing the previous day's closing price for Microsoft
> stock is a resource. The version of that file listing the closing
> price of Microsoft stock on March 15, 1999, is an instantiation of
> that resource.
For better or worse, HTTP calls an instantiation an "entity", which
doesn't really say anything. However, I think we are better off sticking to
what people are already familiar with - I have a definition which is a bit
more general than the one HTTP uses (it doesn't split "metadata" from "data"
as in entity body and entity header):
http://www.w3.org/WCA/1999/01/Terms.html#Entity
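To make the resource/entity distinction concrete, here is a minimal sketch
in Python. It is only an illustration under stated assumptions: the class
names, the example URI and the prices are all made up, and the entity keeps
metadata and data together rather than splitting them as HTTP does.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Entity:
    """One instantiation of a resource; metadata and data travel together."""
    as_of: date
    metadata: dict
    data: bytes


@dataclass
class Resource:
    """Something identified by a URI; how it is stored is not modelled."""
    uri: str
    entities: list = field(default_factory=list)

    def current(self):
        """Return the most recent entity, or None if there is none yet."""
        return max(self.entities, key=lambda e: e.as_of, default=None)


# Hypothetical URI and invented prices, for illustration only.
quote = Resource("http://example.org/quotes/msft/previous-close")
quote.entities.append(
    Entity(date(1999, 3, 15), {"content-type": "text/plain"}, b"100.00"))
quote.entities.append(
    Entity(date(1999, 3, 16), {"content-type": "text/plain"}, b"101.50"))
```

The resource (and its URI) stays the same while the entity returned by
quote.current() changes from day to day.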
> Client
> A software application that initiates network communication.
I think we can be a little more specific here, see for example:
http://www.w3.org/WCA/1999/01/Terms.html#Client1
> Server
> A software application that waits for network communication to be
> initiated.
and here as well:
http://www.w3.org/WCA/1999/01/Terms.html#Server1
The important thing is that "server" also covers the term "proxy" and other
intermediaries.
> Message
> A unit of communication exchanged between two peers (i.e., two units
> residing at the same network layer).
>
> Request
> A message containing an atomic operation to be carried out in the
> context of a specified resource.
>
> Response
> Zero, one or more messages containing the result of an executed
> request.
I propose using these definitions instead for message, request, and response:
http://www.w3.org/WCA/1999/01/Terms.html#Message
http://www.w3.org/WCA/1999/01/Terms.html#Request
http://www.w3.org/WCA/1999/01/Terms.html#Response
as they add a little more text on how requests and responses interact.
> User
> A human using a client to manually (interactively) retrieve
> network-accessible resources.
It is inherent in the Web model that a user agent always issues requests on
behalf of some human, although not necessarily directly. For example,
a robot still acts on behalf of the human who started it. I wouldn't make
it a primitive. Instead I would like to define certain access patterns -
there is no reason why a browser can't become a robot while filling a cache,
or why a robot can't behave as a browser to download inlined images, etc.
Instead I think we need to define the term "web page":
http://www.w3.org/WCA/1999/01/Terms.html#page
and "Web site" as well:
http://www.w3.org/WCA/1999/01/Terms.html#site
Both of my definitions are fairly close to yours further down in your list.
> The Scope of the Web
>
> Web Resource
> A resource that is accessible from the Internet, via the HTTP
> protocol.
The Web is really not limited to HTTP (nor HTML/XML for that matter). Those
are just popular ways of implementing the Web - the Web is really the
complete information space that can be referenced by URIs. That is,
anything that is a resource is on the Web. When you think about it, this is
really not limited to networked resources - however, this is how we
normally think about them. Examples of non-networked URIs include phone
addresses:
http://www.w3.org/Addressing/schemes.html#phone
As we have already defined a "resource", we don't have to change that
definition.
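As a small illustration that a URI is not tied to HTTP, the sketch below
uses Python's standard urllib.parse to read the scheme of a few URIs. The
example URIs are assumptions chosen for illustration; in particular the
"tel:" form stands in for a phone address and is not taken from the page
referenced above.

```python
from urllib.parse import urlsplit


def scheme_of(uri):
    """Return the scheme part of a URI (the text before the first ':')."""
    return urlsplit(uri).scheme


# Illustrative URIs only; the hosts, mailbox and number are made up.
uris = [
    "http://www.w3.org/WCA/1999/01/Terms.html#Resource",
    "ftp://example.org/pub/file.txt",
    "mailto:someone@example.org",
    "tel:+1-201-555-0123",
]

for uri in uris:
    print(scheme_of(uri), uri)
```

Each of these identifies a resource in the sense used here, whether or not
HTTP is ever involved in reaching it.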
> Web-accessible Internet Resource
> A resource, accessible from the Internet through a non-HTTP network
> protocol, that is referenced by a hyperlink embedded in a Web
> resource.
>
> Note that the definitions of Web resources and Web-accessible Internet
> resources both stipulate that the resource is available on the
> Internet. This is intended to exclude networks not connected to the
> Internet, such as non-TCP/IP networks, corporate intranets, and other
> private networks.
>
> The Web-accessible Internet resource definition addresses the fact
> that HTML, a key standard for Web resources, permits the direct
> linkage of non-HTTP-accessible resources from HTTP-accessible
> resources.
There are plenty of other formats that contain URIs - PDF, PowerPoint, etc.
It is not limited to HTML. Again, I don't think we have to say anything
more than what we already have on resources.
> Web Clients
>
> Web Client
> A client that can be used to access Web resources.
We have already defined this as a "client" - it doesn't matter what
protocol it is really speaking nor whether it has a human clicking on the
mouse.
> Click
> A request by a user for the contents of a Web resource, identified by
> a URL. A click can take one of two forms:
> Explicit click: A click that is initiated manually by the user.
> Implicit click: A click that is initiated transparently by the client,
> without manual intervention on the part of the user, as an ancillary
> event corresponding to an explicit click.
Instead of defining "click" (which is also not very general - there are
many other ways of initiating a request), I think we are in fact
already covered by the "web page" definition, where we leave it to the user
preferences and/or application capabilities to decide which links are
dereferenced and which are not.
> Click-through Rate
> Frequency with which a Web resource, identified by a URL, is clicked.
Do you mean "Web page access rate"? That is, the time (mean or
distribution?) between changing web pages? Again, I think we should avoid
the term "click".
> User Session
> A cohesive set of user clicks across one or more Web servers.
What about "A set of Web pages accessed by continuous dereferencing of
links contained within these web pages. A session is not limited to a
single Web site"?
> Episode
> A subset of related user clicks that occur within a user session.
How is that distinguished from a session?
> Temporal Session Length
> The amount of time that elapses during the course of a user session.
>
> Session Path Length
> The number of clicks that occur during the course of a user session.
>
> Server Session
> A collection of user clicks to a Web server during a user session.
> Also called a visit.
What about:
The part of a user session limited to a single Web site.
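A rough sketch of how that could be computed, assuming a user session is
recorded as an ordered list of Web page URLs and that a "Web site" can be
approximated by the URL's host name (both are assumptions made for this
example, not part of either set of definitions):

```python
from urllib.parse import urlsplit


def server_sessions(user_session):
    """Group the pages of one user session by site, keeping page order.

    Assumption: the host name stands in for the Web site.
    """
    sessions = {}
    for url in user_session:
        host = urlsplit(url).hostname
        sessions.setdefault(host, []).append(url)
    return sessions


# Hypothetical user session spanning two sites.
user_session = [
    "http://www.w3.org/",
    "http://www.w3.org/WCA/",
    "http://example.org/a",
    "http://www.w3.org/Protocols/",
]

parts = server_sessions(user_session)
```

Here all pages on one site form a single server session even when the user
leaves and comes back; treating each consecutive run on a site as its own
server session would be an equally defensible reading.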
> Server Path Length
> The number of user clicks to a Web server during a user session.
The following definitions are rather specific to HTTP - so maybe we should
introduce a special section of HTTP "stuff" including the equivalent sizes
for responses?
> Client Request Header Size
> The number of bytes in the HTTP header sent by a client requesting a
> Web resource.
>
> Client Request Content Size
> The number of bytes sent by a client delivering content to a Web
> server (e.g., the content of a "PUT" request).
>
> Total Client Request Size
> Client Request Header Size + Client Request Content Size
>
>
>
> Web Servers
I think we should put this under an "HTTP section" as well - we should
really keep the definition of a web site separate from HTTP - especially as
HTTP will change over time.
> Web Server
> A server that provides access to Web resources.
>
> Server Response Header Size
> The number of bytes transferred by a server in delivering an HTTP
> header, in response to a client request for a Web resource.
>
> Server Response Content Size
> The number of bytes transferred by a server in delivering the content
> of a requested Web resource.
>
> Total Server Response Size
> Server Response Content Size + Server Response Header Size
>
> Cookie
> Data sent by a Web server to a Web client, to be stored locally by the
> client and sent back to the server on subsequent requests.
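For the HTTP-specific size definitions quoted above, a minimal sketch of
the arithmetic might look like this. It assumes a raw HTTP/1.x message held
in memory, and it counts the request line and the terminating blank line as
part of the "header", which the quoted definitions do not spell out. The
function names and the sample request are invented for illustration.

```python
def split_message(raw):
    """Split a raw HTTP/1.x message at the blank line ending the headers."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    # Assumption: the separator is counted with the header bytes.
    return head + sep, body


def sizes(raw):
    """Return header, content and total sizes in bytes."""
    header, content = split_message(raw)
    return {
        "header": len(header),
        "content": len(content),
        "total": len(header) + len(content),
    }


# Made-up PUT request with a five-byte body.
request = (
    b"PUT /notes.txt HTTP/1.1\r\n"
    b"Host: example.org\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)

request_sizes = sizes(request)
```

The same split applies to a response, which is one reason to keep the
request and response sizes together in a single HTTP section.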
I think I'll stop here for the first round of comments - maybe we can start
off discussing these first?
Henrik
--
Henrik Frystyk Nielsen,
World Wide Web Consortium
http://www.w3.org/People/Frystyk