HTTP 1.1 document terminology. from jg@w3.org on 1996-04-29 (ietf-http-wg@w3.org from April to June 1996)

From: <jg@w3.org>
Date: Mon, 29 Apr 96 11:50:32 -0400
To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9604291550.AA07157@zorch.w3.org>
As you all know, we've had terminology problems that has caused much
anguish, where different people have had slightly different views of
what the same term means (and those people then ended up generating
new terminology to compensate).  We only recently figured out this 
disconnect was the cause of much of the anguish of the last month.
The terms "resource", "variant", "entity", "entity instance" were being
used in sloppy ways (and as there are only three things to be described,
the fact we were using four is a fundamental sign of the problem).
It would help the discussions to try to be precise in terminology.

Roy Fielding took a stab at a set of terminology and definitions to help
this situation.  I've also borrowed from some terminology Tim Berners-Lee
had put together.  What is below is derived mostly from either one or 
the other, or from previous usage in the draft, where it had not been
confused.  Note the introduction of resource entity; when working
through the spec, it became clear that Roy was correct on needing that term
as well.

After quite a bit of anguish on several people's part, I think we're
converging.


There are two areas which aren't quite settled to my complete satisfaction.

	1) Roy has been pushing for "entity ID" or "entity
identifier" to replace opaque validators in the document; 
having been wandering through the document wrestling with
these terminology issues, I've gotten more than a glimmer of what he's
talking about, and now agree that a term independent of validator is
in fact correct, and probably necessary to avoid confusion in the long
run.  The problem that ID is confusing too many people.  This doesn't
mean the term "entity ID" is good, or bad, or indifferent, just that
too many people have too many preconceptions about what identifier
means.  It was suggested was that we use "tag" for this term.  This
raises a problem: the document already talks about language tags, and
this term is bound in other RFC's not under our control.  
I don't think it wise to try to rename "language tag" to something 
else, due to this other use in IETF terminology.  
Unless someone can come up with a better term (fast) I
plan to use the term "entity tag" in the document, rather than just
tag.  IfValid, and IfInvalid become IfMatch and IfNoMatch, and so
forth.

	2) I like Tim's term of "generic resource" for a resource that
you can do content negotiation against (and I incorporated some words
of his into the terminology section below), rather than Roy's
suggestion of group resource.  Tim uses "specific resource" for a
resource that has only a single entity associated with it; I'm less
happy about that.  Roy's suggestion is "individual resource", with
which I'm also luke warm.  I'll go with "specific resource" unless
someone argues me out of it or suggests a better term.  Similarly, if
someone has a bit of blinding insight for a better term, please speak
up now.

As editor, it is my responsiblity to see that the resulting document
reflects a single uniform set of terminology that won't confuse too
many people.  Any suggestions on the above two points (and if the
defn's below have problems), would be appreciated.  I will attempt to
make the next draft conform to this terminology.

			- Jim Gettys





1.3 Terminology
===============
This specification uses a number of terms to refer to the roles played
by participants in, and objects of, the HTTP communication.

connection
----------
A transport layer virtual circuit established between two application
programs for the purpose of communication.

message
-------
The basic unit of HTTP communication, consisting of a structured
sequence of octets matching the syntax defined in section 8 and
transmitted via the connection.

request
-------
An HTTP request message (as defined in section 9).

response
--------
An HTTP response message (as defined in section 10).

resource
--------
A network data object or service that can be identified by a URI
(section 7.2). A "resource" is a concept (a little like a
Platonic ideal).  When represented electronically, a resource may be
of the kind which corresponds to only one possible bit stream
representation.  An example is the text version of an Internet
RFC. That never changes. It will always has the same checksum.

generic resource
----------------
On the other hand, a resource may be generic in that as a concept it
is well specified but not so specifically specified that it can only
be represented by a single bit stream.  In this case, other URIs may
exist which identify a resource more specifically. These other URIs
identify resources too, and there is a relationship of genericity
between the generic and the relatively specific resource.  As an
example, successively specific resources might be

1. The Bible
2. The Bible, King James Version
3. The Bible, KJV, in English
4. A particular ASCII rendering of the KJV Bible in English

Each resource may have a URI.  The authority which allocates the URI
is the authority which determines to what it refers: Therefore, that
authority determines to what extent that resource is generic or
specific.  When we discuss electronic resources, an interesting fact
is that a small number of dimensions of genericity emerge.

Time
A resource may vary with time. For example, "The Wall Street Journal"
varies with time. Each issue is a time-specific resource, which does
not change with time. Most home pages on the Web change with time, in
a less periodic way.

Language
When a document is translated, it is useful to be able to refer to it
either in the generic, or to a particular specific translation.

Representation
A given resource may have many ways in which it can be represented on
the wire, using different Content-Types (section 18.19).  As an
example, an image may be represented in PNG or GIF format.

specific resource
-----------------
A resource that is not a generic resource.

variant
-------
An specific resource that is a member of at least one generic
resource.  Sometimes called a resource variant.  Note that the set of
variants of a generic resource may change over time as well.

entity
------
The set of information transferred as the payload of a request or
response.  An entity consists of metainformation in the form of
Entity-Header fields and content in the form of an Entity-Body, as
described in section 11 

resource entity
---------------
A particular representation, rendition,
encoding, or presentation of a resource at a particular point in time.
Resources not supporting content negotiation are bound to a single
entity.  Generic resources supporting content negotiation are bound to
a set of one or more entities, whose membership may vary over time.

content negotiation
-------------------
The mechanism for selecting the appropriate variant of a generic
resource when applying a request, is described in section 15.

entity tag
----------

An identifier for an entity.  An identifier for a particular entity is
called a "strong entity tag."  An identifier for an equivalent set of
entities is called a "weak entity tag."

client
------
An application program that establishes connections for the purpose of
sending requests.

user agent
----------
The client which initiates a request. These are often browsers,
editors, spiders (web-traversing robots), or other end user tools.

server
------
An application program that accepts connections in order to service
requests by sending back responses. Any given program MAY be capable
of being both a client and a server; our use of these terms refers
only to the role being performed by the program for a particular
connection, rather than to the program's capabilities in
general. Likewise, any server MAY act as an origin server, proxy,
gateway, or tunnel, switching behavior based on the nature of each
request.

origin server
-------------
The server on which a given resource resides or is to be created.

proxy
-----
An intermediary program which acts as both a server and a client for
the purpose of making requests on behalf of other clients. Requests
are serviced internally or by passing them, with possible translation,
on to other servers. A proxy MUST interpret and, if necessary, rewrite
a request message before forwarding it. Proxies are often used as
client-side portals through network firewalls and as helper
applications for handling requests via protocols not implemented by
the user agent.

gateway
-------
A server which acts as an intermediary for some other server. Unlike a
proxy, a gateway receives requests as if it were the origin server for
the requested resource; the requesting client may not be aware that it
is communicating with a gateway. Gateways are often used as
server-side portals through network firewalls and as protocol
translators for access to resources stored on non-HTTP systems.

tunnel
------
A tunnel is an intermediary program which is acting as a blind relay
between two connections. Once active, a tunnel is not considered a
party to the HTTP communication, though the tunnel may have been
initiated by an HTTP request. The tunnel ceases to exist when both
ends of the relayed connections are closed. Tunnels are used when a
portal is necessary and the intermediary cannot, or should not,
interpret the relayed communication.

cache
-----
A program's local store of response messages and the subsystem that
controls its message storage, retrieval, and deletion. A cache stores
cachable responses in order to reduce the response time and network
bandwidth consumption on future, equivalent requests. Any client or
server MAY include a cache, though a cache cannot be used by a server
while it is acting as a tunnel.

firsthand
---------
A response is firsthand if it comes directly and without unnecessary
delay from the origin server, perhaps via one or more proxies.  A
response is also firsthand if its validity has just been checked
directly with the origin server.

explicit expiration time
------------------------
The time at which the origin server intends that an entity should no
longer be returned by a cache without further validation.

heuristic expiration time
-------------------------
An expiration time assigned by a cache when no explicit expiration time 
is available.

age
---
The age of a response is the time since it was generated by, or
successfully validated with, the origin server.

freshness lifetime
------------------
The length of time between the generation of a response and its
expiration time.


fresh
-----
A response is fresh if its age has not yet reached its freshness lifetime.

stale
-----
A response is stale if its age has passed its freshness lifetime.

validator
---------
A tag or a Last-Modified date and time used to validate whether a cache 
entry is "fresh" or "stale."
Received on Monday, 29 April 1996 08:55:19 UTC