RE: Rough text for State finding

Dan, comments inline.

> -----Original Message-----
> From: Dan Connolly [mailto:connolly@w3.org]
<snip/>

> Many thanks for getting this started.
> 

Np.  Having tackled ext/versioning, application state and commensurate
reliability/scalability/performance seems like an obvious next
architecture thing for me to work on :-)

<snip/>
> > What is State
> 
> I think this section comes too early. I'm not a fan of definitions
> out of context in general, and it doesn't work for me in this
> case in particular. I suspect it will work better to start with
> the concrete story, accumulate 3 or 4 examples of state, and then
> generalize with a definition.
> 

Fair enough.  I can put the definition after the stories.

> > State is the data that pertains to an entity at a particular point in
> > time.
> 
> At this point, with little context to work with, questions
> come to my mind like:
>   is the diameter of the earth (a datum) pertinent to my homepage
>   (an entity) or not?
> 
> Er... and is 'entity' a synonym for webarch:resource? If so, please
> use resource instead (unless you're proposing that
> we s/resource/entity/g, which is perhaps worth discussing).

I want to avoid the 'resources are those things with URIs' trap.  If an
entity has state but doesn't have a URI, is it a resource?  

> >
> >
> > We see a prototypical stateful application from the client
> > perspective.  The application has 2 states: logged-in and
> > not-logged-in.
> 
> Hmm... to my mind, there's lots more state to this app: the state of
> my bank account is a bank balance, and probably a history of recent
> transactions. But you're right to leave that out of this discussion,
> as it's not really relevant to the open issues in web architecture.
> 

Excellent.  Let's treat the application state as the data that the
application chooses to expose, and my sample app didn't talk about those
things.

> A distinction that I think is perhaps more relevant is stateful
> vs stateless protocols. In a stateless protocol, each message
> carries all the relevant information with it; the server can
> service each incoming message independently and need not remember
> any state from one message to the next. In a stateful protocol,
> processing message N might require the server to remember
> something about messages previous to N.

I've been wondering about the difference and relationship between
resource state, entity state, protocol state, session state, and
application state.  I think there are layers of state, and multiple
layers of protocols, which can be mixed together to confuse things from
a modeling/layering perspective.  For example, I may have a stateful
application protocol (bank app), a stateless network session protocol
(http), and a stateful low level network data protocol (tcp). 
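
To make the stateless/stateful split concrete, here is a rough Python
sketch of the bank example. Everything in it (PASSWORDS, BALANCES, the
message fields, the class name) is made up for illustration and is not
taken from any spec. The first handler gets all it needs in each
message; the second has to remember an earlier login message.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical credentials and balances, for illustration only.
PASSWORDS = {"dan": "s3cret"}
BALANCES = {"dan": 100}


def credentials_ok(user: Optional[str], password: Optional[str]) -> bool:
    """True only for a known user with the matching password."""
    expected = PASSWORDS.get(user)
    return expected is not None and expected == password


def handle_stateless(message: dict) -> str:
    """Stateless style: each message carries everything; nothing is kept."""
    user = message.get("user")
    if not credentials_ok(user, message.get("password")):
        return "denied"
    return f"balance={BALANCES[user]}"


@dataclass
class StatefulConversation:
    """Stateful style: the server remembers results of earlier messages."""

    # None means 'not-logged-in'; a user name means 'logged-in'.
    logged_in_user: Optional[str] = None

    def handle(self, message: dict) -> str:
        op = message.get("op")
        if op == "login":
            user = message.get("user")
            if credentials_ok(user, message.get("password")):
                self.logged_in_user = user
                return "ok"
            return "denied"
        if op == "balance":
            # Processing this message depends on an earlier login message.
            if self.logged_in_user is None:
                return "denied"
            return f"balance={BALANCES[self.logged_in_user]}"
        return "unknown"
```

The stateless version pays with bigger messages (credentials every
time); the stateful one pays with per-conversation storage on the
server.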

> 
> Anyway, let's be good engineers and steal where we can. The
> community consensus seems to be:
> 
> [[
> A stateless server is one which treats each request as an independent
> transaction, unrelated to any previous request. This simplifies the
> server design because it does not need to allocate storage to deal with
> conversations in progress or worry about freeing it if a client dies in
> mid-transaction. A disadvantage is that it may be necessary to include
> more information in each request and this extra information will need to
> be interpreted by the server each time.
> 
> An example of a stateless server is a World-Wide Web server. These take
> in requests (URLs) which completely specify the required document and do
> not require any context or memory of previous requests.
> 
> Contrast this with a traditional FTP server which conducts an
> interactive session with the user. A request to the server for a file
> can assume that the user has been authenticated and that the current
> directory and transfer mode have been set.
> ]]
> 
> http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?query=stateless&action=Search
> http://en.wikipedia.org/wiki/Stateless_server
> 
> 
> Clearly that bears refinement, since HTTP can be used in a stateful
> manner too. But that's what we should start with.
> 

I like the first sentence, despise the 2nd as it's woefully incomplete,
and am ok with the 3rd from the wikipedia definition.  As a whole, the
definition avoids the problem of a web server with a stateful session on
the server and a cookie containing the session id.  There *may be*
context and memory.  

An FTP message will contain some kind of session/conversation id that
the ftp server uses to select the appropriate ftp session, and an HTTP
message with a cookie containing a session id will be used by the HTTP
session engine to select the appropriate http session.
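
For the HTTP half of that, a minimal sketch of the lookup might be
something like the following. It uses Python's standard http.cookies
module; the in-memory SESSIONS dict, the "session_id" cookie name, and
both function names are assumptions of mine, not how any particular
session engine does it.

```python
import secrets
from http.cookies import SimpleCookie

# In-memory session store, keyed by session id (assumed layout).
SESSIONS = {}


def create_session(user: str) -> str:
    """Store session state on the server; return a 'name=value' cookie string."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"user": user, "logged_in": True}
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    return cookie["session_id"].OutputString()


def lookup_session(cookie_header: str):
    """Map one request's Cookie header to its session state, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    return SESSIONS.get(morsel.value)
```

Each request is still self-describing at the HTTP level; the only
thing that makes the exchange stateful is that the id in the cookie
points at memory held on the server.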

I think that one of the key differences between http and ftp is that the
ftp interactions are required to be stateful, whereas http supports both
stateless and stateful sessions.  I know that stateful http sessions
can be very scalable, and I bet that an "ftp servlet session" engine
could be designed to be just as scalable.  It would use the same kind of
caching/session persistence etc. as current http session stores.  

Further, I wonder about the distributed system characteristics of an
HTTP GET of an HTML file and multiple images using HTTP Authentication
and HTTP KEEP-ALIVE as compared to an FTP MGET of a directory of that
same HTML file and images.  I have a feeling that there is no smoking
gun of perf/scale difference.

Cheers,
Dave

Received on Tuesday, 18 October 2005 17:07:50 UTC