Foundational Web Model(s) [was: Re: Comments about http://www.w3.org/DesignIssues/Architecture#Content : is GET the only idempotent method]

Tim, I have several comments here. Note, I don't know to what level of
"foundation" we want to push this, but it can be an interesting
discussion (esp. in relation to the Semantic Web, see later).

<premise>
Without a rigorous formal definition, everything risks sliding
into sloppiness (cf. the Semantic Web's....). The first task,
therefore, would be to start from some rigorous formal
mathematical definition, and then proceed with the discussion.
</premise>


So, starting from the very basics:
"idempotence" doesn't have a precise definition; cf. RFC 2616:

<quote>
Methods can also have the property of "idempotence" in that (aside
   from error or expiration issues) the side-effects of N > 0 identical
   requests is the same as for a single request.
</quote>

What is an "identical" request? If you start from an underlying model
where there is no time, this is trivially true for any method (!).
If you don't, then you'll see the problems you can run into...
The point is that there is no web model that all of this is based on.
Just to cite a very simple example of modelling, cf. from
http://www.scope.gmd.de/info/www6/technical/paper222/paper222.html :

<quote>
In general, we consider an (untimed) web structure to be a partial function from URLs 
to sequences of bytes. The intuition is that for each URL we can require from the web 
structure the corresponding object (an HTML page, a text file, etc.). The function has 
to be partial because for some URL there is no corresponding object. 
In this paper we consider as web structure the World Wide Web structure WWW. Note that 
in general the real WWW is a timed structure, since the URL mapping varies with time 
(i.e. it should be written as WWW_t for each time instant t). However, we will work under 
the hypothesis that the WWW is locally time consistent, i.e. that there is a non-void time 
interval I such that the probability that "WWW_t(url)=seq, t>=t'>=t+I implies WWW_t'(url)=seq"
is extremely high. That is to say, if at a certain time t an URL points to an object seq, 
then there is a time interval in which this property stays true. Note this doesn't mean 
that the web structure stays the same, since new web objects can be added. 
</quote>
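
To make that model concrete, here is a minimal sketch in Python of the
paper's definitions; the dict-based encoding, the integer instants and
all the names (WebStructure, lookup, locally_consistent) are mine,
chosen for illustration, not taken from the paper:

  from typing import Dict, Optional

  # An (untimed) web structure: a partial function from URLs to byte
  # sequences.  A dict is partial by construction: URLs with no
  # corresponding object are simply absent.
  WebStructure = Dict[str, bytes]

  # The real WWW is a timed structure: one web structure per instant t,
  # i.e. the family WWW_t.  Integer instants are an assumption here.
  TimedWebStructure = Dict[int, WebStructure]

  def lookup(www: TimedWebStructure, t: int, url: str) -> Optional[bytes]:
      """WWW_t(url): the object at url at instant t, or None (undefined)."""
      return www.get(t, {}).get(url)

  def locally_consistent(www: TimedWebStructure, url: str,
                         t: int, interval: int) -> bool:
      """One concrete trace of the paper's hypothesis: if WWW_t(url) = seq,
      then WWW_t'(url) = seq for all t <= t' <= t + interval.  (The paper
      only asks that this hold with very high probability.)"""
      seq = lookup(www, t, url)
      if seq is None:
          return True  # nothing to preserve
      return all(lookup(www, t2, url) == seq
                 for t2 in range(t, t + interval + 1))

  # New objects may appear without breaking consistency of existing URLs:
  www = {0: {"http://example.org/a": b"hello"},
         1: {"http://example.org/a": b"hello",
             "http://example.org/b": b"brand new"},
         2: {"http://example.org/a": b"changed"}}
  assert locally_consistent(www, "http://example.org/a", 0, 1)
  assert not locally_consistent(www, "http://example.org/a", 0, 2)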

So, yes, you could perhaps define a notion of idempotence by using
idempotence over local intervals (modulo dynamic behaviour, but we're
talking about principles here); see the sketch just below. But you see,
to talk and reason in the foundations about things like idempotency,
there has to be some kind of formal model, and building one is far from
trivial.
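
Just to fix ideas, here is a toy interval-local reading of the RFC's
sentence, freezing the web structure inside the interval; the state
model and every name below (State, Method, idempotent_on) are mine, not
anyone's official HTTP semantics:

  from typing import Callable, Dict, Tuple

  State = Dict[str, bytes]                      # toy server-side state
  Method = Callable[[State, str], Tuple[State, bytes]]
  # a method takes (state, url) and returns (new state, response body)

  def idempotent_on(method: Method, state: State, url: str, n: int) -> bool:
      """Idempotence at a fixed instant (i.e. inside an interval where
      the structure is assumed stable): the state after n > 0 identical
      requests equals the state after one request."""
      s1, _ = method(dict(state), url)
      sn = dict(state)
      for _ in range(n):
          sn, _ = method(sn, url)
      return sn == s1

  # GET: pure lookup, no side effects -> trivially idempotent.
  def get(state: State, url: str) -> Tuple[State, bytes]:
      return state, state.get(url, b"404")

  # PUT of a fixed representation: a side effect, but still idempotent.
  def put_hello(state: State, url: str) -> Tuple[State, bytes]:
      new = dict(state); new[url] = b"hello"
      return new, b"201"

  # POST-style append: NOT idempotent even inside a stable interval.
  def append(state: State, url: str) -> Tuple[State, bytes]:
      new = dict(state); new[url] = new.get(url, b"") + b"x"
      return new, b"200"

  assert idempotent_on(get, {}, "/a", 3)
  assert idempotent_on(put_hello, {}, "/a", 3)
  assert not idempotent_on(append, {}, "/a", 3)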

This impacts the Semantic Web foundations as well: if you want to do
reasoning on something (the WWW), you ought to have a formal definition
of it at the very minimum. I know this is something DanC was also
advocating (and trying to grok with Larch,
cf. http://www.w3.org/XML/9711theory/ ), and those efforts were not just
a toy idea, formalizing for the sake of doing so...
So let me restate:
even though this can be seen as a pain ;), and maybe not that exciting,
it IS indeed of quite fundamental importance for web architecture, and
for applications that want to reason on web structure (like the Semantic
Web indeed).


<timbl>
I have changed the paragraph to read:

"""The introduction of any other method apart from GET which has no
side-effects and is simply a function of the URI is also incorrect, because
the results of such an operation effectively form a separate address space,
which violates the universality."""
</timbl>
But this doesn't take into account e.g. OPTIONS, does it?
Anyway, let's leave even OPTIONS out of the discussion (you can always
treat it as an exception). A more crucial point is that this requirement
is just too strict: look for example at HEAD (...).
Restating: I understand what you wanted to say, Tim, but this is not the
correct notion. I *think* the correct notion you might want to consider
(added to the premises of "which has no side-effects and is simply a
function of the URI", together with a suitable underlying web model of
course) is universality as defined in category theory (hah, see,
mathematicians don't invent words by chance... ;) : universality (or
co-universality, depending on the initial web model you choose) of GET
would imply, roughly speaking, that you can have any other method,
provided it doesn't give you "more information" than GET does.
So HEAD is a perfectly legal method by this definition (but not under
your stricter one), and so are conditional GETs, for example.
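
To make the intuition concrete (this is just my toy reading of "no more
information", not a category-theoretic construction; all names and the
length/hash details below are mine): a safe method is admissible
whenever it factors through GET, i.e. its answer is a function of what
GET would return.

  import hashlib
  from typing import Dict, Optional

  WebStructure = Dict[str, bytes]   # untimed, partial (missing = undefined)

  def GET(www: WebStructure, url: str) -> Optional[bytes]:
      return www.get(url)

  # HEAD: only metadata of the GET result (here, just content length),
  # i.e. HEAD(url) = f(GET(url)) for f = "drop the body".  It can never
  # reveal more than GET does, so it passes the criterion sketched above.
  def HEAD(www: WebStructure, url: str) -> Optional[int]:
      body = GET(www, url)
      return None if body is None else len(body)

  # Conditional GET: a function of GET(url) plus a client-supplied
  # validator (an entity tag, modelled here as a hash) -- again no more
  # information than GET itself provides.
  def conditional_GET(www: WebStructure, url: str,
                      etag: str) -> Optional[bytes]:
      body = GET(www, url)
      if body is None or hashlib.sha1(body).hexdigest() == etag:
          return None   # roughly, "304 Not Modified" (or nothing there)
      return body

  www = {"/a": b"hello"}
  assert HEAD(www, "/a") == 5
  assert HEAD(www, "/missing") is None
  fresh = hashlib.sha1(b"hello").hexdigest()
  assert conditional_GET(www, "/a", fresh) is None        # not modified
  assert conditional_GET(www, "/a", "stale") == b"hello"  # send full body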


Summing up: obviously it was not in the scope of previous works to define
a foundational web model with formal rigour
(cf. from http://www.w3.org/DesignIssues/Axioms.html :
"Words such as "axiom" and "theorem" are used with gay abandon and the reverse of rigour here." )
But if we want to play really seriously, and start to do better, then a
formal web model should be designed: not *the* one, just one, or several,
depending on the degree of precision you need (think e.g. of Semantic Web
applications, where in all likelihood you can start with just a simple
web model, cutting out most of HTTP's deeper functionality).

-M
