Re: Reliable HTTP

From: Dan Weinreb <dlw@exceloncorp.com>
Date: Fri, 20 Jul 2001 00:59:50 -0400 (EDT)
Message-Id: <200107200459.AAA27962@handcuff.odi.com>
To: rpk@watson.ibm.com
CC: www-ws@w3.org

I've read the HTTPR protocol spec.  It's very interesting.  I have
designed a protocol, used in one of our products, that is a sort of
baby version of this: it lacks the sophistication of being able to
start sending data before the sender knows how much it will send, it
has no chunking, and so on.  Also, my protocol only does pushing, not
pulling, and always operates with one channel in each direction, in
the manner you described in your mail of Wednesday.  But the
reliability issues seemed very familiar to me.

In the protocol spec, section 1.4, in the second paragraph after
figure 2, you talk about "indicating at the end of the payload whether
the sending agent detected some error condition during the
transmission...".  I was curious what kind of error conditions you
might have in mind here.  Presumably you don't mean a malfunction
inside the message-sending agent itself.  At least, in order for this
mechanism to work, the stream of bytes within the HTTP body needs to
be sufficiently coherent that the payload terminator can be recognized
as such; that is, the application message structures have to be
conformant, since the receiving agent has to "parse" through them in
order to know when one of them ends so that it can properly recognize
the payload terminator.
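
To make that parsing requirement concrete, here is a minimal sketch of
the kind of receive loop I have in mind.  This is NOT HTTPR's actual
wire format -- the length-prefix framing and the all-zero terminator
are my own illustrative assumptions -- but it shows why one corrupt
length prefix makes the terminator unrecognizable: every subsequent
"message boundary" is then read from the wrong offset.

```python
import io
import struct

# Hypothetical framing: each message is a 4-byte big-endian length
# followed by that many body bytes; a length of zero (four zero bytes)
# is reserved as the payload terminator.  Not HTTPR's real format.
TERMINATOR = b"\x00\x00\x00\x00"

def read_payload(stream):
    """Consume length-prefixed messages until the terminator is seen.

    If any length prefix is corrupt, every later read starts at the
    wrong offset, so the terminator is never recognized as such --
    the "coherence" requirement discussed above.
    """
    messages = []
    while True:
        prefix = stream.read(4)
        if len(prefix) < 4:
            raise ValueError("truncated stream: terminator never seen")
        if prefix == TERMINATOR:
            return messages
        (length,) = struct.unpack(">I", prefix)
        body = stream.read(length)
        if len(body) < length:
            raise ValueError("truncated message body")
        messages.append(body)

data = (struct.pack(">I", 5) + b"hello" +
        struct.pack(">I", 3) + b"abc" +
        TERMINATOR)
print(read_payload(io.BytesIO(data)))  # [b'hello', b'abc']
```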

A very small point: I think section 6.2.9 ("Encoding") could stand a
bit more clarification.  It wasn't clear to me exactly what "reversed"
means.  More broadly, it wasn't clear why this kind of information is
useful when the payload contents are generally considered opaque.
What good does it do me to know which floating point format is being
used, if I don't have any way to know which bytes of the MessageData
represent floating point numbers in the first place?  Perhaps the idea
is that these settings would be conveyed to a higher level of
abstraction that knows the semantics of the MessageData?  Still, it
seems rather arbitrary to have HTTPR deal in this level of
interpretation.  And if you're going to talk about how numeric types
are encoded, why not also talk about character encodings?  And why
stop there?
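
To illustrate the byte-order point (assuming "reversed" means
little-endian): the same four bytes decode to entirely different
values under the two conventions, but only a layer that already knows
*which* bytes of the MessageData hold a float can make any use of that
fact.  A small hypothetical sketch:

```python
import struct

# Four bytes that are the big-endian IEEE 754 encoding of 1.0.
raw = struct.pack(">f", 1.0)

big = struct.unpack(">f", raw)[0]     # decoded with the right convention
little = struct.unpack("<f", raw)[0]  # same bytes, "reversed" convention

# Knowing the sender's format only helps if you also know these four
# bytes are a float at all -- which opaque MessageData doesn't tell you.
print(big, little)
```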

On another topic, the motivation for building upon HTTP is that
firewalls are generally configured to allow HTTP traffic and deny most
other traffic.  This general point could be made about nearly any
proposed new Internet protocol: they all ought to be built on top of
HTTP, for the same reason.

If this reasoning were followed to its logical conclusion (and
compatibility with past protocols were not an issue), we'd be able to
retire all well-known ports other than 80 and 443, and the logic in
the firewalls that denies connections on those other ports would lie
dormant since nobody would use those ports for anything.  Isn't there
something peculiar about this?  (I realize that there are many
practical reasons why this would not literally happen.)

In particular, as I understand it, SOAP is also built on top of HTTP,
and articles I have read about SOAP stress that one of its virtues is
that it can work through firewalls because firewalls generally allow
HTTP to pass through.

But why do firewalls allow HTTP in the first place?  Presumably the
original justification was that HTTP requests are reasonably "safe".
There must have been some feeling that allowing outsiders to send HTTP
requests to your HTTP server doesn't put you at great peril, as
compared to many well-known ports that firewalls generally deny.  (The
frequency of the discovery of security problems with Microsoft IIS
tends to challenge this judgement, but let's ignore that for now.)

But once the HTTP server has been enhanced to deal with all kinds of
interesting protocols that are built upon HTTP, and especially once it
has been enhanced to provide a very general and powerful RPC mechanism
such as SOAP, it's less clear that it's so safe to expose it.  Indeed,
once system security personnel become aware that the HTTP server is
being used as a general RPC server that can run all kinds of programs
on behalf of the client, port 80 might not seem so safe any more.  In
a way, using HTTP for all kinds of other purposes almost seems like a
way of thwarting or subverting one's own security policies.  Surely the
people in charge of the firewalls will "catch on" eventually?

-- Dan
Received on Friday, 20 July 2001 01:00:34 GMT
