[EventSource] Comments on the current draft

Hi all,

Here are some comments and questions for clarification on the current
Server Sent Events draft
(http://www.w3.org/TR/2009/WD-eventsource-20091029/). If there is time
next week, and these points have not been addressed by email before
then, I would appreciate the opportunity to discuss the open items face
to face at the Webapps APIs F2F meeting.

Re "The API is designed such that it can be extended to work with other
push notification schemes such as Push SMS.": what is meant by "Push
SMS"? Does this refer to OMA Push, i.e. the service enabler defined by
the Open Mobile Alliance (OMA)? As I am the chair of the OMA Content
Delivery (CD) working group, and as convener and a key contributor to
the work on OMA Push, I would like to discuss the potential synergy
between Server Sent Events and OMA Push, as complementary mechanisms in
an overall server sent events framework. But first I need to be sure of
what the "Push SMS" term refers to.

Re "User agents must act as if the connection had failed due to a
network error if the origin of the URL of the resource to be fetched is
not the same origin as that of the first script when the EventSource()
constructor is invoked.": what this means is unclear (at least to me).
Is this essentially referring to the same-domain restriction, i.e. must
the EventSource() URL's domain be the same as the script's source
domain? What is "the first script" (and does this imply that only one
EventSource() call is possible for a webapp)?
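
To make the same-origin reading concrete, here is a minimal sketch in
Python. It assumes "origin" means the (scheme, host, port) triple; the
helper names origin() and same_origin() and the example URLs are
hypothetical and do not come from the draft, and the draft may intend
something different.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports used when a URL does not state one explicitly (assumption:
# only http and https matter for this illustration).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(script_url: str, resource_url: str) -> bool:
    """True if both URLs share scheme, host and port."""
    return origin(script_url) == origin(resource_url)
```

Under this reading, a script served from http://example.com/ could open
an event source on http://example.com:80/events, but not on
https://example.com/events or http://events.example.com/stream.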

Re "If such a resource (with the correct MIME type) completes loading
(i.e. either the entire HTTP response body is received, or the
connection is closed somehow, whether by the server or by a network
error), the user agent must reset the connection.": this seems to say
that the connection must be reset after each response ("the entire HTTP
response body is received"). If dropping the HTTP connection is what is
meant, this seems inefficient. The underlying data connection could then
be dropped as well, and at the very least re-establishing the HTTP
connection is a significant expense, resulting in a high overhead of TCP
chatter. Keeping the connection alive and letting the server decide when
to drop it would seem to be a better approach.

Re "HTTP 200 OK responses that have a Content-Type other than
text/event-stream (or some other supported type)": "other supported
type" I suppose means some arbitrary MIME type supported by the user
agent. Are there any practical limitations on what MIME types can be
supported?

Re "HTTP 204 No Content, and 205 Reset Content responses are equivalent
to 200 OK responses with the right MIME type but no content, and thus
must reset the connection.": same comment as above, seems very
inefficient if what's meant is dropping the HTTP connection.

Re "When a user agent is to reset the connection, the user agent must
set the readyState attribute to CONNECTING, queue a task to fire a
simple event named error at the EventSource object,... ": firing an
error is strange given that to "reset the connection" is a normal step
of event handling as described above when "the entire HTTP response body
is received" or "HTTP 204 No Content" is received.
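
The behaviour questioned above can be modelled with a small sketch in
Python. It covers only the steps quoted from the draft (set readyState
to CONNECTING, queue an error event); the steps elided by "..." are not
modelled. The class name, method names and the numeric readyState values
are assumptions made for illustration.

```python
# readyState constants; the numeric values are assumed for illustration.
CONNECTING, OPEN, CLOSED = 0, 1, 2


class EventSourceModel:
    """Toy model of the quoted "reset the connection" steps."""

    def __init__(self):
        self.ready_state = CONNECTING
        self.queued_events = []  # names of events queued for dispatch

    def reset_connection(self):
        # Quoted steps: readyState becomes CONNECTING and an "error"
        # event is queued, even when the response completed normally.
        self.ready_state = CONNECTING
        self.queued_events.append("error")

    def on_response(self, status, content_type=None, body_complete=False):
        if status in (204, 205):
            # Treated like an empty 200 response: reset.
            self.reset_connection()
        elif status == 200 and content_type == "text/event-stream":
            if body_complete:
                self.reset_connection()
            else:
                self.ready_state = OPEN
        else:
            raise NotImplementedError("case not covered by the quoted text")
```

In this model a perfectly normal sequence (a stream that ends, then a
204) leaves two "error" events queued, which is the oddity raised above.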

Re "The task source for any tasks that are queued by EventSource objects
is the remote event task source.": can this statement be explained (with
references as needed)? What is the "remote event task source"?

Re "This event stream format's MIME type is text/event-stream.": Does
this imply that other MIME types cannot be used as event streams?

Re "Since connections established to remote servers for such resources
are expected to be long-lived": this seems to conflict with the "reset
the connection" functions above. How can the connection be long-lived if
it is regularly reset by the user agent?

Re "If the event name buffer is not the empty string but is also not a
valid event type name, as defined by the DOM Events specification": it
is unclear what a "valid event type name, as defined by the DOM Events
specification" is. As far as I can tell, the DOM Events specification
does not define what a valid event type name is. There is a reference to
"Application-specific event types should use a prefix string on the
event type name to avoid clashes with future general-purpose event
types." but no more specific guidance I can find. Please clarify.

Re "Otherwise, create an event that uses the MessageEvent interface,
with the event name message, which does not bubble, is not cancelable,
and has no default action.": sorry for the newbie type question, but
readers of this spec are likely to need the same guidance: please create
a link to what is meant by "bubble" etc. I'm sure this is some obvious
concept to browser designers, but is likely unknown to many spec
readers.

Re "Legacy proxy servers are known to, in certain cases, drop HTTP
connections after a short timeout. To protect against such proxy
servers, authors can include a comment line (one starting with a ':'
character) every 15 seconds or so.": Note that the impact of long-lived
connections on proxy servers is very significant, especially for proxy
servers as typically deployed in mobile networks. Provided that value is
being delivered by maintaining the connection (i.e. there is regular
data flow across it), the impact is just part of providing the data
service. However, developers of mobile webapps that use EventSource
should be encouraged to drop connections during low-event periods and
use connectionless Push methods (via an OMA Push server) to deliver the
events.
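
As a rough illustration of what the quoted keep-alive advice costs during
quiet periods, the following Python sketch interleaves comment lines into
a timed sequence of events. The 15-second interval is taken from the
quoted text; the function names, the ": keep-alive" comment text and the
(time, data) input format are assumptions made for this example.

```python
KEEPALIVE_INTERVAL = 15.0  # seconds, per "every 15 seconds or so"


def format_event(data):
    """Encode a payload as one event: a data line per payload line,
    terminated by a blank line."""
    return "".join("data: %s\n" % line for line in data.split("\n")) + "\n"


def with_keepalives(events, interval=KEEPALIVE_INTERVAL, start=0.0):
    """Yield (time, chunk) pairs, inserting a comment line whenever more
    than `interval` seconds would otherwise pass with nothing sent.

    `events` is an iterable of (time_in_seconds, data) sorted by time.
    """
    last_sent = start
    for when, data in events:
        while when - last_sent > interval:
            last_sent += interval
            yield last_sent, ": keep-alive\n"  # comment line, ignored by clients
        yield when, format_event(data)
        last_sent = when
```

For example, two real events 35 seconds apart cause two extra keep-alive
writes in between; over an hour with no events, the same loop would emit
a few hundred writes that carry no application data.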

Re "Implementations that support HTTP's per-server connection limitation
might run into trouble when opening multiple pages from a site if each
page has an EventSource to the same domain. Authors can avoid this using
the relatively complex mechanism of using unique domain names per
connection, or by allowing the user to enable or disable the EventSource
functionality on a per-page basis, or by sharing a single EventSource
object using a shared worker. [WEBWORKERS]": this whole paragraph
deserves more discussion. Does "Implementations that support HTTP's
per-server connection limitation" refer to the *client* implementation
or the *server* implementation? "Unique domain names per connection" is
unclear and does sound complex and non-scalable. "Allowing the user to
enable..." is also likely an experience-killer. The shared worker
concept seems to have some merit and should be explained further, e.g.
how would multiple EventSource requests (at the "boss" thread layer) get
passed through a worker thread, and how would the events coming back be
managed? Is this something that JavaScript frameworks are expected to
develop support for?
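
One possible reading of the shared worker suggestion is a simple fan-out:
one upstream connection, many page-level subscribers. The Python sketch
below models only that pattern. It is not the Web Workers API, and the
class SharedEventHub, its methods and the open_connection callback are
all hypothetical names chosen for illustration.

```python
class SharedEventHub:
    """Shares a single upstream connection among several pages."""

    def __init__(self, open_connection):
        self._open_connection = open_connection  # callable that opens the stream
        self._connection = None
        self._subscribers = {}  # page id -> callback
        self.connections_opened = 0

    def subscribe(self, page_id, callback):
        # Open the upstream connection lazily, for the first subscriber only.
        if self._connection is None:
            self._connection = self._open_connection()
            self.connections_opened += 1
        self._subscribers[page_id] = callback

    def unsubscribe(self, page_id):
        self._subscribers.pop(page_id, None)
        if not self._subscribers:
            self._connection = None  # last page gone: release the connection

    def dispatch(self, event):
        # Deliver one incoming event to every subscribed page.
        for callback in list(self._subscribers.values()):
            callback(event)
```

With two pages subscribed, only one connection is opened and each
incoming event is delivered to both. Whether the draft intends authors or
frameworks to build this layer is the open question.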

Re "Such formats could include systems like SMS-push; for example
servers could use Accept headers and HTTP redirects to an SMS-push
mechanism as a kind of protocol negotiation to reduce network load in
GSM environments.": This statement should be clarified. What's meant by
"use Accept headers and HTTP redirects to an SMS-push mechanism " is
unclear. If the client is offline, webapp servers can use OMA Push APIs
(including the OMA Push Access Protocol) to initiate delivery of Push
messages to the user agent, including the text/event-stream MIME type
(since OMA Push supports any MIME type). The latest version of the OMA
Push specification (OMA Push 2.2) defines an abstract interface via
which web applications can register for OMA Push events, and this
interface can be supported through a JavaScript-callable API (in
development).

Overall, the approach to the Server Sent Events API seems to rely
heavily on two methods which have significant consequences in mobile
networks. This may be what was meant by "network load in GSM
environments", but the text would benefit from a clearer statement of
the motivations for using alternative Push means (e.g. OMA Push) where
available. The two methods are (a) persistent HTTP connections; and
(b) polling via intermittently re-established data connections and HTTP
connections.

Here are some bullet points illustrating the issues with these methods.

The Challenges of Persistent Connection Based Push Methods

- In mobile environments, long-lived connections and/or polling (which
approximates the impact of a persistent connection, in different ways)
have impacts on:
	- Device resources
	- Mobile network resources

If the value of the service/content is high enough to the user, these
impacts are just part of providing the service. Otherwise they just
represent inefficient resource utilization, possibly driven by a lack of
developer awareness of best practices for mobile apps. Also, users have
a tendency to install applications, try them, and leave them
installed/operating while using them less and less. With multiple
applications installed, each obtaining content from a different source,
the network impact is also multiplied, e.g. a Facebook app + Twitter app
+ other infotainment app. Smartphones especially can be expected to
enable such cases of multiple simultaneously running applications.

Persistent Connection Challenges: Device Impacts

- Battery: 
	- Long-lived data connections with keep-alives have a
substantial impact on battery drain
- Data usage:
	- Background polling can have a substantial impact on service
cost to the user or Operator (for flat-rate plans)
	- Short bursty data traffic can affect radio bearer efficiency
	- Polling schedules may become synchronized, causing substantial
fluctuations in radio and IP network traffic

Persistent Connection Challenges: Network Impacts

- Long-lived connections put substantial pressure on IPv4 address space
in mobile networks
	- This increases the pressure to deploy IPv6 - a
significant impact on mobile networks
- Side effects of the high consumption of private IPv4 addresses in
mobile networks
	- DHCP lease time must be kept short, to terminate idle
connections. This conflicts with Push models dependent upon persistent
connections and intermittent keep-alive messages.
	- A very high volume of RADIUS Accounting Start/Stop messages
that systems must consume
- Public IPv4 space is also significantly impacted
	- Requires deployment of Network Address and Port Translation
(NAPT) edge routers
- Radio resource capacity is impacted by the number of active data
connections
- IP Core systems capacity is impacted by the number of active data and
TCP connections
	- SGSN, GGSN, routers, Policy Control systems, Firewalls, NAT
- HTTP Proxy/Gateways are particularly impacted by long-lived
connections
	- Proxy upgrades are a constant race to keep ahead of browser
and HTTP-based application usage
	- Persistent HTTP connections rapidly increase capacity issues

There are various proprietary persistent-connection Push protocols used
in mobile networks today, some with progressive backoff schemes or other
optimizations. The applications served by them are typically associated
with high-value data services/plans, so the resource impact of
maintaining the connections is just part of the service overhead. But
similar approaches don't have the same ROI attraction for lower-value
services and especially for the "long-tail" developer, who may be less
aware/concerned about the impact these content delivery methods have on
the user or service provider. Before W3C promotes such methods as a
"standard" intended for broad use in web applications, it should ensure
the caveats of using these methods are clear, especially in service
environments (e.g. mobile networks) where the resource implications are
more significant. At the least, best practices for use of Server Sent
Events should be developed to guide developers in effective use of this
API.
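
As one example of what such best-practice guidance might contain, here is
a minimal Python sketch of a progressive backoff with random jitter for
reconnection or polling delays. The jitter is meant to avoid the
synchronized polling schedules mentioned above. The base delay, growth
factor, cap and function names are illustrative assumptions, not values
taken from the draft or from any deployed protocol.

```python
import random


def backoff_delay(attempt, base=5.0, factor=2.0, cap=300.0):
    """Delay in seconds before retry number `attempt` (0-based): grows
    geometrically from `base` and is limited to `cap`."""
    return min(cap, base * factor ** attempt)


def jittered_delay(attempt, rng, base=5.0, factor=2.0, cap=300.0):
    """Pick a random delay between 0 and the backoff delay, so that many
    clients do not retry at the same instant."""
    return rng.uniform(0.0, backoff_delay(attempt, base, factor, cap))
```

With these assumed defaults the un-jittered delays run 5, 10, 20, 40
seconds and so on, up to a ceiling of 300 seconds; a client would call
jittered_delay(attempt, random.Random()) before each reconnection.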

Best regards,
Bryan Sullivan | AT&T

Received on Wednesday, 28 October 2009 06:43:29 UTC