Re: [EventSource] feedback from implementors from Ian Hickson on 2009-10-05 (public-webapps@w3.org from October to December 2009)

From: Ian Hickson <ian@hixie.ch>
Date: Mon, 5 Oct 2009 10:26:03 +0000 (UTC)
To: Per-Erik Brodin <per-erik.brodin@ericsson.com>, "Michael A. Puls II" <shadow2531@gmail.com>, Anne van Kesteren <annevk@opera.com>
Cc: public-webapps@w3.org
Message-ID: <Pine.LNX.4.62.0910050737410.25383@hixie.dreamhostps.com>

On Fri, 18 Sep 2009, Per-Erik Brodin wrote:
>
> When parsing an event stream, allowing carriage return, carriage return
> line feed, and line feed to denote line endings introduces unnecessary
> ambiguity into the spec. For example, the sequence "\r\r\n\n" could be
> interpreted as three or four line endings.

I've clarified the stream parsing format to define the above strictly as 
three line endings (not counting the end of the line).
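
As an illustration only (this is a sketch, not text from the spec), the
"three line endings" reading falls out of trying CRLF before a lone CR or
LF at each position. The name line_endings is made up for this example.

```python
import re

# CRLF is listed first so that "\r\n" is consumed as one line ending, not two.
LINE_ENDING = re.compile(r"\r\n|\r|\n")


def line_endings(stream: str) -> list[str]:
    """Return each line ending found in the stream, in order."""
    return LINE_ENDING.findall(stream)


# The sequence from the question: CR, then CRLF, then LF.
print(line_endings("\r\r\n\n"))  # ['\r', '\r\n', '\n']
```

Under this reading the sequence is a lone CR, a CRLF pair, and a lone LF.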


> Since the event stream format isn't yet widely established, and I don't 
> see any compelling arguments why allowing multiple line endings would be 
> beneficial, I hope that it's not too late to change this.

Windows and Unix in particular have different default line endings, so I 
think we would be making authors' lives unnecessarily complicated if we 
required a particular kind of line-ending.


> Looking at the ABNF, I don't see why colon would not be allowed in 
> any-char but line feed would, so I think that COLON has mistakenly been 
> disallowed instead of LINE FEED.

Fixed. (Actually only the comment was wrong.)


> In the second example of how to interpret an event stream it is stated 
> that two empty data lines result in an event being dispatched with data
> set to a single newline character. I don't see how this would be 
> possible given that a newline character should not be added if the data 
> buffer is empty.

Hm, good point. Fixed.


> It is explicitly stated that the URL attribute should return the 
> (absolute) URL that was passed to the constructor. Does that mean that 
> it should not change on permanent redirects where the actual URL that 
> the event source uses when reconnecting is changed? I'm fine with this 
> even though it will mean that we are keeping the original URL around 
> just for the sake of this attribute.

Yeah, it's just meant as a way to identify the EventSource objects (e.g. 
for debugging). Exposing redirected URLs would introduce race conditions 
and other complications.


> Another unclarity regards the onmessage attribute listener. Should that 
> trigger for all events of type MessageEvent or only for events that have 
> the event type set to "message"?

The spec explicitly says that 'onmessage' is an 'event handler' with the 
'corresponding event handler event type' 'message'; this unambiguously 
answers the above question. (You might have to look up the terms 'event 
handler' and 'event handler event type' in HTML5 to make much sense of 
this, admittedly.)


> When we did the implementation we interpreted the spec as stating the 
> latter, but that would mean you would have to use addEventListener when 
> the event field is used to set an event type/name other than "message".

Correct.


> Although it might be self-evident, the spec doesn't say that calling
> close should cancel a pending reconnect.

Fixed.


> Also, when calling close on an event source, the spec doesn't say 
> whether or not an error event should be dispatched, unlike the web 
> socket specification that says explicitly to fire an event. In my 
> opinion, an error event should not be dispatched since you may typically 
> call close from an error event listener in order to cancel a reconnect 
> in the case where the connection is reset, which would then result in a 
> second error event being dispatched.

No error event is dispatched, because it doesn't say to dispatch one. 
Similarly, no 'close' event is dispatched, no 'peanut' event is 
dispatched, and the disk doesn't get reformatted. :-)


> In the case of network errors, should the event source "fail the 
> connection" and not try to reconnect if you temporarily lose
> connectivity?

I've tried to clarify that a network error that aborts a connection 
doesn't preclude reconnecting.


> When the event source ends up in the CLOSED state it is pretty much 
> useless and if you want the application to reconnect you would have to 
> create a new event source and register all event listeners again.

Indeed. Then again, if that happens, your EventSources breaking is likely 
the least of your troubles.


> Maybe it would be useful to have a reconnect/reopen method to enable an 
> application to reestablish the connection from a previously closed event 
> source?

This is explicitly not supported, because we don't want people doing this. 
If the connection ever gets closed, then the site likely has a problem, 
and we do _not_ want to encourage authors to just try to reconnect, since 
that is more likely to make the problem worse than anything else.


> Finally, it could be useful to be able to reset the reconnection time to 
> the user agent default value by sending the retry field only and leave 
> out the value similar to how you reset the last event id.

What's the use case?


On Mon, 21 Sep 2009, Per-Erik Brodin wrote:
>
> While regular expressions are greedy by default, I have been told that 
> there is no way to express such behavior using ABNF. For what it is 
> worth, that means that the current ABNF definition of the event stream 
> format can't stand on its own.

The ABNF definition only defines whether the stream is valid or not; it
doesn't say how to parse it. For validity, the greediness doesn't matter.


> Keep in mind that we are parsing a continuous stream where data arrives 
> in chunks. It is entirely possible for a "\r\n" pair to be split up 
> between two chunks which could be handled by either 1) dispatching an 
> event immediately when receiving a carriage return and then upon 
> reception of the next chunk "remember" that the last character in the 
> previous chunk was a carriage return and discard the first character if 
> it happens to be line feed, or 2) not dispatching an event until the 
> next character after carriage return has been received which could lead 
> to delays in event dispatch. Both these options are far from ideal.

Both are conforming (and not really distinguishable from oddities of 
network traffic, in theory).
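
For what it's worth, option 1 can be sketched in a few lines. This is only
an illustration, assuming the stream has already been decoded to text; the
class name LineSplitter and its feed method are invented for the example
and are not from the spec or any implementation.

```python
class LineSplitter:
    """Incremental splitter for CR, LF and CRLF line endings (option 1)."""

    def __init__(self) -> None:
        self._buffer = ""          # characters of the current, unfinished line
        self._last_was_cr = False  # True if the previous character was a CR

    def feed(self, chunk: str) -> list[str]:
        """Consume one chunk and return the lines it completed."""
        lines = []
        for ch in chunk:
            if self._last_was_cr:
                self._last_was_cr = False
                if ch == "\n":
                    # Second half of a CRLF pair, possibly split across
                    # chunks; the line was already emitted at the CR.
                    continue
            if ch == "\r":
                lines.append(self._buffer)
                self._buffer = ""
                self._last_was_cr = True
            elif ch == "\n":
                lines.append(self._buffer)
                self._buffer = ""
            else:
                self._buffer += ch
        return lines


splitter = LineSplitter()
print(splitter.feed("data: a\r"))     # line emitted as soon as the CR arrives
print(splitter.feed("\ndata: b\n"))   # leading LF is discarded
```

The only state carried between chunks is the unfinished line and one flag,
so a "\r\n" pair split across two chunks yields the same lines as the same
pair arriving in one chunk.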


On Tue, 22 Sep 2009, Per-Erik Brodin wrote:
> 
> I'm envisioning a scenario where event stream data is aggregated from 
> various sources, and done so improperly so that multiple different line 
> endings end up in the stream. For example, appending a carriage return 
> to a string that is already ending with carriage return produces a 
> different result than appending a line feed to the same string.
> 
> Consider the following example:
> print "data: hello\r";
> print "data: world\r";
> print "\n";  # dispatch!

I would certainly encourage producers to standardise on a single 
line-ending convention.
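
To make the difference concrete, here is a rough sketch of what the three
print statements above produce, compared with the same data using LF
throughout. The helper names complete_lines and has_dispatch are made up
for this example; the only behaviour assumed is that a blank line is what
triggers dispatch.

```python
import re


def complete_lines(stream: str) -> list[str]:
    """Split on CRLF, CR or LF, dropping the unterminated remainder."""
    return re.split(r"\r\n|\r|\n", stream)[:-1]


def has_dispatch(stream: str) -> bool:
    """A blank line ends an event block and triggers dispatch."""
    return "" in complete_lines(stream)


mixed = "data: hello\r" + "data: world\r" + "\n"
consistent = "data: hello\n" + "data: world\n" + "\n"

print(complete_lines(mixed), has_dispatch(mixed))
print(complete_lines(consistent), has_dispatch(consistent))
```

In the mixed case the final "\n" pairs up with the preceding "\r" to form a
single CRLF, so there is no blank line and nothing is dispatched; in the
consistent case the final "\n" produces the blank line.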


> Since it's a new format being defined, why not make it clean and simple?

Because it wouldn't be simple for the half of the population dealing with 
a platform whose line endings don't match what we pick.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 5 October 2009 10:16:57 UTC