Re: [EventSource] feedback from implementors

From: Per-Erik Brodin <per-erik.brodin@ericsson.com>
Date: Mon, 21 Sep 2009 17:39:14 +0200
Message-ID: <4AB79E22.30004@ericsson.com>
To: "Michael A. Puls II" <shadow2531@gmail.com>
CC: public-webapps@w3.org

Michael A. Puls II wrote:
> On Fri, 18 Sep 2009 11:37:24 -0400, Per-Erik Brodin wrote:
> 
>> When parsing an event stream, allowing carriage return, carriage return
>> line feed, and line feed to denote line endings introduces unnecessary
>> ambiguity into the spec. For example, the sequence "\r\r\n\n" could be
>> interpreted as three or four line endings.
> 
> That would always be 3 lines: a mac, a windows and a nix. "\n\r\n\r" 
> would be the reverse order, but still 3.
So what you are saying is that "\r\n" will always be a Windows line
ending and never a Mac line ending followed by a Unix line ending?
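
To make the two readings of "\r\r\n\n" concrete, here is a minimal sketch that counts line endings both ways. The variable names are only illustrative, and nothing here is taken from the spec text.

```javascript
// The ambiguous sequence from the discussion above.
const ambiguous = "\r\r\n\n";

// Reading A: "\r\n" is tried first, so the pair counts as ONE line ending.
const pairAwareCount = ambiguous.match(/\r\n|\r|\n/g).length;

// Reading B: every "\r" and every "\n" is its own line ending.
const perCharCount = ambiguous.match(/[\r\n]/g).length;

console.log(pairAwareCount, perCharCount);
```

Which count is "right" depends entirely on whether the parser is required to prefer the two-character pair.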

> 
> Universal newline normalization for input with mixed newline formats:
> 
> // normalize newlines to \n
> .replace(/\r\n|\r/g, "\n");
> 
> // normalize newlines to \r\n
> .replace(/\r\n|\r|\n/g, "\r\n");
> 
> // normalize newlines to \r
> .replace(/\r\n|\n/g, "\r");
While regular expressions are greedy by default, I have been told that
there is no way to express such behavior using ABNF. For what it is
worth, that means that the current ABNF definition of the event stream
format can't stand on its own.
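
As a rough illustration of why the regular expressions quoted above give the three-line reading: JavaScript tries the alternatives of a pattern from left to right, so the result depends on "\r\n" being listed before "\r". The sample string and variable names below are assumptions made for the sketch.

```javascript
// Sample input mixing all three newline styles.
const mixedInput = "a\rb\r\nc\nd";

// "\r\n" listed first: the pair is consumed as a single line ending.
const pairFirst = mixedInput.replace(/\r\n|\r|\n/g, "\n");

// "\r" listed first: the "\r\n" alternative can never match, because the
// lone "\r" alternative always wins, so the pair becomes two line endings.
const crFirst = mixedInput.replace(/\r|\n|\r\n/g, "\n");

console.log(JSON.stringify(pairFirst)); // "a\nb\nc\nd"
console.log(JSON.stringify(crFirst));   // "a\nb\n\nc\nd"
```

A grammar that merely lists the three alternatives, without saying which one takes precedence, does not pin down either result.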

> 
> Ideally, I think it's often best to do the first to normalize to \n for 
> processing (like if you need to know line count) and then normalize to a 
> different format *if needed* afterwards.
> 
> IMO
> 
Keep in mind that we are parsing a continuous stream where data arrives
in chunks. It is entirely possible for a "\r\n" pair to be split
between two chunks. That could be handled in one of two ways:
1) dispatch an event immediately on receiving a carriage return, then
"remember" when the next chunk arrives that the previous chunk ended
with a carriage return, and discard the first character of the new
chunk if it happens to be a line feed; or 2) hold back the event until
the character after the carriage return has been received, which could
delay event dispatch. Both of these options are far from ideal.
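
For reference, option 1 could look roughly like the sketch below. `createLineSplitter`, `feed`, and the `onLine` callback are hypothetical names invented for this example, not part of any EventSource API, and the sketch only splits lines; it does not parse fields or dispatch events.

```javascript
// Hypothetical helper: splits a chunked stream into lines, treating
// "\r\n", "\r" and "\n" as line endings even when "\r\n" is split
// across two chunks.
function createLineSplitter(onLine) {
  let buffer = "";   // characters of the line currently being read
  let sawCR = false; // true if the previous character was "\r"

  return function feed(chunk) {
    for (const ch of chunk) {
      if (sawCR) {
        sawCR = false;
        // A "\n" right after "\r" is the second half of a "\r\n" pair,
        // whose line was already reported when the "\r" arrived.
        if (ch === "\n") continue;
      }
      if (ch === "\r") {
        onLine(buffer); // report immediately, without waiting for "\n"
        buffer = "";
        sawCR = true;
      } else if (ch === "\n") {
        onLine(buffer);
        buffer = "";
      } else {
        buffer += ch;
      }
    }
  };
}

// Usage: the first "\r\n" is split between the first two chunks.
const lines = [];
const feed = createLineSplitter((line) => lines.push(line));
feed("data: a\r");
feed("\ndata: b\n\r");
feed("\r\n");
console.log(lines); // [ 'data: a', 'data: b', '', '' ]
```

The cost is the extra piece of state (`sawCR`) that has to survive between chunks, which is exactly the "remember" step described above.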

--
Per-Erik Brodin
Ericsson Research
Received on Monday, 21 September 2009 15:41:08 GMT