Anne van Kesteren wrote:
> On Mon, 21 Sep 2009 17:39:14 +0200, Per-Erik Brodin wrote:
>> So what you are saying is that "\r\n" will always be a Windows line
>> ending and never a Mac line ending followed by a Unix line ending?
>
> That's what should happen as that would be consistent with other text
> formats, e.g. text/html. I guess this should be stated below the ABNF or
> the ABNF should be rewritten to a more parser/state-like thingy.

I'm envisioning a scenario where event stream data is aggregated from
various sources, and done so improperly so that multiple different line
endings end up in the stream. For example, appending a carriage return to
a string that is already ending with carriage return produces a different
result than appending a line feed to the same string. Since it's a new
format being defined, why not make it clean and simple? Consider the
following example:

  print "data: hello\r";
  print "data: world\r";
  print "\n"; # dispatch!

>> Keep in mind that we are parsing a continuous stream where data arrives
>> in chunks. It is entirely possible for a "\r\n" pair to be split up
>> between two chunks which could be handled by either 1) dispatching an
>> event immediately when receiving a carriage return and then upon
>> reception of the next chunk "remember" that the last character in the
>> previous chunk was a carriage return and discard the first character if
>> it happens to be line feed, or 2) not dispatching an event until the
>> next character after carriage return has been received which could lead
>> to delays in event dispatch. Both these options are far from ideal.
>
> The first option should not be too hard to implement right? Just a
> simple state variable in the tokenizer.
>

My point was not that it would be particularly hard to implement.

--
Per-Erik Brodin
Ericsson Research

Received on Tuesday, 22 September 2009 09:08:37 GMT
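
For illustration, option 1 from the quoted text (emit the line as soon as a carriage return arrives, remember that fact, and discard a line feed that immediately follows, even in the next chunk) might be sketched roughly as below. This is only a sketch under assumptions: the class name `LineSplitter`, the method `feed`, and the choice to return a list of completed lines are all hypothetical and come from neither the draft specification nor any implementation.

```python
class LineSplitter:
    """Hypothetical sketch: split a chunked stream into lines,
    treating "\r\n", "\r" and "\n" each as a single line ending."""

    def __init__(self):
        self._buf = []              # characters of the current, unfinished line
        self._last_was_cr = False   # state variable: previous character was "\r"

    def feed(self, chunk):
        """Consume one chunk and return the lines completed by it."""
        lines = []
        for ch in chunk:
            if ch == "\n" and self._last_was_cr:
                # Second half of a "\r\n" pair, possibly split across
                # chunks; the line was already emitted at the "\r".
                self._last_was_cr = False
                continue
            self._last_was_cr = (ch == "\r")
            if ch in ("\r", "\n"):
                lines.append("".join(self._buf))
                self._buf = []
            else:
                self._buf.append(ch)
        return lines


splitter = LineSplitter()
# The three print statements from the example, delivered as three chunks.
first = splitter.feed("data: hello\r")
second = splitter.feed("data: world\r")
third = splitter.feed("\n")
```

Under this sketch the final "\n" is absorbed as the tail of a "\r\n" pair, so `third` contains no empty line. A script author who expected that last print to produce the blank line that triggers dispatch would not get one, which appears to be the concern raised with the example.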