RE: Overlap between StreamReader and FileReader from Domenic Denicola on 2013-07-31 (public-webapps@w3.org from July to September 2013)

From: Domenic Denicola <domenic@domenicdenicola.com>
Date: Wed, 31 Jul 2013 17:17:49 +0000
To: Anne van Kesteren <annevk@annevk.nl>
CC: Jonas Sicking <jonas@sicking.cc>, Takeshi Yoshino <tyoshino@google.com>, Feras Moussa <feras.moussa@hotmail.com>, Travis Leithead <travis.leithead@microsoft.com>, Alex Russell <slightlyoff@google.com>, "Web Applications Working Group WG (public-webapps@w3.org)" <public-webapps@w3.org>, "i@izs.me" <i@izs.me>
Message-ID: <B4AE8F4E86E26C47AC407D49872F6F9F87893F1C@BY2PRD0510MB354.namprd05.prod.outlook.>

From: Anne van Kesteren [annevk@annevk.nl]

> It seems though that if you can change the way bytes are consumed while reading a stream you will end up with problematic scenarios. E.g. you consume 2 bytes of a 4-byte utf-8 sequence. Then switch to reading code points... Instantiating a ByteStream or TextStream in advance would address that.

Yes, and honestly I think I would prefer such an API. But IIRC Jonas earlier wanted to be able to read both binary and text from the same stream (did he have a specific use case?), and presumably that motivated Node's design as well.

I guess you can just say that if you're in binary mode, you should know what you're doing, and know precisely when it is correct to switch to string mode. If you switch in the middle of a four-byte sequence, you presumably meant to do so, and deserve to get back the mangled characters that result.
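
As a rough illustration of what "mangled" means here, the sketch below splits a four-byte UTF-8 sequence between a binary read and a text read. It assumes a runtime that provides the standard `TextEncoder` and `TextDecoder` globals; the variable names are made up for the example.

```javascript
// "a", U+1F600 (four UTF-8 bytes: F0 9F 98 80), "b" => 6 bytes in total.
const bytes = new TextEncoder().encode("a\u{1F600}b");

// Binary mode consumes 3 bytes: "a" plus the first two bytes of U+1F600.
const consumedAsBinary = bytes.subarray(0, 3);

// Switching to text mode now starts in the middle of the sequence.
const rest = bytes.subarray(3); // 98 80 62
const text = new TextDecoder("utf-8").decode(rest);

// The two orphaned continuation bytes each decode to U+FFFD.
console.log(JSON.stringify(text)); // two replacement characters, then "b"
```

The decoder cannot recover the code point because its lead bytes were already consumed elsewhere, so the reader gets replacement characters rather than an error.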

Making this work might require some kind of "put the bytes back" primitive, to avoid a situation where you read "too far" in binary mode and want to back up a bit before you engage string mode. I guess this is what Node.js's [unshift][1] is for.
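
A minimal sketch of such a primitive, under stated assumptions: `PushbackReader` and its methods are hypothetical and only loosely modeled on the idea behind Node's `unshift`; this is not Node's actual stream implementation. It assumes the standard `TextEncoder` and `TextDecoder` globals.

```javascript
// Hypothetical in-memory reader with a "put the bytes back" operation.
class PushbackReader {
  constructor(bytes) {
    this.buffer = Uint8Array.from(bytes);
  }

  // Binary mode: remove and return up to n bytes from the front.
  read(n) {
    const chunk = this.buffer.slice(0, n);
    this.buffer = this.buffer.slice(n);
    return chunk;
  }

  // Put bytes back at the front so the next read sees them again.
  unshift(chunk) {
    const merged = new Uint8Array(chunk.length + this.buffer.length);
    merged.set(chunk, 0);
    merged.set(this.buffer, chunk.length);
    this.buffer = merged;
  }

  // String mode: decode everything that is left as UTF-8.
  readText() {
    const text = new TextDecoder("utf-8").decode(this.buffer);
    this.buffer = new Uint8Array(0);
    return text;
  }
}

// A 2-byte binary header "HI", then the text U+1F600 "!" (4 + 1 bytes).
const input = new TextEncoder().encode("HI\u{1F600}!");
const reader = new PushbackReader(input);

const overRead = reader.read(4);       // read too far: header + half a code point
const header = overRead.slice(0, 2);   // the part that was actually wanted
reader.unshift(overRead.slice(2));     // back up before engaging string mode

const headerText = new TextDecoder("utf-8").decode(header);
const body = reader.readText();        // the code point is intact again
```

Without the `unshift` call, `readText()` would start two bytes into the four-byte sequence and return replacement characters instead.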

It would be cool to avoid all this though and just read either bytes or strings, without allowing switching. (Maybe feed the byte stream into a string decoder transform and get back a string stream?)
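
One way the decoder-transform idea could look, as a sketch only: `decodeChunks` is a hypothetical helper, and it leans on the standard `TextDecoder` with its `stream` option to carry partial sequences across chunk boundaries. A real stream API would be asynchronous; a synchronous generator keeps the example short.

```javascript
// Hypothetical transform: takes an iterable of byte chunks, yields strings.
function* decodeChunks(byteChunks) {
  const decoder = new TextDecoder("utf-8");
  for (const chunk of byteChunks) {
    // stream: true makes the decoder hold on to an incomplete trailing sequence.
    const piece = decoder.decode(chunk, { stream: true });
    if (piece.length > 0) yield piece;
  }
  // Flush: anything still incomplete at the end becomes U+FFFD.
  const tail = decoder.decode();
  if (tail.length > 0) yield tail;
}

// Chunk boundary falls in the middle of the four-byte sequence for U+1F600.
const encoded = new TextEncoder().encode("a\u{1F600}b");
const chunks = [encoded.subarray(0, 3), encoded.subarray(3)];

const strings = [...decodeChunks(chunks)];
// strings[0] is "a"; strings[1] holds the whole code point followed by "b".
```

Because the consumer only ever sees strings, there is no point at which it could switch modes in the middle of a code point.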

[1]: http://nodejs.org/api/stream.html#stream_readable_unshift_chunk

Received on Wednesday, 31 July 2013 17:18:25 UTC