- From: Janusz Majnert <j.majnert@samsung.com>
- Date: Mon, 19 Aug 2013 10:00:59 +0200
- To: public-script-coord@w3.org
On 2013-08-16 20:50, Jonas Sicking wrote:
> On Fri, Aug 16, 2013 at 3:13 AM, Janusz Majnert <j.majnert@samsung.com> wrote:
[cut]
>> b) If I open a text file using some multi-byte encoding and call
>> readText(2), will that increment the offset attribute by 2 or by the actual
>> amount of bytes read? Note that incrementing by amount of bytes might not be
>> possible before doing IO.
>>
>> c) If I open a text file using some multi-byte encoding then mix calls to
>> read() and readText()? Or if I first set offset to some arbitrary value,
>> that just happens to be not aligned with the code-point boundary and call
>> readText()?
>
> It's unclear if readText will make it into the first version. We
> should probably get agreement on binary data handling before adding
> text data to the mix.
>
> That said, my thinking was that readText operates on byte ranges. I.e.
> the size passed to readText is not how many characters to read, but
> rather how many bytes to read. That means that .readText(2) always
> increases .offset by 2, but you won't always get back a string which
> is 2 characters long.
>
> This matches how Blob and FileReader does text handling.
>

IMHO I would expect that by calling readText(5) I will read 5 characters...

Have you considered specifying the "Text" mode with openRead/openWrite?
For example, you could have:

    Promise<FileHandle> openRead((DOMString or File) path,
                                 optional DOMString textEncoding);
    Promise<FileHandleWritable> openWrite((DOMString or File) path,
                                          OpenWriteOptions options,
                                          optional DOMString textEncoding);

Where textEncoding means:
- undefined - don't open in "text" mode
- valid and supported encoding name - open in text mode and use this encoding
- null or unsupported encoding name - open in text mode and autodetect encoding

There would be no need to have readText(), and offset would be expressed
not in bytes but in actual characters/code-points, i.e. handle.read(2)
would read 2 characters, regardless of the encoding used.
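
To make the difference between the two semantics concrete, here is a minimal sketch that simulates both over an in-memory UTF-8 buffer. The helpers `readTextBytes` and `readTextChars` and the `ReadResult` type are hypothetical names invented for this illustration; they are not part of the proposed FileHandle API. The sketch only assumes the standard `TextEncoder`/`TextDecoder` interfaces and valid UTF-8 input for the character-based variant.

```typescript
// Hypothetical helpers (not part of any proposed API): they simulate the two
// readText() semantics discussed above over an in-memory UTF-8 buffer.

interface ReadResult {
  text: string;      // decoded text handed back to the caller
  newOffset: number; // byte offset after the read
}

// Byte-range semantics: consume exactly `size` bytes and decode whatever
// they contain. The offset always advances by `size` (clamped to the end).
function readTextBytes(data: Uint8Array, offset: number, size: number): ReadResult {
  const end = Math.min(offset + size, data.length);
  const text = new TextDecoder("utf-8").decode(data.subarray(offset, end));
  return { text, newOffset: end };
}

// Character semantics: consume as many bytes as needed to obtain `count`
// code points. Assumes valid UTF-8 starting at `offset`.
function readTextChars(data: Uint8Array, offset: number, count: number): ReadResult {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  let pos = offset;
  let codePoints = 0;
  while (pos < data.length && codePoints < count) {
    // With { stream: true } the decoder holds back incomplete sequences
    // until the final byte of the code point arrives.
    const chunk = decoder.decode(data.subarray(pos, pos + 1), { stream: true });
    pos += 1;
    if (chunk.length > 0) {
      text += chunk;
      codePoints += Array.from(chunk).length;
    }
  }
  return { text, newOffset: pos };
}

// "żółw" is 4 characters but 7 bytes in UTF-8 (2 + 2 + 2 + 1).
const sample = new TextEncoder().encode("żółw");

const byBytes = readTextBytes(sample, 0, 2);   // text "ż",  newOffset 2
const byChars = readTextChars(sample, 0, 2);   // text "żó", newOffset 4
const misaligned = readTextBytes(sample, 1, 2); // starts mid-character
```

With byte-range semantics the new offset is known before any IO happens, but the caller may get fewer characters than the size suggests, and a read starting at a misaligned offset (question c above) decodes to replacement characters. With character semantics the result length is predictable, but the byte offset depends on the data actually read.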
-- 
Janusz Majnert
Samsung R&D Institute Poland
Samsung Electronics
Received on Monday, 19 August 2013 08:01:47 UTC