- From: Jonas Sicking <jonas@sicking.cc>
- Date: Tue, 20 Aug 2013 00:48:55 -0700
- To: Janusz Majnert <j.majnert@samsung.com>
- Cc: public-script-coord@w3.org
- Message-ID: <CA+c2ei_6E5+_6-NEV7uxJ6hqm2vR5FwCR2VB98pVksBF3h_vvQ@mail.gmail.com>
On Aug 19, 2013 11:40 PM, "Janusz Majnert" <j.majnert@samsung.com> wrote:
> On 2013-08-19 18:37, Jonas Sicking wrote:
>> We still wouldn't be able to let offsets represent an offset in
>> characters since the only way to know where in the file the 10th
>> character is located is to read the whole file from the start. This
>> performs very poorly once you try to read from the millionth character
>> in a file.
>
> Yes, we would be able to let offset be expressed in characters. Yes, you would have to read the file from the beginning to get to the 10th character. Would this really perform poorly?

Yes, it would perform very poorly if you set the offset to 1000000 and do a read-modify-rewind-write.

Implementations would be forced to keep complex caches to remember which text offsets map to which byte offsets as to not have to read from the beginning of the file over and over. And do complex logic for when to invalidate/update those caches as the file is being modified. All of this while the file is open. Caching between file opens would likely not be doable at all.

> Are you saying that with TextEncoder/TextDecoder you don't have to read the file from the beginning, or that it somehow performs better?

The difference is that operations that are expensive should look expensive. Simply setting the offset to 1000000 and reading one character does not make it obvious that 1MB of IO is happening. If we force authors to do all of that IO and converting themselves it is clear that it is an expensive operation. And applications would then be encouraged to do their own offset caching if needed.

> IMHO, if you want to have a text-mode read() function, you need to accept that the performance will be worse than with plain read(), which is not a bad thing considering what that function would actually do.

It is OK that text reading is a few percent slower. Or even twice as slow. It is not OK if it is several orders of magnitude slower because it requires the whole file to be read.

>> I think my recommendation is to keep text support out of the spec for
>> now and instead rely on TextEncoder/TextDecoder. We can always add
>> text handling later or even in v2.
>
> Fine with me

Cool.

/ Jonas
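
To illustrate the point about character offsets, here is a minimal Python sketch; it is not part of the original message. It assumes the file is UTF-8 encoded and that "character" means a Unicode code point. The function name `byte_offset_of_char` is hypothetical. It shows why a character offset cannot be turned into a byte offset without scanning from the start of the data.

```python
def byte_offset_of_char(data: bytes, char_index: int) -> int:
    """Return the byte offset where the char_index-th code point (0-based)
    starts in UTF-8 encoded data.

    Assumption: data is valid UTF-8. An index equal to the number of
    code points returns len(data), i.e. the end-of-file position.
    """
    if char_index < 0:
        raise ValueError("char_index must be non-negative")
    seen = 0  # number of code points whose first byte has been passed
    for pos, byte in enumerate(data):
        # Bytes of the form 10xxxxxx are continuation bytes; any other
        # byte is the first byte of a new code point.
        if (byte & 0xC0) != 0x80:
            if seen == char_index:
                return pos
            seen += 1
    if seen == char_index:
        return len(data)
    raise IndexError("char_index is past the end of the data")


# Characters of 1, 2, 3, 4 and 1 bytes respectively.
sample = "a\u00e9\u20ac\U0001F600z".encode("utf-8")
offsets = [byte_offset_of_char(sample, i) for i in range(5)]
print(offsets)  # [0, 1, 3, 6, 10]
```

The loop inspects every byte before the target, so locating character 1000000 means reading at least 1,000,000 bytes, which is the hidden cost described above. A byte offset, by contrast, can be handed directly to a seek.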
Received on Tuesday, 20 August 2013 07:49:23 UTC