- From: Pablo Castro <Pablo.Castro@microsoft.com>
- Date: Tue, 15 Feb 2011 07:38:41 +0000
- To: Jonas Sicking <jonas@sicking.cc>, Jeremy Orlow <jorlow@chromium.org>
- CC: Shawn Wilsher <sdwilsh@mozilla.com>, "public-webapps@w3.org" <public-webapps@w3.org>
(sorry for my random out-of-timing previous email on this thread. please
see below for an actually up-to-date reply)

-----Original Message-----
From: Jonas Sicking [mailto:jonas@sicking.cc]
Sent: Monday, February 07, 2011 3:31 PM

On Mon, Feb 7, 2011 at 3:07 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
> On Mon, Feb 7, 2011 at 2:49 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>> On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
>> > On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking <jonas@sicking.cc> wrote:
>> >>
>> >> On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
>> >> > On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher <sdwilsh@mozilla.com> wrote:
>> >> >>
>> >> >> On 2/6/2011 12:42 PM, Jeremy Orlow wrote:
>> >> >>>
>> >> >>> My current thinking is that we should have some relatively large
>> >> >>> limit... maybe on the order of 64K? It seems like it'd be very
>> >> >>> difficult to hit such a limit with any sort of legitimate use
>> >> >>> case, and the chances of some subtle data-dependent error would be
>> >> >>> much less. But a 1GB key is just not going to work well in any
>> >> >>> implementation (if it doesn't simply OOM the process!). So despite
>> >> >>> what I said earlier, I guess I think we should have some limit...
>> >> >>> but keep it an order of magnitude or two larger than what we
>> >> >>> expect any legitimate usage to hit, just to keep the system as
>> >> >>> flexible as possible.
>> >> >>>
>> >> >>> Does that sound reasonable to people?
>> >> >>
>> >> >> Are we thinking about making this a MUST requirement, or a SHOULD?
>> >> >> I'm hesitant to spec an exact size as a MUST given how technology
>> >> >> has a way of changing in unexpected ways that makes old constraints
>> >> >> obsolete. But then, I may just be overly concerned about this too.
>> >> >
>> >> > If we put a limit, it'd be a MUST for sure. Otherwise people would
>> >> > develop against one of the implementations that don't place a limit,
>> >> > and then their app would break on the others.
>> >> >
>> >> > The reason that I suggested 64K is that it seems outrageously big
>> >> > for the data types that we're looking at, but it's too small to do
>> >> > much with base64-encoding binary blobs into it or anything else like
>> >> > that that I could see becoming rather large. So it seems like a
>> >> > limit that'd avoid major abuses (where someone is probably
>> >> > approaching the problem wrong) but would not come close to limiting
>> >> > any practical use I can imagine.
>> >> >
>> >> > With our architecture in Chrome, we will probably need to have some
>> >> > limit. We haven't decided what that is yet, but since I remember
>> >> > others saying similar things when we talked about this at TPAC, it
>> >> > seems like it might be best to standardize it--even though it does
>> >> > feel a bit dirty.
>> >>
>> >> One problem with putting a limit is that it basically forces
>> >> implementations to use a specific encoding, or pay a hefty price. For
>> >> example, if we choose a 64K limit, is that of UTF-8 data or of UTF-16
>> >> data? If it is of UTF-8 data and the implementation uses something
>> >> else to store the data, you risk having to convert the data just to
>> >> measure the size.
>> >> Possibly this would be different if we measured size using UTF-16, as
>> >> JavaScript more or less enforces that the source string is UTF-16,
>> >> which means that you can measure UTF-16 size on the cheap even if the
>> >> stored data uses a different format.
>> >
>> > That's a very good point. What's your suggestion then? Spec unlimited
>> > storage and have non-normative text saying that most implementations
>> > will likely have some limit? Maybe we can at least spec a minimum
>> > limit in terms of a particular character encoding? (Implementations
>> > could translate this into the worst-case size for their own native
>> > encoding and then ensure their limit is higher.)
>>
>> I'm fine with relying on UTF-16 encoding size and specifying a 64K
>> limit. Like Shawn points out, this API is fairly geared towards
>> JavaScript anyway (and I personally don't think that's a bad thing).
>> One thing that I just thought of is that even if implementations use
>> other encodings, you can in the vast majority of cases do a worst-case
>> estimate and easily see that the key that is used is below 64K.
>>
>> That said, does having a 64K limit really help anyone? In SQLite we can
>> easily store vastly more than that, enough that we don't have to
>> specify a limit. And my understanding is that in the Microsoft
>> implementation, the limits for what they can store without resorting to
>> various tricks are much lower. So since that implementation will have
>> to implement special handling of long keys anyway, is there a
>> difference between saying a 64K limit vs. saying unlimited?
>
> As I explained earlier: "The reason that I suggested 64K is that it
> seems outrageously big for the data types that we're looking at, but
> it's too small to do much with base64-encoding binary blobs into it or
> anything else like that that I could see becoming rather large. So it
> seems like a limit that'd avoid major abuses (where someone is probably
> approaching the problem wrong) but would not come close to limiting any
> practical use I can imagine."
>
> Since Chrome sandboxes the rendering process, if a web page allocates
> tons of memory and OOMs the process, you just get a sad tab or two. But
> since IndexedDB is partially in the browser process, I need to make
> sure a large key is not going to OOM that (and thus crash the whole
> browser... something a web page should never be able to do in Chrome).
>
> Do FF and/or IE have any plans for similar limits? If so, I really
> think we should coordinate.

We don't have any plans for similar limits right now, though of course
if it's added to the spec we'd follow that. I don't really feel strongly
on the issue as long as the limit is high enough (64K seems high enough,
2K does not) that non-malicious sites generally won't ever see the
limit.

I'm fine with imposing a limit, mostly for predictability reasons. In
practice I'm not sure this will help implementations a lot (I don't know
much about SQLite, but other databases tend to have smaller page sizes
and require non-blob data in records to fit in a single page). Even the
OOM issue Jeremy was discussing could apply to whole records or other
properties in records instead of keys, no?

In the end, we could just put a number on it to encourage keys to be
relatively small. The assumption of UTF-16 seems to be safe, and in any
case is the safer assumption (i.e., if some implementation used
something like UTF-8, then it'll just have margin to spare).

-pablo
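
To make the encoding discussion concrete, here is a minimal JavaScript
sketch of the kind of size check being debated. The 64K figure, the
function name, and the error behavior are illustrative assumptions, not
taken from the IndexedDB spec or any browser implementation.

    // Hypothetical pre-flight check for a string used as an IndexedDB key.
    var MAX_KEY_CODE_UNITS = 64 * 1024; // assumed limit, in UTF-16 code units

    function checkKeySize(key) {
      // JavaScript strings are sequences of UTF-16 code units, so .length
      // gives the UTF-16 size "on the cheap" -- no re-encoding needed.
      if (key.length > MAX_KEY_CODE_UNITS) {
        throw new Error('key exceeds the maximum key size');
      }

      // An engine that stores keys in UTF-8 can use a worst-case estimate
      // instead of converting: one UTF-16 code unit never needs more than
      // 3 UTF-8 bytes (BMP characters take at most 3 bytes; each half of a
      // surrogate pair accounts for 2 of its 4 bytes), so length * 3 is a
      // safe upper bound on the UTF-8 size.
      return {
        utf16Bytes: key.length * 2,
        utf8ByteUpperBound: key.length * 3
      };
    }

Under that bound, only keys longer than one third of the limit in code
units could ever require an exact UTF-8 count, which matches the point
above that a worst-case estimate settles the question in the vast
majority of cases.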
Received on Tuesday, 15 February 2011 07:39:15 UTC