Re: [Bug 11270] New: Interaction between in-line keys and key generators from Jeremy Orlow on 2010-11-12 (public-webapps@w3.org from October to December 2010)

From: Jeremy Orlow <jorlow@chromium.org>
Date: Fri, 12 Nov 2010 11:23:21 +0300
To: Keean Schupke <keean@fry-it.com>
Cc: Jonas Sicking <jonas@sicking.cc>, "Tab Atkins Jr." <jackalmage@gmail.com>, Pablo Castro <Pablo.Castro@microsoft.com>, "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <AANLkTi=r3vv8TSooftrrp=cM9XoUQkGn6_hiLqbv4EWV@mail.gmail.com>

We can't compact because the developer may be expecting to look items up by
ID with IDs in another table, on the server, in memory, etc.  There's no way
to do it.

J

On Fri, Nov 12, 2010 at 10:56 AM, Keean Schupke <keean@fry-it.com> wrote:

> The other thing you could do is specify that when you get a wrap (i.e.,
> someone inserts a key of MAXINT - 1) you auto-compact the table. If you
> really have run out of indexes there is not a lot you can do.
>
> The other thing to consider is that because JS uses signed arithmetic, it's
> really a 63-bit number... unless you want negative indexes appearing? (And
> how would that affect ordering and sorting?)
>
>
> Cheers,
> Keean.
>
>
> On 12 November 2010 07:36, Jeremy Orlow <jorlow@chromium.org> wrote:
>
>> On Fri, Nov 12, 2010 at 10:08 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>
>>> On Thu, Nov 11, 2010 at 9:22 PM, Jeremy Orlow <jorlow@chromium.org>
>>> wrote:
>>> > On Fri, Nov 12, 2010 at 12:32 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>> >>
>>> >> On Thu, Nov 11, 2010 at 11:41 AM, Jeremy Orlow <jorlow@chromium.org>
>>> >> wrote:
>>> >> > On Thu, Nov 11, 2010 at 6:41 PM, Tab Atkins Jr. <jackalmage@gmail.com>
>>> >> > wrote:
>>> >> >>
>>> >> >> On Thu, Nov 11, 2010 at 4:20 AM, Jeremy Orlow <jorlow@chromium.org>
>>> >> >> wrote:
>>> >> >> > What would we do if what they provided was not an integer?
>>> >> >>
>>> >> >> The behavior isn't very important; throwing would be fine here.  In
>>> >> >> MySQL, you can only put AUTO_INCREMENT on columns in the integer
>>> >> >> family.
>>> >> >>
>>> >> >>
>>> >> >> > What happens if
>>> >> >> > the number they insert is so big that the next one causes overflow?
>>> >> >>
>>> >> >> The same thing that happens if you do ++ on a variable holding a
>>> >> >> number that's too large.  Or, more directly, the same thing that
>>> >> >> happens if you somehow fill up a table to the integer limit (probably
>>> >> >> deleting rows along the way to free up space), and then try to add a
>>> >> >> new row.
>>> >> >>
>>> >> >>
>>> >> >> > What is
>>> >> >> > the use case for this?  Do we really think that most of the time
>>> >> >> > users do this it'll be intentional and not just a mistake?
>>> >> >>
>>> >> >> A big one is importing some data into a live table.  Many smaller ones
>>> >> >> are related to implicit data constraints that exist in the application
>>> >> >> but aren't directly expressed in the table.  I've had several times
>>> >> >> when I could normally just rely on auto-numbering for something, but
>>> >> >> occasionally, due to other data I was inserting elsewhere, had to
>>> >> >> specify a particular id.
>>> >> >
>>> >> > This assumes that your autonumbers aren't going to overlap and is going
>>> >> > to behave really badly when they do.
>>> >> > Honestly, I don't care too much about this, but I'm skeptical we're
>>> >> > doing the right thing here.
>>> >>
>>> >> Pablo did bring up a good use case, which is wanting to migrate
>>> >> existing data to a new object store, for example with a new schema.
>>> >> And every database examined so far has some ability to specify
>>> >> autonumbered columns.
>>> >>
>>> >> Overlaps aren't a problem in practice since 64-bit integers are really
>>> >> really big. So unless someone "maliciously" sets a number close to the
>>> >> upper bound of that then overlaps won't be a problem.
>>> >
>>> > Yes, but we'd need to spec this, implement it, and test it because someone
>>> > will try to do this maliciously.
>>>
>>> I'd say it's fine to treat the range of IDs as a hardware limitation.
>>> I.e. similarly to how we don't specify how much data a webpage is
>>> allowed to put into DOMStrings, at some point every implementation is
>>> going to run out of memory and effectively limit it. In practice this
>>> isn't a problem since the limit is high enough.
>>>
>>> Another would be to define that the ID is 64 bit and if you run out of
>>> IDs no more rows can be inserted into the objectStore. At that point
>>> the page is responsible for creating a new object store and compacting
>>> down IDs. In practice no page will run into this limitation if they
>>> use IDs increasing by one. Even if you generate a new ID a million
>>> times a second, it'll still take you over half a million years to run
>>> out of 64-bit IDs.
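
The estimate above can be checked with a few lines of arithmetic. This is
only a rough sketch: the rate of one million IDs per second comes from the
message, while the function name and the 365.25-day year are assumptions
made for illustration. The 63-bit case is included because of the signed
arithmetic concern raised elsewhere in the thread.

```python
# Back-of-the-envelope check of how long an ID space lasts.
# Assumption: a year is 365.25 days (a Julian year).
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_exhaust(bits: int, ids_per_second: int = 1_000_000) -> float:
    """Years needed to hand out every ID in a `bits`-wide space,
    generating `ids_per_second` new IDs every second without pause."""
    total_ids = 2 ** bits
    return total_ids / ids_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    for bits in (64, 63):
        print(f"{bits}-bit IDs: about {years_to_exhaust(bits):,.0f} years")
```

Under these assumptions the 64-bit figure comes out a little above 580,000
years, consistent with "over half a million years"; a 63-bit space halves
that to roughly 290,000 years.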
>>
>>
>> This seems reasonable.  OK, let's do it.
>>
>>
>>> > And, in the email you replied right under, I brought up the point that this
>>> > feature won't help someone who's trying to import data into a table that
>>> > already has data in it because some of it might clash.  So, just to make
>>> > sure we're all on the same page, the use case for this is restoring data
>>> > into an _empty_ object store, right?  (Because I don't think this is a good
>>> > solution for much else.)
>>>
>>> That's the main scenario I can think of that would require this, yes.
>>>
>>> / Jonas
>>>
>>
>>
>

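The behavior discussed in the thread can be sketched as a toy in-memory
store. This is a hedged illustration, not spec text or any browser's
implementation: the class, method, and exception names are hypothetical,
and the 64-bit limit, the "throw on a non-integer key" rule, and the
"no more inserts once IDs run out" rule are taken from the proposals quoted
above. Advancing the generator past an explicitly supplied key is an
assumption modeled on AUTO_INCREMENT-style columns.

```python
from typing import Dict, Optional


class KeyGeneratorExhausted(Exception):
    """Hypothetical error: the generator has no IDs left to hand out."""


class ObjectStore:
    """Toy store with an auto-incrementing key generator (illustrative only)."""

    MAX_KEY = 2 ** 64 - 1  # assumed 64-bit ID limit, as proposed in the thread

    def __init__(self) -> None:
        self.rows: Dict[int, object] = {}
        self.next_key = 1  # next ID the generator would produce

    def add(self, value: object, key: Optional[int] = None) -> int:
        if key is None:
            # No explicit key: ask the generator, refusing once it is exhausted.
            if self.next_key > self.MAX_KEY:
                raise KeyGeneratorExhausted("out of IDs; no more rows can be inserted")
            key = self.next_key
        elif isinstance(key, bool) or not isinstance(key, int):
            # Non-integer explicit keys are rejected ("throwing would be fine").
            raise TypeError("explicit keys must be integers")
        if key in self.rows:
            # Imported data can clash with rows that are already present.
            raise ValueError(f"key {key} already exists")
        self.rows[key] = value
        # Keep the generator ahead of every key seen so far.
        self.next_key = max(self.next_key, key + 1)
        return key
```

With this sketch, adding a row with an explicit key of 10 to an empty store
makes the next generated key 11, and adding a row with key
ObjectStore.MAX_KEY leaves the generator exhausted, so the next keyless add
raises KeyGeneratorExhausted. That is the "malicious" case raised in the
thread.
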
Received on Friday, 12 November 2010 08:24:14 UTC