Re: [Bug 11270] New: Interaction between in-line keys and key generators from Keean Schupke on 2010-11-12 (public-webapps@w3.org from October to December 2010)

From: Keean Schupke <keean@fry-it.com>
Date: Fri, 12 Nov 2010 08:27:20 +0000
To: Jeremy Orlow <jorlow@chromium.org>
Cc: Jonas Sicking <jonas@sicking.cc>, "Tab Atkins Jr." <jackalmage@gmail.com>, Pablo Castro <Pablo.Castro@microsoft.com>, "public-webapps@w3.org" <public-webapps@w3.org>
Message-ID: <AANLkTimyP99jLzJ1o_pwzMJGiQ0-m-fdA3WnwdAGwAMC@mail.gmail.com>
You can do it in SQL because tables that hold a reference to an ID can
declare the reference in the schema. I guess without the meta-data to do
this it cannot be done.

Why not get the auto-increment to wrap and skip collisions? What about
signed numbers?

Cheers,
Keean.
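
The wrap-and-skip idea above could look roughly like the sketch below. Everything in it is hypothetical and not taken from any spec or implementation: `WrappingKeyGenerator` and the tiny `MAX_KEY` are made up for illustration, and a plain `Set` stands in for the keys already present in an object store.

```typescript
// Hypothetical stand-in for an object store: just the set of keys in use.
// MAX_KEY is deliberately tiny so the wrap-around is easy to observe.
const MAX_KEY = 5;

class WrappingKeyGenerator {
  private next = 1;
  private used: Set<number>;

  constructor(used: Set<number>) {
    this.used = used;
  }

  // Returns the next free key, wrapping from MAX_KEY back to 1 and
  // skipping keys that are already taken. Throws when every key is in use.
  generate(): number {
    for (let tried = 0; tried < MAX_KEY; tried++) {
      const candidate = this.next;
      this.next = candidate === MAX_KEY ? 1 : candidate + 1;
      if (!this.used.has(candidate)) {
        this.used.add(candidate);
        return candidate;
      }
    }
    throw new Error("key space exhausted");
  }
}

// Keys 2 and 5 were inserted explicitly before the generator was used.
const used = new Set<number>([2, 5]);
const gen = new WrappingKeyGenerator(used);
const issued = [gen.generate(), gen.generate(), gen.generate()]; // [1, 3, 4]

// After a delete, the wrapped generator hands the same key out again.
used.delete(1);
const reissued = gen.generate(); // 1
```

The last two lines show the cost of this approach: once the counter wraps, a deleted key can be reissued, so anything outside the store that still remembers the old ID would now point at a different record.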

On 12 November 2010 08:23, Jeremy Orlow <jorlow@chromium.org> wrote:

> We can't compact because the developer may be expecting to look items up by
> ID with IDs in another table, on the server, in memory, etc.  There's no way
> to do it.
>
> J
>
>
> On Fri, Nov 12, 2010 at 10:56 AM, Keean Schupke <keean@fry-it.com> wrote:
>
>> The other thing you could do is specify that when you get a wrap (i.e.
>> someone inserts a key of MAXINT - 1) you auto-compact the table. If you
>> really have run out of indexes there is not a lot you can do.
>>
>> The other thing to consider is that because JS uses signed arithmetic, it's
>> really a 63-bit number... unless you want negative indexes appearing? (And
>> how would that affect ordering and sorting?)
>>
>>
>> Cheers,
>> Keean.
>>
>>
>> On 12 November 2010 07:36, Jeremy Orlow <jorlow@chromium.org> wrote:
>>
>>> On Fri, Nov 12, 2010 at 10:08 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>>
>>>> On Thu, Nov 11, 2010 at 9:22 PM, Jeremy Orlow <jorlow@chromium.org>
>>>> wrote:
>>>> > On Fri, Nov 12, 2010 at 12:32 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>>> >>
>>>> >> On Thu, Nov 11, 2010 at 11:41 AM, Jeremy Orlow <jorlow@chromium.org>
>>>> >> wrote:
>>>> >> > On Thu, Nov 11, 2010 at 6:41 PM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
>>>> >> >>
>>>> >> >> On Thu, Nov 11, 2010 at 4:20 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
>>>> >> >> > What would we do if what they provided was not an integer?
>>>> >> >>
>>>> >> >> The behavior isn't very important; throwing would be fine here.
>>>> >> >> In MySQL, you can only put AUTO_INCREMENT on columns in the
>>>> >> >> integer family.
>>>> >> >>
>>>> >> >>
>>>> >> >> > What happens if the number they insert is so big that the next
>>>> >> >> > one causes overflow?
>>>> >> >>
>>>> >> >> The same thing that happens if you do ++ on a variable holding a
>>>> >> >> number that's too large.  Or, more directly, the same thing that
>>>> >> >> happens if you somehow fill up a table to the integer limit
>>>> >> >> (probably deleting rows along the way to free up space), and then
>>>> >> >> try to add a new row.
>>>> >> >>
>>>> >> >>
>>>> >> >> > What is the use case for this?  Do we really think that most of
>>>> >> >> > the time users do this it'll be intentional and not just a
>>>> >> >> > mistake?
>>>> >> >>
>>>> >> >> A big one is importing some data into a live table.  Many smaller
>>>> >> >> ones are related to implicit data constraints that exist in the
>>>> >> >> application but aren't directly expressed in the table.  I've had
>>>> >> >> several times when I could normally just rely on auto-numbering
>>>> >> >> for something, but occasionally, due to other data I was
>>>> >> >> inserting elsewhere, had to specify a particular id.
>>>> >> >
>>>> >> > This assumes that your autonumbers aren't going to overlap and is
>>>> >> > going to behave really badly when they do.
>>>> >> > Honestly, I don't care too much about this, but I'm skeptical
>>>> >> > we're doing the right thing here.
>>>> >>
>>>> >> Pablo did bring up a good use case, which is wanting to migrate
>>>> >> existing data to a new object store, for example with a new schema.
>>>> >> And every database examined so far has some ability to specify
>>>> >> autonumbered columns.
>>>> >>
>>>> >> Overlaps aren't a problem in practice since 64-bit integers are
>>>> >> really, really big. So unless someone "maliciously" sets a number
>>>> >> close to the upper bound of that, overlaps won't be a problem.
>>>> >
>>>> > Yes, but we'd need to spec this, implement it, and test it because
>>>> > someone will try to do this maliciously.
>>>>
>>>> I'd say it's fine to treat the range of IDs as a hardware limitation.
>>>> I.e. similarly to how we don't specify how much data a webpage is
>>>> allowed to put into DOMStrings, at some point every implementation is
>>>> going to run out of memory and effectively limit it. In practice this
>>>> isn't a problem since the limit is high enough.
>>>>
>>>> Another would be to define that the ID is 64-bit and if you run out of
>>>> IDs no more rows can be inserted into the objectStore. At that point
>>>> the page is responsible for creating a new object store and compacting
>>>> down IDs. In practice no page will run into this limitation if they
>>>> use IDs increasing by one. Even if you generate a new ID a million
>>>> times a second, it'll still take you over half a million years to run
>>>> out of 64-bit IDs.
>>>
>>>
>>> This seems reasonable.  OK, let's do it.
>>>
>>>
>>>> > And, in the email you replied right under, I brought up the point
>>>> > that this feature won't help someone who's trying to import data into
>>>> > a table that already has data in it because some of it might clash.
>>>> > So, just to make sure we're all on the same page, the use case for
>>>> > this is restoring data into an _empty_ object store, right?  (Because
>>>> > I don't think this is a good solution for much else.)
>>>>
>>>> That's the main scenario I can think of that would require this, yes.
>>>>
>>>> / Jonas
>>>>
>>>
>>>
>>
>
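
The "over half a million years" estimate quoted above can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope calculation: the rate of one million IDs per second comes from the thread, while the 365.25-day year and the helper name `yearsToExhaust` are assumptions made here. The last two lines assume the counter is held in an ordinary JavaScript number (an IEEE 754 double), which the thread does not settle.

```typescript
const IDS_PER_SECOND = 1e6;        // rate used in the thread
const SECONDS_PER_YEAR = 31557600; // 365.25 days (assumption)

// Whole years needed to hand out 2^bits consecutive IDs at that rate.
function yearsToExhaust(bits: number): number {
  return Math.floor(Math.pow(2, bits) / IDS_PER_SECOND / SECONDS_PER_YEAR);
}

const years64 = yearsToExhaust(64); // unsigned 64-bit range: 584542
const years63 = yearsToExhaust(63); // non-negative half of a signed 64-bit range: 292271
const years53 = yearsToExhaust(53); // integers a double can count without gaps: 285

// Assumption: the counter is a plain JavaScript number. At 2^53,
// adding 1 no longer changes the value, so ++ silently stops advancing.
const limit = Math.pow(2, 53);
const stuck = limit + 1 === limit; // true
```

So the half-million-year figure holds for a full unsigned 64-bit range, drops to roughly 292,000 years if only the non-negative half of a signed range is used, and to about 285 years if the counter is limited to integers a double can represent exactly. All three are far beyond what a page incrementing by one would reach.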
Received on Friday, 12 November 2010 08:27:58 UTC