
Re: [Bug 11270] New: Interaction between in-line keys and key generators

From: Jeremy Orlow <jorlow@chromium.org>
Date: Fri, 12 Nov 2010 11:36:40 +0300
Message-ID: <AANLkTineTdTZkHyV1oFGKHLed5eHDnfh5-KNptAk1GLg@mail.gmail.com>
To: Keean Schupke <keean@fry-it.com>
Cc: Jonas Sicking <jonas@sicking.cc>, "Tab Atkins Jr." <jackalmage@gmail.com>, Pablo Castro <Pablo.Castro@microsoft.com>, "public-webapps@w3.org" <public-webapps@w3.org>
On Fri, Nov 12, 2010 at 11:27 AM, Keean Schupke <keean@fry-it.com> wrote:

> You can do it in SQL because tables that hold a reference to an ID can
> declare the reference in the schema. I guess without the meta-data to do
> this it cannot be done.


Even in SQL, I'd be very hesitant to do this.


> Why not get the auto-increment to wrap and skip collisions? What about
> signed numbers?
>

Exactly.  If we're going to support this, let's keep it super simple.  As
Jonas mentioned, it's very unlikely that anyone would hit the 64bit limit in
legitimate usage, so it's not worth trying to gracefully handle such a
situation and adding a lot of surface area.
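
To make the "super simple" option concrete, here is a minimal sketch in TypeScript of a key generator that never wraps, skips collisions, or compacts. It is an illustration only, not the IndexedDB API: the names `KeyGenerator`, `generate`, `noteExplicitKey`, and `DEFAULT_MAX_KEY` are hypothetical. The thread talks about 64-bit IDs, but a plain JavaScript number only represents every integer exactly up to 2^53, so the sketch assumes a default limit of 2^53 - 1 and lets the caller pass a different one.

```typescript
// Hypothetical sketch of the rule discussed in this thread; not a real API.
// Assumed default limit: 2^53 - 1, the largest integer below which a
// JavaScript number still represents every integer exactly.
const DEFAULT_MAX_KEY: number = Math.pow(2, 53) - 1;

class KeyGenerator {
  private next: number = 1;
  private maxKey: number;

  constructor(maxKey: number = DEFAULT_MAX_KEY) {
    this.maxKey = maxKey;
  }

  // Hand out the next key. Once the range is used up, every call fails:
  // no wrapping, no collision skipping, no compaction.
  generate(): number {
    if (this.next > this.maxKey) {
      throw new RangeError("key generator exhausted");
    }
    const key = this.next;
    this.next = key + 1;
    return key;
  }

  // Called when a record is stored with an explicit key.
  // Non-integers throw; a key at or above the counter moves the counter past it.
  noteExplicitKey(key: number): void {
    if (!isFinite(key) || Math.floor(key) !== key) {
      throw new TypeError("explicit key must be an integer");
    }
    if (key >= this.next) {
      // Clamp so that an oversized key simply exhausts the generator.
      this.next = Math.min(key, this.maxKey) + 1;
    }
  }
}
```

With this shape, inserting an explicit key near the limit just means later generated inserts fail, and it is up to the page to move its data to a new object store.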


> Cheers,
> Keean.
>
>
> On 12 November 2010 08:23, Jeremy Orlow <jorlow@chromium.org> wrote:
>
>> We can't compact because the developer may be expecting to look items up
>> by ID with IDs in another table, on the server, in memory, etc.  There's no
>> way to do it.
>>
>> J
>>
>>
>> On Fri, Nov 12, 2010 at 10:56 AM, Keean Schupke <keean@fry-it.com> wrote:
>>
>>> The other thing you could do is specify that when you get a wrap (i.e.
>>> someone inserts a key of MAXINT - 1) you auto-compact the table. If you
>>> really have run out of indexes there is not a lot you can do.
>>>
>>> The other thing to consider is that because JS uses signed arithmetic,
>>> it's really a 63bit number... unless you want negative indexes appearing?
>>> (And how would that affect ordering and sorting?)
>>>
>>>
>>> Cheers,
>>> Keean.
>>>
>>>
>>> On 12 November 2010 07:36, Jeremy Orlow <jorlow@chromium.org> wrote:
>>>
>>>> On Fri, Nov 12, 2010 at 10:08 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>>>
>>>>> On Thu, Nov 11, 2010 at 9:22 PM, Jeremy Orlow <jorlow@chromium.org> wrote:
>>>>> > On Fri, Nov 12, 2010 at 12:32 AM, Jonas Sicking <jonas@sicking.cc> wrote:
>>>>> >>
>>>>> >> On Thu, Nov 11, 2010 at 11:41 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
>>>>> >> > On Thu, Nov 11, 2010 at 6:41 PM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
>>>>> >> >>
>>>>> >> >> On Thu, Nov 11, 2010 at 4:20 AM, Jeremy Orlow <jorlow@chromium.org> wrote:
>>>>> >> >> > What would we do if what they provided was not an integer?
>>>>> >> >>
>>>>> >> >> The behavior isn't very important; throwing would be fine here.  In
>>>>> >> >> MySQL, you can only put AUTO_INCREMENT on columns in the integer
>>>>> >> >> family.
>>>>> >> >>
>>>>> >> >>
>>>>> >> >> > What happens if the number they insert is so big that the next one
>>>>> >> >> > causes overflow?
>>>>> >> >>
>>>>> >> >> The same thing that happens if you do ++ on a variable holding a
>>>>> >> >> number that's too large.  Or, more directly, the same thing that
>>>>> >> >> happens if you somehow fill up a table to the integer limit (probably
>>>>> >> >> deleting rows along the way to free up space), and then try to add a
>>>>> >> >> new row.
>>>>> >> >>
>>>>> >> >>
>>>>> >> >> > What is the use case for this?  Do we really think that most of the
>>>>> >> >> > time users do this it'll be intentional and not just a mistake?
>>>>> >> >>
>>>>> >> >> A big one is importing some data into a live table.  Many smaller ones
>>>>> >> >> are related to implicit data constraints that exist in the application
>>>>> >> >> but aren't directly expressed in the table.  I've had several times
>>>>> >> >> when I could normally just rely on auto-numbering for something, but
>>>>> >> >> occasionally, due to other data I was inserting elsewhere, had to
>>>>> >> >> specify a particular id.
>>>>> >> >
>>>>> >> > This assumes that your autonumbers aren't going to overlap and is going
>>>>> >> > to behave really badly when they do.
>>>>> >> > Honestly, I don't care too much about this, but I'm skeptical we're
>>>>> >> > doing the right thing here.
>>>>> >>
>>>>> >> Pablo did bring up a good use case, which is wanting to migrate
>>>>> >> existing data to a new object store, for example with a new schema.
>>>>> >> And every database examined so far has some ability to specify
>>>>> >> autonumbered columns.
>>>>> >>
>>>>> >> Overlaps aren't a problem in practice since 64bit integers are really
>>>>> >> really big. So unless someone "maliciously" sets a number close to the
>>>>> >> upper bound of that then overlaps won't be a problem.
>>>>> >
>>>>> > Yes, but we'd need to spec this, implement it, and test it because someone
>>>>> > will try to do this maliciously.
>>>>>
>>>>> I'd say it's fine to treat the range of IDs as a hardware limitation.
>>>>> I.e. similarly to how we don't specify how much data a webpage is
>>>>> allowed to put into DOMStrings, at some point every implementation is
>>>>> going to run out of memory and effectively limit it. In practice this
>>>>> isn't a problem since the limit is high enough.
>>>>>
>>>>> Another would be to define that the ID is 64 bit and if you run out of
>>>>> IDs no more rows can be inserted into the objectStore. At that point
>>>>> the page is responsible for creating a new object store and compacting
>>>>> down IDs. In practice no page will run into this limitation if they
>>>>> use IDs increasing by one. Even if you generate a new ID a million
>>>>> times a second, it'll still take you over half a million years to run
>>>>> out of 64bit IDs.
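
Jonas's estimate can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation: the function name `yearsToExhaust` is hypothetical, a year is assumed to be 365.25 days, and the result is a floating-point approximation.

```typescript
// Years needed to use up an ID space of the given width at a steady rate.
// Assumption: one year = 365.25 days.
function yearsToExhaust(bits: number, idsPerSecond: number): number {
  const secondsPerYear = 365.25 * 24 * 60 * 60;
  return Math.pow(2, bits) / idsPerSecond / secondsPerYear;
}

const idsPerSecond = 1e6; // one million new IDs per second, as in the quoted message

console.log(Math.round(yearsToExhaust(64, idsPerSecond))); // about 584,542 years
console.log(Math.round(yearsToExhaust(63, idsPerSecond))); // about 292,271 years
```

Even if only 63 bits were usable, as Keean suggests, the same rate would still take well over a quarter of a million years.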
>>>>
>>>>
>>>> This seems reasonable.  OK, let's do it.
>>>>
>>>>
>>>>> > And, in the email you replied right under, I brought up the point that this
>>>>> > feature won't help someone who's trying to import data into a table that
>>>>> > already has data in it because some of it might clash.  So, just to make
>>>>> > sure we're all on the same page, the use case for this is restoring data
>>>>> > into an _empty_ object store, right?  (Because I don't think this is a good
>>>>> > solution for much else.)
>>>>>
>>>>> That's the main scenario I can think of that would require this yes.
>>>>>
>>>>> / Jonas
>>>>>
>>>>
>>>>
>>>
>>
>
Received on Friday, 12 November 2010 08:37:33 GMT
