- From: Aryeh Gregor <Simetrical+w3c@gmail.com>
- Date: Wed, 4 May 2011 19:33:33 -0400
- To: Jonas Sicking <jonas@sicking.cc>, Keean Schupke <keean@fry-it.com>
- Cc: Pablo Castro <Pablo.Castro@microsoft.com>, "public-webapps@w3.org" <public-webapps@w3.org>
On Tue, May 3, 2011 at 7:57 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> I don't think we should do callbacks for the first version of
> javascript. It gets very messy since we can't rely on that the script
> function will be returning stable values.

The worst that would happen if it didn't return stable values is that
sorting would return unpredictable results.

> So the choice here really is between only supporting some form of
> binary sorting, or supporting a built-in set of collations. Anything
> else will have to wait for version 2 in my opinion.

I think it would be a mistake to try supporting a limited set of
natural-language collations. Binary collation is fine for a first
version. MySQL only supported binary collation up through version 4,
for instance.

On Wed, May 4, 2011 at 3:49 AM, Keean Schupke <keean@fry-it.com> wrote:
> I thought only the app that created the db could open it (for security
> reasons)... so it becomes the app's responsibility to do version control.
> The comparison function is not going to change by itself - someone has to go
> into the code and change it, when they do that they should up the revision
> of the database, if that change is incompatible.

Why should we let such a pitfall exist if we can just store the
function and avoid the issue?

> There is exactly the same problem with object properties. If the app changes
> to expect a new property on all objects stored, then the app has to
> correctly deal with the update.

If a requested property doesn't exist, I assume the API will fail
immediately with a clear error code. It will not fail silently and
mysteriously with no error code. (Again, I haven't looked at it
closely, or tried to use it.)

> 2) making things easy for the user - for me a simpler more predictable API
> is better for the user. Having a function stored inside the database is bad,
> because you cannot see what function might be stored in there...

We could let you query the stored function.

> it might be
> a function from a previous version of the code and cause all sorts of
> strange bugs (which will only affect certain users with a certain version of
> the function stored in their DB).

It will cause *far* fewer strange bugs than if you have one index that
used two different collations, which is the alternative possibility.
If the function is stored, the worst case will be that the collation
function is out of date. In practice, authors will mostly want to use
established collation functions like UCA and won't mind if they're out
of date. They'll also only very rarely have occasion to deliberately
change the function.

On Wed, May 4, 2011 at 4:01 PM, Jonas Sicking <jonas@sicking.cc> wrote:
> Browsers can certainly deal with this, and ensure that the only one
> suffering is the author of the buggy algorithm. However this comes at
> a cost in that the browser sorting algorithm can't go into infinite
> loops or crash even in the face of the most ridiculous comparison
> algorithm. In other words, the browser will likely have to use a
> slower sorting implementation in order to be robust.

The browser will only run the function once every time the given field
changes, and change the value used in the index if it's different from
the current one. The actual sorting will still be binary, just with a
user-provided key. So there's no possibility of especially bad effects
if you're given a bad function. You're only running it once per value,
so it's no worse than any other function that's run a bunch of times.

We aren't talking about a sort()-style comparison function that
returns -1 or 0 or 1. We're talking about a function that takes a
string as input, and outputs a string to be used in the index as the
key for the object in question. I guess you *could* also do it as a
comparison function -- would probably be easier to write, but also a
lot easier to get badly wrong, and you'd have to do a bunch of function
calls on insert or update instead of just one.
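To make the key-function idea concrete, here is a minimal sketch in plain JavaScript. It is an illustration under stated assumptions, not a proposed API: the names `collationKey`, `binaryCompare` and `buildIndex` are hypothetical, and lowercasing is only a stand-in for a real collation such as UCA.

```javascript
// Hypothetical key function: maps a stored string to the string that is
// actually written into the index. Lowercasing is a stand-in for a real
// collation algorithm.
function collationKey(value) {
  return value.toLowerCase();
}

// Plain binary (code-unit) comparison of two derived keys.
// No user-supplied code runs while sorting.
function binaryCompare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Build a toy index: the key function runs once per value, and the sort
// only ever compares the derived keys.
function buildIndex(values) {
  return values
    .map((value) => ({ key: collationKey(value), value }))
    .sort((x, y) => binaryCompare(x.key, y.key));
}

const index = buildIndex(["banana", "Cherry", "apple"]);
console.log(index.map((entry) => entry.value));
```

Sorting the raw strings in binary order would put "Cherry" first, since uppercase letters sort before lowercase ones; sorting on the derived keys gives apple, banana, Cherry while the sort itself stays binary.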
> Additionally, there is a significant cost involved in transitioning
> between the C++ code implementing the sorting algorithm, and the
> javascript implemented callback. That is on top of the cost of
> implementing the comparison function in javascript. Even in the best
> JITs, there is a significant overhead to both these parts.

It would only have to be run once per row (object?) modified. Not run
at all for reads. Would that really be so bad? Also, most authors
would be content with built-in CLDR-based sort functions, which could
be C++.
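A rough way to see the cost argument is a toy in-memory index that counts how often the key function runs. This is only a sketch: `ToyIndex` and its methods are hypothetical and have nothing to do with the real IndexedDB interface, and for simplicity it recomputes the key on every write rather than checking whether the field actually changed.

```javascript
// Toy in-memory index (hypothetical, not the IndexedDB API). The stored
// key function runs only on writes; reads use the precomputed keys.
class ToyIndex {
  constructor(keyFn) {
    this.keyFn = keyFn;
    this.keyCalls = 0; // how many times the key function has run
    this.entries = []; // kept sorted by derived key
  }

  // Insert or update: exactly one key-function call per write.
  put(id, value) {
    this.keyCalls += 1;
    const key = this.keyFn(value);
    // Drop any previous entry for this id, then insert in key order
    // using plain binary string comparison.
    this.entries = this.entries.filter((e) => e.id !== id);
    const pos = this.entries.findIndex((e) => e.key > key);
    const entry = { id, key, value };
    if (pos === -1) this.entries.push(entry);
    else this.entries.splice(pos, 0, entry);
  }

  // Reads walk the stored order; the key function is never called.
  valuesInOrder() {
    return this.entries.map((e) => e.value);
  }
}

const toy = new ToyIndex((s) => s.toLowerCase());
toy.put(1, "banana");
toy.put(2, "Cherry");
toy.put(3, "apple");
toy.put(2, "Avocado"); // update: one more key-function call
const ordered = toy.valuesInOrder();
toy.valuesInOrder(); // a second read adds no key-function calls
console.log(ordered, toy.keyCalls);
```

After three inserts and one update the counter stands at four, regardless of how many reads follow. A comparator-based design would instead call into script several times per insert, once for each comparison the sort or tree lookup needs.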
Received on Wednesday, 4 May 2011 23:34:20 UTC