Re: [IndexedDB] Spec changes for international language support

2011/3/17 Pablo Castro <Pablo.Castro@microsoft.com>:
>
> From: Jonas Sicking [mailto:jonas@sicking.cc]
> Sent: Tuesday, March 08, 2011 1:11 PM
>
>>> All in all, is there anything preventing adding the API Pablo suggests
>>> in this thread to the IndexedDB spec drafts?
>
> I wanted to propose a couple of specific tweaks to the initial proposal and then unless I hear pushback start editing this into the spec.
>
> From reading the details on this thread I'm starting to realize that per-database collations won't do it. What did it for me was the example that has a fuzzier matching mode (case/accent insensitive). This is exactly the kind of index I would want to sort people's names in my address book, but most likely not the index I'll want to use for my primary key.
>
> Refactoring the API to accommodate for this would mean to move the setCollation() method and the collation property to the object store and index objects. If we were willing to live without the ability to change them we could take collation as one of the optional parameters to createObjectStore()/createIndex() and reduce a bit of surface area...

Unfortunately I think you bring up good use cases for
per-objectStore/index collations. It's definitely tempting to just add
it as an optional parameter to createObjectStore/createIndex. The
downside is obviously pushing more complexity onto web developers.
Complexity which will be duplicated across sites.
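
To illustrate the per-index use case, here is a rough sketch of two
indexes over the same names with different collations. It is only an
in-memory stand-in: SortedIndex and its constructor parameters are
made up for illustration and are not proposed spec API. The sketch
assumes a runtime that provides Intl.Collator.

```typescript
// Hypothetical in-memory stand-in for an index; not part of the IndexedDB API.
class SortedIndex {
  private readonly keys: string[] = [];
  private readonly collator: Intl.Collator;

  // `collation` is a BCP47 language tag; `sensitivity` selects strict
  // ("variant") or case/accent-insensitive ("base") matching.
  constructor(collation: string, sensitivity: "base" | "variant") {
    this.collator = new Intl.Collator(collation, { sensitivity });
  }

  // Inserts a key and keeps the keys ordered by this index's collation.
  add(key: string): void {
    this.keys.push(key);
    this.keys.sort(this.collator.compare);
  }

  // Returns every stored key that this index's collation treats as equal to `probe`.
  lookup(probe: string): string[] {
    return this.keys.filter((k) => this.collator.compare(k, probe) === 0);
  }
}

// Strict collation for primary keys, fuzzy collation for the address-book index.
const primaryKeys = new SortedIndex("en", "variant");
const namesIndex = new SortedIndex("en", "base");
for (const entry of ["José", "jose", "Anna"]) {
  primaryKeys.add(entry);
  namesIndex.add(entry);
}

const fuzzyMatches = namesIndex.lookup("JOSE");
const strictMatches = primaryKeys.lookup("JOSE");
```

Here the fuzzy index finds both "José" and "jose" for the probe
"JOSE", while the strict one finds neither.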

However there is another problem to consider here. Can switching
collation on an objectStore or a unique index affect its validity?
I.e. if you switch from a case sensitive to a case insensitive
collation, does that mean that if you have two entries with the
primary keys "Sweden" and "sweden" they collide and thus the change of
collation must result in an error (or aborted transaction)?
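
To make the collision concrete, here is a minimal sketch that checks a
set of existing keys against a candidate collation. findCollisions is
a hypothetical helper, not spec API, and Intl.Collator is used only as
a stand-in for whatever comparison an implementation would apply.

```typescript
// Hypothetical helper: returns groups of keys that the collator considers equal.
function findCollisions(keys: string[], collator: Intl.Collator): string[][] {
  // Sorting by the collator places equal keys next to each other.
  const sorted = keys.slice().sort(collator.compare);
  const groups: string[][] = [];
  let current: string[] = [];
  for (const key of sorted) {
    if (current.length > 0 && collator.compare(current[0], key) !== 0) {
      if (current.length > 1) groups.push(current);
      current = [];
    }
    current.push(key);
  }
  if (current.length > 1) groups.push(current);
  return groups;
}

const storedKeys = ["Sweden", "sweden", "Norway"];
// "variant" distinguishes case; "accent" ignores case but still distinguishes accents.
const caseSensitive = new Intl.Collator("en", { sensitivity: "variant" });
const caseInsensitive = new Intl.Collator("en", { sensitivity: "accent" });

const collisionsBefore = findCollisions(storedKeys, caseSensitive);
const collisionsAfter = findCollisions(storedKeys, caseInsensitive);
```

Under the case sensitive collator there are no collisions. Under the
case insensitive one, "Sweden" and "sweden" end up in the same group,
which is the situation where the switch would have to fail.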

I do seem to recall that there are ways to do at least case
sensitivity such that you generally don't take case into account when
sorting, unless two entries are exactly the same, in which case you do
look at casing to differentiate them. However I don't really know a
whole lot about this and so defer to people that know
internationalization better.
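
A rough sketch of that idea, under my own assumptions: compare with a
case and accent insensitive collator first, and only when that reports
a tie fall back to a stricter comparison. The fallback here is plain
UTF-16 code unit order, picked only because it is simple and
deterministic; a real collation would more likely use a further
collation level for the tie-break. compareWithTieBreak is a
hypothetical name.

```typescript
// Primary comparison ignores case and accents.
const baseCollator = new Intl.Collator("en", { sensitivity: "base" });

// Hypothetical comparator: case only matters when the keys would otherwise tie.
function compareWithTieBreak(a: string, b: string): number {
  const primary = baseCollator.compare(a, b);
  if (primary !== 0) return primary;
  // Tie-break on UTF-16 code units so that distinct strings never compare equal.
  return a < b ? -1 : a > b ? 1 : 0;
}

const sortedKeys = ["sweden", "Norway", "Sweden"].sort(compareWithTieBreak);
```

With a comparator like this, "Sweden" and "sweden" sort next to each
other but remain distinct keys, so a uniqueness constraint would not
be violated.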

> I don't have a strong preference there. In any case both would use BCP47 names as discussed in this thread (as Jonas pointed out, implementations can also do their thing as long as they don't interfere with BCP47).
>
> Another piece of feedback I heard consistently as I discussed this with various folks at Microsoft is the need to be able to pick up what the UA would consider the collation that's most appropriate for the user environment (derived from settings, page language or whatever). We could support this by introducing a special value that you can pass to setCollation that indicates "pick whatever is right for the environment's language right now". Given that there is no other way for people to discover the user preference on this, I think this is pretty important.

I would be fine with this as long as it's an explicit opt-in. There is
definitely a risk that people will do this and then only do testing in
one language, but it seems to me like a useful use case to support,
and I don't see a way of supporting this while completely avoiding the
risk of internationalization bugs.
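
As a sketch of what an explicit opt-in could look like: the caller
either names a BCP47 tag or asks for the environment's choice, and
there is no implicit default. CollationRequest and resolveCollation
are hypothetical names, and Intl.Collator's default locale stands in
for whatever the UA would derive from settings or page language.

```typescript
// Hypothetical request shape: the environment default must be asked for by name.
type CollationRequest =
  | { kind: "explicit"; tag: string }
  | { kind: "environment" };

// Resolves a request to a concrete language tag at the time of the call.
function resolveCollation(request: CollationRequest): string {
  if (request.kind === "explicit") return request.tag;
  // With no locale argument, Intl.Collator falls back to the runtime's default locale.
  return new Intl.Collator().resolvedOptions().locale;
}

const fixedCollation = resolveCollation({ kind: "explicit", tag: "sv" });
const environmentCollation = resolveCollation({ kind: "environment" });
```

Resolving the value once and storing the resulting tag would keep the
ordering stable if the user's settings change later, though whether
that is the desired behavior is an open question.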

/ Jonas

Received on Friday, 18 March 2011 02:20:04 UTC