RE: [IndexedDB] Spec changes for international language support

From: Jonas Sicking [mailto:jonas@sicking.cc] 
Sent: Friday, March 18, 2011 1:57 PM

>> >>> However there is another problem to consider here. Can switching
>> >>> collation on an objectStore or a unique index affect its validity?
>> >>> I.e. if you switch from a case sensitive to a case insensitive
>> >>> collation, does that mean that if you have two entries with the
>> >>> primary keys "Sweden" and "sweden" they collide and thus the change of
>> >>> collation must result in an error (or aborted transaction)?
>> >>>
>> >>> I do seem to recall that there are ways to do at least case
>> >>> sensitivity such that you generally don't take case into account when
>> >>> sorting, unless two entries are exactly the same, in which case you do
>> >>> look at casing to differentiate them. However I don't really know a
>> >>> whole lot about this and so defer to people that know
>> >>> internationalization better.
>> >
>> > This is a good point. It makes me lean toward not allowing changing the collation of an index or store. That means we could just have an optional parameter (in the generic parameter object thingy we have now) on createObjectStore and createIndex that indicates the collation name. It seems minimally disruptive, it doesn't tax people that don't care about it, and since there is no setCollation we don't have the problem of not being able to re-index the data.
>>
>> So there is no way to specify things such that the collation doesn't
>> affect unique-ness? If so, I tend to agree.

The problem is that different collations will consider different things unique. This is bound to be variable across languages and such, so I'm not sure we want to be in the business of fine-tuning this. It seems that being a bit more restrictive could result in a more robust result overall. If someone really needs to change the collation they can copy the table manually...not great, but if we think it's a corner case it's probably fine.
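
To make the uniqueness concern concrete, here is a minimal sketch of how keys that are distinct under one collation can collide under another. It is not part of any proposal: `findCollisions` is a hypothetical helper, and `Intl.Collator` is only a stand-in for whatever collation machinery a UA would actually use.

```typescript
// Hypothetical helper: returns groups of keys that the given collator
// treats as equal, i.e. keys that would violate a uniqueness constraint.
function findCollisions(keys: string[], collator: Intl.Collator): string[][] {
  // Sort first so that keys the collator considers equal end up adjacent.
  const sorted = [...keys].sort(collator.compare);
  const groups: string[][] = [];
  let current: string[] = [];
  for (const key of sorted) {
    if (current.length > 0 && collator.compare(current[0], key) !== 0) {
      if (current.length > 1) groups.push(current);
      current = [];
    }
    current.push(key);
  }
  if (current.length > 1) groups.push(current);
  return groups;
}

const keys = ["Sweden", "sweden", "Norway"];

// Assumption: "variant" sensitivity models a case-sensitive collation and
// "base" sensitivity models a case-insensitive one.
const caseSensitive = new Intl.Collator("en", { sensitivity: "variant" });
const caseInsensitive = new Intl.Collator("en", { sensitivity: "base" });

const collisionsBefore = findCollisions(keys, caseSensitive);
const collisionsAfter = findCollisions(keys, caseInsensitive);

console.log(collisionsBefore); // no groups: every key is distinct
console.log(collisionsAfter);  // one group holding "Sweden" and "sweden"
```

A store that was valid under the first collation has a duplicate under the second, which is the situation a setCollation call would have to reject or abort on.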

>> >>> > Another piece of feedback I heard consistently as I discussed this with various folks at Microsoft is the need to be able to pick up what the UA would consider the collation that's most appropriate for the user environment (derived from settings, page language or whatever). We could support this by introducing a special value that you can pass to setCollation that indicates "pick whatever is right for the environment's language right now". Given that there is no other way for people to discover the user preference on this, I think this is pretty important.
>> >>> I would be fine with this as long as it's an explicit opt-in. There is
>> >>> definitely a risk that people will do this and then only do testing in
>> >>> one language, but it seems to me like a useful use case to support,
>> >>> and I don't see a way of supporting this while completely avoiding the
>> >>> risk of internationalization bugs.
>> >
>> > I agree, it should be opt-in. I still assume we'll default to binary collation (same if you specify the collation value as null). I was reading BCP 47 [1], and in section 4.1 "Choice of Language Tag", item #7 seems to describe what we're looking for. The value "i-default" seems to match our needs closely enough, so callers could use that value. Discoverability is not great, but we avoid having to specify something new, and arguably they'll need to read somewhere that this argument is a BCP47-compatible value, and we could put a comment about "i-default" right there.
>>
>> Sounds good to me. Though you seem to have forgotten to include the
>> [1] reference.

Oops, here it goes:
 [1] http://tools.ietf.org/html/bcp47
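
As a rough illustration of the opt-in behavior discussed above, the sketch below resolves a collation argument to a comparison function. Everything in it is an assumption for illustration: `resolveCollation` and `binaryCompare` are hypothetical names, code-unit comparison stands in for binary collation, `Intl.Collator` stands in for the UA's collation support, and the environment language is passed in explicitly instead of being discovered.

```typescript
type Comparator = (a: string, b: string) => number;

// Stand-in for binary collation: orders strings by UTF-16 code units.
function binaryCompare(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Hypothetical resolver:
//   null or undefined -> binary collation (the assumed default)
//   "i-default"       -> whatever locale the environment reports
//   anything else     -> treated as a BCP 47 language tag
function resolveCollation(
  name: string | null | undefined,
  environmentLocale: string
): Comparator {
  if (name == null) return binaryCompare;
  const locale = name === "i-default" ? environmentLocale : name;
  return new Intl.Collator(locale).compare;
}

const byDefault = resolveCollation(null, "en");
const byEnvironment = resolveCollation("i-default", "en");

// Binary order puts uppercase "B" (0x42) before lowercase "a" (0x61);
// a language-aware collation for "en" puts "a" before "B".
console.log(byDefault("B", "a") < 0);
console.log(byEnvironment("B", "a") > 0);
```

Callers who never pass a collation keep binary ordering, and only those who explicitly ask for "i-default" get ordering that varies with the user's environment.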

Received on Friday, 18 March 2011 21:57:46 UTC