Re: [Alarm API] data … yet another DB? from Jonas Sicking on 2013-02-15 (public-sysapps@w3.org from February 2013)

From: Jonas Sicking <jonas@sicking.cc>
Date: Fri, 15 Feb 2013 01:18:56 -0800
To: Marcos Caceres <w3c@marcosc.com>
Cc: public-sysapps@w3.org
Message-ID: <CA+c2ei8ms_+M0AhAPdOmenei03gy7ZNBVGTVc471Wb2nD23nvg@mail.gmail.com>
On Fri, Feb 15, 2013 at 12:13 AM, Marcos Caceres <w3c@marcosc.com> wrote:
>
>
>
> On Friday, 15 February 2013 at 05:23, Jonas Sicking wrote:
>
>> On Tue, Feb 12, 2013 at 3:16 AM, Marcos Caceres <w3c@marcosc.com> wrote:
>> > I've been looking at the "data" option of this API and I'm a bit concerned that it adds yet another database/datastore to the platform. I'm wondering why localStorage and/or IndexedDB are insufficient that this API requires its own long lived datastore? Given that Alarm objects as currently specified have a long-lived identifier, can't that identifier be used to associate the alarm's data in either localStorage or IndexedDB?
>> >
>> > I could see the rationale for sending data to the platform if the platform was then able to do something meaningful with that data. For example, if the data contained a message, and that message was displayed to the user independently of the application (although that doubles as a notification or just using an alert).
>> >
>> > I've also noted some of the privacy concerns I have with this additional data store [1]. Having data only stored in the currently designated places of the platform allows us to better manage privacy as well as gets rid of yet another potential attack vector.
>> >
>> > [1] http://lists.w3.org/Archives/Public/public-sysapps/2013Feb/0019.html
>>
>> The data is "just" a convenience function (just like the .then() function ;-))
> Again, I was not advocating DOMFuture … I'm just trying to avoid another tirade of angry developers screaming "this sucks" ;)
>> It's definitely true that we could require that the application uses
>> the returned ID as a key into a database which stores the information
>> elsewhere.
>>
>> However it doesn't simplify the implementation meaningfully to remove
>> the data argument. In either case the implementation will have to
>> store the date and the ID into a database backend of some sort. So the
>> addition of an extra string argument (the JSON serialized value)
>> doesn't change what actions the implementation takes.
>
> It may not (and that's an implementation detail), but you are forcing everyone else who might implement this to create a new database store that also needs to cope with the "data". It also pollutes the platform, adds a new place for privacy issues, more complexity, etc. The fact that we are adding a persistent key for this API is a necessity, but adding data to the end of that is not.

The API is already forcing anyone that is implementing this API to
implement a new database. There is no way to implement this API
without doing that. Once this database has been created out of
necessity, adding an extra column in that database seems like very
little additional implementation complexity.
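
To make the "extra column" point concrete, here is a minimal sketch of what a row in such an alarm database might hold. The names AlarmRow and AlarmTable, the origin field, and the in-memory Map are hypothetical stand-ins for illustration; none of them come from the draft or from any actual implementation.

```typescript
// Hypothetical row layout for an implementation's alarm database.
interface AlarmRow {
  id: number;       // persistent identifier handed back to the app
  origin: string;   // which app/site scheduled the alarm (assumed field)
  fireAt: number;   // scheduled time, in ms since the epoch
  data?: string;    // the one extra column: JSON-serialized app data
}

// In-memory stand-in for the database backend (assumption for the sketch).
class AlarmTable {
  private rows = new Map<number, AlarmRow>();
  // A single-process counter is enough here; a real multi-process
  // implementation would need something more robust.
  private nextId = 1;

  // The id, origin and date are stored whether or not data is supported.
  insert(origin: string, fireAt: Date, data?: unknown): AlarmRow {
    const row: AlarmRow = { id: this.nextId++, origin, fireAt: fireAt.getTime() };
    if (data !== undefined) {
      row.data = JSON.stringify(data);
    }
    this.rows.set(row.id, row);
    return row;
  }

  get(id: number): AlarmRow | undefined {
    return this.rows.get(id);
  }
}

const table = new AlarmTable();
const fireTime = new Date(Date.UTC(2013, 1, 15, 9, 0));
const plain = table.insert("app://calendar", fireTime);
const withData = table.insert("app://calendar", fireTime, { kind: "notify" });
```

In this sketch the only difference between the two rows is whether the optional data string is filled in; the table, the ID allocation and the date column exist either way.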

Can you describe an implementation where this is not the case?

Regarding the privacy issue, see below.

>> In either case we are stuck with an additional database backend of
>> some sort. With all the privacy and performance implications that come
>> with that.
>
> What's the privacy implication of storing a key that refers to an alarm?

It shows that the user is using that app/site. It likely will show
private information based on what date the alarm is scheduled for. For
example I might be able to tell what time you usually get into work by
seeing what time certain reminders are scheduled for.

For a calendar app I would be able to see which times of which days
you have meetings, even if I couldn't see the title of those meetings.
For an alarm clock app I could see what time you usually get up in the
morning. If you have a birthday-reminder app, I might be able to see
what days your friends' birthdays are, which could help pinpoint your
identity if combined with a social graph from Facebook.

Definitely enough information that you'd want to hook it up to
whatever "clear private data" feature you have in your platform.

>> For example whether we make the API sync or not is
>> unaffected by if it takes a data argument.
>
> That's not what is in question. Adding all the DOMRequest baggage to make this work sanely is what is in question.

These seem like two contradictory sentences. The "DOMRequest baggage"
comes because the API is asynchronous. So if you are calling that into
question then you are calling into question if the API should be sync
or not.

> (as well as other parts of this API that are inconsistent - like sometimes working with Alarm objects and other times working with IDs).

This seems orthogonal to the rest of this thread. But to answer your
question I don't think we should have Alarm objects at all.

> This API is just a slightly more fancy version of setTimeout() - I'm asking us to explore alternatives to making this easier to use.

It currently is specced that way, but that's not the intent of the API
at all and it's a critical problem in the current draft. The whole
point of the API is to wake an application up at a specified time if
it's currently not running. Anything else can and should be solved
using setTimeout.

>> The reason we made the .add
>> function async in FirefoxOS was two-fold:
>>
>> 1. We could do the writing asynchronously after having returned an ID,
>> however that would remove the ability to guarantee that the data was
>> persisted before indicating success to the caller. I.e. we would have
>> no recourse if writing the entry to the database resulted in an IO
>> error.
>
> Sure. That is not in question.

Well, it was suggested that we could make the API sync if we didn't
have data. I'm pointing out that this is not correct.
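
The first of the two quoted reasons can be illustrated with a small simulation. This is only a sketch under stated assumptions: Disk, the task queue with flush(), and both add functions are hypothetical names invented here, and the queue merely stands in for I/O that completes some time after the call returns.

```typescript
// Simulated disk whose writes can be made to fail (assumption for the sketch).
class Disk {
  rows = new Map<number, number>();
  constructor(public failing: boolean) {}
  write(id: number, fireAt: number): void {
    if (this.failing) throw new Error("I/O error");
    this.rows.set(id, fireAt);
  }
}

// Deferred work, run later by flush(); stands in for background I/O.
type Task = () => void;
const queue: Task[] = [];
function flush(): void {
  while (queue.length > 0) queue.shift()!();
}

let nextId = 1;
const lostWrites: number[] = [];

// Variant 1: hand back an ID right away and write later.
// If the write fails, the caller already believes the alarm was stored.
function addReturningIdImmediately(disk: Disk, fireAt: Date): number {
  const id = nextId++;
  queue.push(() => {
    try {
      disk.write(id, fireAt.getTime());
    } catch (_e) {
      lostWrites.push(id); // no caller left to report the failure to
    }
  });
  return id;
}

// Variant 2: report the outcome only after the write has been attempted.
function addReportingAfterWrite(
  disk: Disk,
  fireAt: Date,
  done: (err: Error | null, id?: number) => void
): void {
  const id = nextId++;
  queue.push(() => {
    try {
      disk.write(id, fireAt.getTime());
      done(null, id);
    } catch (e) {
      done(e as Error);
    }
  });
}

const badDisk = new Disk(true);
const when = new Date(Date.UTC(2013, 1, 15, 9, 0));

const optimisticId = addReturningIdImmediately(badDisk, when);
let reported: string = "pending";
addReportingAfterWrite(badDisk, when, (err) => {
  reported = err ? "error: " + err.message : "ok";
});
flush(); // the deferred writes run (and fail) here
```

Neither variant takes a data argument, yet only the second one can tell the caller that the write failed, which is the point being made: the need for an asynchronous result comes from persistence, not from the data.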

>> 2. Generating a unique ID can be hard to do synchronously once you
>> consider that an application might be running in multiple processes at
>> the same time. You couldn't for example use a simple counter.
>
> Sure, but there are many alternatives. What are you using in B2G?

I don't know off the top of my head.

>> Likewise, all the questions about "how does this interact with a
>> 'clear private data' feature" remain even if no data argument is
>> written to disk.
>
> Sure, but not relating to the data itself.

I don't understand what you mean by this.

>> I definitely think this is worth it given the saved
>> complexity of having to keep two databases in sync (the Alarms API and
>> the localStorage/IDB API).
>
> I respectfully disagree. For the reasons I've outlined (privacy, complexity, and consistency), I think the platform should not have additional/unnecessary data stores. If the use cases can't be met with localStorage and IndexedDB, or data actually played some meaningful role in the system, I could be convinced otherwise.

As described above, this isn't an unnecessary data store. At most
you could argue that the data store could be smaller if we didn't have
the data. That would be an argument for putting a limit on how large
the stored JSON objects can be, rather than an argument for dropping
the data completely.

The data absolutely plays a meaningful role. It makes it dramatically
easier to implement a calendar app, for example, since you can specify
in the data argument whether the alarm is a "display a notification to
the user" alarm or a "sync data with server" alarm. And for the
"display a notification to the user" alarm you can put the main
identifier for the event in the data.

Without a data argument every app which uses the Alarm API will
basically have to have a parallel store to map id->data, which makes
usage of the API a pain.
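
A rough sketch of the difference from the app's side, assuming a made-up FakeAlarms class in place of the real API. Its add() is synchronous only to keep the example short, and AlarmData, fire() and sideTable are likewise hypothetical; in a real app the side table would live in localStorage or IndexedDB.

```typescript
// Hypothetical payload a calendar app might attach to an alarm.
type AlarmData =
  | { kind: "notify"; eventId: string }
  | { kind: "sync" };

// Minimal in-memory stand-in for the platform's alarm service.
class FakeAlarms {
  private next = 1;
  private stored = new Map<number, { fireAt: number; data: string | undefined }>();

  add(fireAt: Date, data?: AlarmData): number {
    const id = this.next++;
    const json = data === undefined ? undefined : JSON.stringify(data);
    this.stored.set(id, { fireAt: fireAt.getTime(), data: json });
    return id;
  }

  // Simulates the platform waking the app: passes the id and any stored data.
  fire(id: number, handler: (id: number, data?: AlarmData) => void): void {
    const row = this.stored.get(id);
    if (!row) return;
    this.stored.delete(id);
    handler(id, row.data === undefined ? undefined : (JSON.parse(row.data) as AlarmData));
  }
}

function summarize(data: AlarmData): string {
  return data.kind === "notify" ? "notify " + data.eventId : "sync";
}

const alarms = new FakeAlarms();
const when = new Date(Date.UTC(2013, 1, 15, 9, 0));
const log: string[] = [];

// With a data argument: the alarm itself carries what the app needs.
const a = alarms.add(when, { kind: "notify", eventId: "meeting-42" });
alarms.fire(a, (_id, data) => {
  if (data) log.push(summarize(data));
});

// Without it: the app keeps its own id -> data map and must keep it in sync.
const sideTable = new Map<number, AlarmData>();
const b = alarms.add(when);
sideTable.set(b, { kind: "sync" });
alarms.fire(b, (id) => {
  const data = sideTable.get(id);
  sideTable.delete(id); // forgetting this leaves an orphaned entry behind
  if (data) log.push(summarize(data));
});
```

Both paths end up handling the alarm, but the second has two stores to write, read and clean up for every alarm.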

Also keep in mind that removing the data argument won't actually cause
less privacy sensitive information to be stored. It'll just be stored
elsewhere. And since you have to hook up the alarm database to the
"clear private data" feature, the implementation wouldn't be simpler
either.

/ Jonas
Received on Friday, 15 February 2013 09:19:54 UTC