Re: Adding Web Intents to the Webapps WG deliverables from Paul Kinlan on 2011-09-21 (public-webapps@w3.org from July to September 2011)

From: Paul Kinlan <paulkinlan@google.com>
Date: Wed, 21 Sep 2011 01:59:48 +0100
To: Ian Hickson <ian@hixie.ch>
Cc: Rich Tibbett <richt@opera.com>, James Hawkins <jhawkins@google.com>, public-webapps@w3.org
Message-ID: <CADGdg3B_sPPGXYsDANWa-nNxZdWPh+Ls01N6yE5shynZOe3-NA@mail.gmail.com>
Some comments inline - I hope they don't get lost.

On Tue, Sep 20, 2011 at 11:34 PM, Ian Hickson <ian@hixie.ch> wrote:

> On Tue, 20 Sep 2011, Rich Tibbett wrote:
> > Ian Hickson wrote:
> > > Why not just improve both navigator.registerContentHandler and
> > > navigator.registerProtocolHandler?
> >
> >
> http://groups.google.com/group/web-intents/browse_thread/thread/3dff7c2cdf5815b8
> >
> > I tend to agree with rolling this in to RCH and RPH and seeing if we
> > could refine the processing algorithms therein to satisfy the security
> > issues highlighted in that thread (i.e. ensuring the cross document
> > messaging channel setup from a window.open with a registered protocol
> > handler is origin bound).
>
> I'm not sure it necessarily makes sense to use registerProtocolHandler()
> itself, but something based on it seems like it would work, rather than
> reinventing the wheel entirely.
>
>
> On Tue, 20 Sep 2011, Paul Kinlan wrote:
> >
> > Q: Why are the verbs URLs?
> >
> > Verbs don't have to be URL's but a URL will allow us a point of
> > reference to documentation, versioning and namespacing allowing verbs
> > with similar names but by a different provider to not conflict with each
> > other (thus allowing developers to come up with their own schemes and
> > APIs outside of standardisation).
>
> If they're just arbitrary strings, this should be made clearer.
>

We will strongly encourage a URL, so we might have to say it must be a URI
to make developers think of that first. A URL gives us a lot of advantages
(more below on your sharing point).


> > Q: Why are some verbs hard-coded into the API?
> >
> > Convenience and ensuring there is a consistent use of the first set of
> > intents that cover the most common initial use cases.
>
> If the strings are inconvenient enough that you feel the need to provide a
> constant for them, maybe that's a sign that the strings should be changed!
>
> Rather than 'http://webintents.org/share', why not just 'share' for the
> share intent, for example?
>

Providing a single verb will vastly increase the chance of collisions
between competing providers that say they handle the action but provide
a completely different API.  A verb on its own would imply that it is a Web
Intents verb managed by the webintents project, with all of its documentation
living under webintents, which means we would then need to think
about standardisation and stewardship for the entire namespace.  We don't
want to become the IANA and have an amazingly formal registration process
(http://tools.ietf.org/html/rfc4395).  Rather, this can be
lightweight and delegated to whoever owns the domain; they can manage
their specs, supported intents and, importantly, developer support.

If we use a URL as the first filter, we have the documentation endpoint and
the namespaces for developers to experiment with and build upon with little
fear of impacting others.  We have had developers who want to hook up
transcription services and OCR services; we worry that without namespacing,
and in the presence of a strictly formal process, people will stop building
their own APIs and endpoints.

Android is a good example: the intent system is fabulous, and if you look at
http://www.openintents.org/en/intentstable most developers end up
reverse-namespacing the intent. I believe that when people want to namespace
their API they will either use this syntax or some other inconsistent
naming.  Having a URL is nice and consistent with the web.
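
To illustrate the namespacing point, here is a minimal sketch (not spec
text). The registry object, the register() and handlersFor() helpers, and
the example.org / example.com URLs are all hypothetical; only
http://webintents.org/share comes from the discussion above. Two providers
can each define a "share" verb, and because the verb is the full URL their
handlers never land in the same bucket:

```javascript
// Hypothetical in-memory registry keyed by the full verb string.
var registry = {};

function register(verb, handlerUrl) {
  if (!registry[verb]) { registry[verb] = []; }
  registry[verb].push(handlerUrl);
}

function handlersFor(verb) {
  return registry[verb] || [];
}

// Same short name ("share"), different owners, no collision.
register("http://webintents.org/share",
         "http://handler-a.example.com/share.html");
register("http://example.org/intents/share",
         "http://handler-b.example.com/share.html");
```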


>
>
> > Q: How are types matched?
> >
> > I don't know the best phrase, mime-typedness.  A direct string
> > comparison unless there is a * after the / (image/*) which then means
> > every image.
>
> This needs to be defined much more precisely. e.g. what happens to MIME
> parameters? Does */* match foo/*? What syntax checking is done on the
> inputs here?
>

* matches foo/* - we should clear the documentation up.  We didn't think
*/png as an example would make sense so */* is contracted to *.
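
As a rough sketch of the matching described here (not spec text): the
function name matchesType is hypothetical, MIME parameters are not handled
at all, no syntax checking is done, and a wildcard is only honoured on the
filter side. All of that is an assumption of this sketch rather than
settled behaviour.

```javascript
// Hypothetical helper; not part of the Web Intents draft.
function matchesType(filter, type) {
  // "*/*" is contracted to "*", which matches everything (including "foo/*").
  if (filter === "*" || filter === "*/*") { return true; }

  var slash = filter.indexOf("/");
  // A "*" after the "/" (e.g. "image/*") matches every subtype of that type.
  if (slash !== -1 && filter.slice(slash + 1) === "*") {
    return type.slice(0, slash + 1) === filter.slice(0, slash + 1);
  }

  // Otherwise it is a direct string comparison.
  return filter === type;
}
```

With this sketch, matchesType("image/*", "image/png") and
matchesType("*", "foo/*") are both true, while
matchesType("image/png", "image/jpeg") is false.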



> > Q: Why not just improve both navigator.registerContentHandler and
> > navigator.registerProtocolHandler?
> >
> > WI encompasses aspects of both of these API's - but more importantly
> > there are some paradigms we wish to bring that will not fit in to either
> > directly.
>
> I'm not saying to actually use navigator.registerContentHandler and
> navigator.registerProtocolHandler, but why not base something on that API
> rather than reinventing the wheel?
>

We have two parts. One is registering the intent, which we think is
better done declaratively (explained later in this email). The second is the
invocation, where we believe WI is far simpler to use and can
take advantage of postMessage.


>
> > We want one consistent way of delivering data to applications
>
> Something based on navigator.register*Handler() would give you that.
> Inventing something new wouldn't, since then you'd have two mechanisms
> (the register*Handler() methods, and Web Intents).
>

Specifically, we want to get away from having to wait for the remote app to
tell us it is ready before we pass it the data - which is what happens if we
currently use window.open with a scheme registered via RPH, RCH, etc.

The current code looks a lot like this (which is what we want to get away
from):

var service = window.open('web+share:http://example.com/');
window.addEventListener('message', function onMessage(msg) {
   if (service !== msg.source) { return; }
   if (msg.data == "ready") {
     // let the remote page tell us we are ok to send it data
     msg.source.postMessage(getData(), msg.origin);
   }
   else {
     doSomethingAwesome(msg.data);
     // we only expect one message in return.
     window.removeEventListener("message", onMessage, false);
   }
}, false);

And replace it with:

var intent = new Intent("http://webintents.org/share",
"image/png", getData());
window.navigator.startActivity(intent, function(data) {
    doSomethingAwesome(data);
});


> > we want it to be a client-side delivery mechanism
>
> If it's going to be entirely client-side, I would recommend using
> MessagePorts, so that the communication isn't just one-post, one-response.
> For example, consider an intent where the provider needs to take the
> submitted data, encrypt it, then hand it back to the original site for a
> signature, then needs to take the result and send it somewhere. With a
> mechanism that just consists of post/response, there's no good way to get
> the data back (you'd have to do it on the server end). If we provide ports
> to communicate back and forth, you wouldn't have a problem.
>

We have the ability to pass MessagePorts through the invocation if required
by the API.  There is an example further down the email.



>
>
> > and we want applications to be able to tell the action it supports and
> > the data types it can handle.
>
> That doesn't seem hard to do if we just work on an extension to
> register*Handler(). No need for a completely different API.
>
>
> > Q: How does an already-open page get to handle an action? e.g. say GMail
> > wants to handle the "share" intent, and the user already has GMail open,
> > and the user clicks a "share" button on Flickr. How does the existing
> > GMail instance get the notification?
> >
> > If it was a share, we would envisage the UI to open the "compose" email
> > function rather than the full interface.
>
> Sure, but what if the full interface is already open? Maybe Google+ is a
> better example: you could envisage the "Share" dialog appearing on the
> open Google+ page rather than opening an entirely new page just to share a
> post.
>
> I think this is an important case to support.
>

We thought about this. We didn't want to overwrite the task that the
user was currently performing, and we didn't want a user to launch two
intents and have the second overwrite what they were working on in the
previous invocation.


>
>
> > Q: In particular, why are intents registered via a new HTML element
> > rather than an API? How do you unregister? How do you determine if the
> > intent was registered or not? How do you conditionally register (e.g.
> > once the user has paid for the service)?
> >
> > External discoverability, indexing and scraping is a very important part
> > of the intent tag understanding the API points of an application is very
> > powerful to external systems - via an API we lose this ability, or at
> > least make it fiendishly hard to discover.
>
> Could you elaborate on this need? What use case does it address?
>

We want to index and discover these apps so that we can provide a way to
offer suggested apps if the user doesn't have an app already installed.  On
webintents.org we would also like to maintain a registry of external apps
where a developer can point us to the URL and we can programmatically
ascertain the supported intents  with ease and as little input from the
developer as possible.

We believe we have more flexibility with how we present registration of
intents to the user by using declarative markup.  The UA will be able to batch
up and aggregate requests to the user to install intents more easily if they
are present in the DOM than if there were arbitrary individual requests to a
register API.

We also reduce the JS API surface by using the tag.

Another discussed option was an intent manifest, however given the
debugging, deployment and maintenance experience we have with AppCache we do
not want to go down the same route.  Additionally adding another request to
each app when it loads increases the burdens and demands on the developers,
the users and their infrastructure.  We would also have to work out how it
integrates with other specs (such as app cache - do we make the manifests
available offline etc) and build the support in for all the touch points.


>
> Adding new elements (especially to <head>) is a high-cost affair and
> should be avoided unless there are really good reasons for it.
>

Sorry, in what sense?  In specification terms or implementation and
performance terms?


> > Flexibility for the UA; the UA gets a much richer understanding of the
> > capabilities of an application, allowing it to have more control of how
> > it presents and manages the intents and registration to the user.
>
> Could you elaborate on this? As far as I can tell it doesn't make any
> difference to the UA whether the registrations are declared via markup or
> via an API.
>

In particular, the batching and aggregation of intent information to present
to the user at install time: we can do a lot of things easily if it is in
the DOM.


>
> > If it is available as a tag it allows screen-readers or other
> > accessibility tools to be able to indicate that A) an intent might need
> > to be installed, or that B) the application is a handler for something
> > at load time rather than at some point in the application's lifecycle.
>
> So does an API.
>

Only when it is invoked.


>
>
> > Developer experience is another important reason for the tag: We didn't
> > want the developers to have to think too hard about enabling their apps
> > to receive data; questions developers ask often of an API is "when
> > should I call the code", should it be in onload, before, after? Given
> > our experience working with developers, the more steps or potential
> > questions that are raised in a developers head about implementing
> > functionality, the more likely they are to de-prioritise the work in
> > favor of other work.
>
> I think this dramatically overstates the complexity here. Authors could
> call the API whenever they want; simple guidelines can be trivially given
> in the same amount of room as it takes to tell authors to use a new
> element.


Hmm, not sure. I have supported developers on a lot of web APIs (to
me they were very simple APIs), and I have heard a lot of them describe the
troubles mentioned; they simply de-prioritise the work in favor of their
other priorities.


>
> > We considered the cases of the Notifications API, where there is a
> > semi-complex dance of checking for permissions, requesting for
> > permission and then executing the notification - the burden falls on the
> > developer to keep track of whether they think the user has notifications
> > enabled in their app as well as by the UA and the developer just doesn't
> > bother to implement it.
>
> Yes, the notification API is a terrible design IMHO. There is no need for
> permission checks here.
>
>
> > Unregistering would be handled by the UA's management interface directly
> > by the user.
>
> So how does a site that has stopped providing a feature unregister its
> handler for that feature?
>

I would be surprised if apps used any unregister feature.  It is the same
today with the programmatic API - apps would have to know they are
deprecating a feature, then call an API to remove it, and leave that removal
call in place for a period of time until they think everyone has
unregistered.

We need to update the spec.  The UA should track the pages that declare
intent registration and if the user has the intent installed and the tag is
no longer on the tracked page(s) the UA should remove the registration.
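
A minimal sketch of that clean-up rule, assuming the UA keeps one record per
installed registration. The reconcile() and sameIntent() names and the
{action, type} record shape are hypothetical and only illustrate the idea;
they are not taken from the spec.

```javascript
// Hypothetical UA-side bookkeeping; the record shape is an assumption.
function sameIntent(a, b) {
  return a.action === b.action && a.type === b.type;
}

// installed: registrations the user accepted from a tracked page.
// declared:  intents currently declared by tags on that same page.
// Returns which registrations to keep and which to remove.
function reconcile(installed, declared) {
  var keep = [];
  var remove = [];
  installed.forEach(function (reg) {
    var stillDeclared = declared.some(function (tag) {
      return sameIntent(reg, tag);
    });
    (stillDeclared ? keep : remove).push(reg);
  });
  return { keep: keep, remove: remove };
}
```

Under these assumptions, a page that once declared both a share and an edit
intent but now only declares share would see its edit registration land in
the remove list the next time the UA visits the page.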


>
> > Conditional registration can be handled on the client or server, it is
> > about the presence of the intent tag - when the tag is in the DOM the UA
> > will pick this up and offer the install prompt (when it determines that
> > it should). This allows you to manage it in a paywall (maybe you have a
> > template that will render the tag in the paywall), or as the result of
> > some client-side action (the developer will add <intent> dynamically to
> > the DOM).
>
> If you are supporting dynamic additions, surely all the points about
> declarative markup that you mentioned earlier are moot.
>

Understood, in this case the primary intention is to not encourage dynamic
registration in the sense of clicking a button to add support for an intent,
but if the UA detected a new <intent> DOM element being added to the <head>
it would act the same as if it was there when it loaded up.


>
> > Determining if the intent was registered - is that in the context of the
> > app knowing the user clicked "install" to a prompt or for external apps
> > to know that there is a handler? In the case of the former, we haven't
> > got that speced - it could be an event on the elements however my
> > question is: why do you need to know? in the case of the latter we
> > didn't want to offer it as part of the API for the same concerns that
> > vendors have expressed for registerProtocolHandler.
>
> On the contrary, vendors have actively asked for it to be provided for
> registerProtocolHandler(). There was a whole thread about this on the
> WHATWG list recently that may be illuminating.
>
>
> Here's a possibly simpler proposal:
>
>  - to register:
>
>   navigator.registerIntentHandler(intent, filter, url, kind);
>     intent = string like "share", "play"
>     filter = MIME type (foo/bar) or MIME wildcard (foo/* or */*)
>     url = page to load to handle intent
>     kind = a flag indicating if the page should be:
>       - always opened in a new tab
>       - opened in the intent handler dialog (inline/pop-up)
>       - opened in an existing tab if possible, else a new tab
>
>  - to check if registered, to unregister:
>
>   navigator.isIntentHandlerRegistered(intent, filter, url);
>   navigator.unregisterIntentHandler(intent, filter, url);
>
>  - to invoke intent:
>
>   var port = navigator.handleIntent(intent, filter);
>   port.postMessage(data);
>   port.onmessage = function (event) { handle(event.data) };
>

We thought about this. We didn't want developers to have to worry about when
they should start to postMessage, or when to stop listening to the
onmessage event.

The temptation would be to call port.postMessage(data); port.postMessage(data);
and in that case, what should happen in the app?

We decided to send the data in the start process; if developers
want a more complex channel to communicate over, then passing in a message
channel as the data would be an excellent solution:

var channel = new MessageChannel();
window.navigator.startActivity(new Intent("http://chattyintent.com/", "*",
channel.port2));
channel.port1.postMessage(data);
// handle rest of the chatty api.
channel.port1.onmessage = function (event) { handle(event.data) };


>
>  - to handle an intent:
>
>   body.onintent = function (event) {
>     // Check event.origin if you want to be origin-limited.
>     // We could add event.intent and event.filter, for processors that
>     // support many intents.
>     var port = event.ports[0];
>     var response = process(event.data);
>     port.postMessage(response);
>   };
>

We worked on this idea early on in the cycle, and we had issues with the
timing: if the developer doesn't register the handler at the correct time,
should we buffer the data and present it when they do register? And if we do
that, why not just make it globally available? We wanted the data to be on
the intent object so that it is available to developers at any time they
require it, without them having to manually save it off.

In the above example, should we not be attaching onmessage to the port
instead?  It seems to conflate the channel idea and the intent loading.  Also,
in the case of, say, an edit intent, the return might be at some point in the
future, so the developer would need to stash the intent and port away in an
object somewhere, which then looks similar to our defined intent object.
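
To make the comparison concrete, here is a rough stand-in for such an intent
object. Everything in it is an assumption for illustration: the makeIntent()
factory, the postResult() name, the one-result-per-invocation guard and the
http://webintents.org/edit verb are not taken from this thread. The point is
only that the data lives on the object, so a handler can read it whenever it
likes and reply much later.

```javascript
// Hypothetical stand-in for the intent object handed to a handler page.
function makeIntent(action, type, data, onResult) {
  var done = false;
  return {
    action: action,
    type: type,
    data: data, // readable at any time; nothing to buffer or save off
    postResult: function (result) {
      if (done) { return; } // assumption: one result per invocation
      done = true;
      onResult(result);
    }
  };
}

// An "edit" style handler: reads the data now, answers some time later.
var received = [];
var intent = makeIntent("http://webintents.org/edit", "image/png",
    "raw-bytes", function (result) { received.push(result); });

var draft = intent.data + ":edited";
// ... the user works on the draft for a while ...
intent.postResult(draft);
```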


> In simple cases, the above can be even further simplified. For example, if
> you don't care about the response (e.g. for a share intent), and you don't
> have a filter type (so it'll only match */* filters):
>
>   navigator.handleIntent('share').postMessage(data);
>
> Similarly, handling an intent can be even easier than the above for simple
> cases. For example, in the share case again, where you don't send anything
> back but just share whatever is passed in:
>
>   <body onintent="share(event.data)">
>

We initially thought this too: that the majority of use cases would never
care about a return result. But even in our most basic use cases, developers
are using the return data to track usage for analytics etc.



>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>



-- 
Paul Kinlan
Developer Advocate @ Google for Chrome and HTML5
G+: http://plus.ly/paul.kinlan
t: +447730517944
tw: @Paul_Kinlan
LinkedIn: http://uk.linkedin.com/in/paulkinlan
Blog: http://paul.kinlan.me
Skype: paul.kinlan
Received on Wednesday, 21 September 2011 01:00:15 UTC