Re: Is there an existing mechanism that can be used for WebIntents? from Rich Tibbett on 2012-01-20 (public-web-intents@w3.org from January 2012)

From: Rich Tibbett <richt@opera.com>
Date: Fri, 20 Jan 2012 21:30:47 +0100
To: Paul Kinlan <paulkinlan@google.com>
CC: public-web-intents@w3.org
Message-ID: <4F19CEF7.1050409@opera.com>
Paul Kinlan wrote:
> Are we proposing registerProtocolHandler Level 2 in this task-force?
> Or is this something for public-webapps?

I believe this is a heads-up at this point. You're right though that 
this is a proposal for another group. I will send this on when we have a 
demo to show.

>
> Also for clarification, this has been discussed before on the
> web-intents Google group [1]

I've consulted this thread many times over the last few weeks. There is 
no mention of allowing custom protocol URLs to act like HTTP resources.

>
> I have attached some more comments inline.
>
> P
>
> [1] https://groups.google.com/group/web-intents/browse_thread/thread/3dff7c2cdf5815b8
>
> On Fri, Jan 20, 2012 at 8:53 AM, Rich Tibbett<richt@opera.com>  wrote:
>> Paul Kinlan wrote:
>>> Hi Rich,
>>>
>>> I know we have talked a bit about this off-list in the past, and it
>>> has also been discussed on other channels.
>> No. This is a completely different proposal to anything we have discussed
>> previously. This is not about making web intents fit in to the existing
>> registerProtocolHandler mechanism. This is a proposal for
>> 'registerProtocolHandler Level 2' in which the main point is to assign full
>> HTTP characteristics to custom web protocols so that they act in the same
>> ways and with all the same rights as HTTP URLs operate today.
>>
>> This simple notion greatly opens up the way that we can communicate via
>> custom protocols. Instead of just being restricted to opening custom
>> protocols in a new tab with query string parameters we get the full-blown
>> abilities that come with HTTP: the ability to POST, the ability to filter on
>> Content-Types, the ability to use Custom Protocol URLs within a web page and
>> within existing web tools.
>>
>>
>>> I will comment in-line as to why we think that this is not the
>>> solution to the problem of connecting apps and building a successful
>>> eco-system.
>> Comments also inline.
>>
>>
>>>
>>> On Fri, Jan 20, 2012 at 5:45 AM, Rich Tibbett<richt@opera.com>    wrote:
>>>> Mike Hanson wrote:
>>>>> Here's another one - Austin King did some experiments with
>>>>> registerProtocolHandler (supported in Firefox and Chrome) to do a
>>>>> similar thing, about a year ago:
>>>>>
>>>>>
>>>>>
>>>>> http://blog.mozilla.com/webdev/2010/07/26/registerprotocolhandler-enhancing-the-federated-web/
>>>>
>>>> Could we not simply endow 'web+' protocols with full HTTP characteristics
>>>> (i.e. the ability to POST content towards custom protocols) and then
>>>> allow
>>>> developers to use these addresses through either existing web APIs like
>>>> XHR
>>>> and Web Sockets, embed custom protocols directly within DOM elements or
>>>> setup Web Messaging channels by invoking a custom protocol via
>>>> window.open?
>>>
>>> WI is quite an opinionated framework, it tries to say that for the
>>> majority of use cases a simple client-side request/response will work.
>>
>> Web Intents relies on a web page being loaded in a separate window or in an
>> iframe. Once you have this it becomes a client side mechanism.
>>
>> In the case that a simple channel needs setting up we just do:
>>
>> [client]
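>> var intentHandler = window.open('web+imageedit:foo.com/photo.jpg',
>>                                 'myiframe');
>> // The service replies via window.opener.postMessage, so the message
>> // event fires on this (client) window, not on the returned WindowProxy.
>> window.onmessage = function( msg ) {
>>   // Only handle a response from the user-selected service
>>   if( msg.source === intentHandler ) {
>>     console.log( msg.data );
>>   }
>> };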
>>
>> [server]
>> // process image then on complete:
>> window.opener.postMessage({'action': 'complete' /* ... */}, '*');
>> // (window.opener will be set to the client page)
>>
>>
>>
>>> Letting the developer communicate over XHR or Web Sockets or whatever
>>> else is a recipe for developer confusion and thus poor adoption.  If you
>>> were to build a service for image sharing, how would you tell the
>>> developer that you accepted XHR requests but not Web Sockets (or
>>> vice-versa).
>>
>> It's a URL. This is currently a non-issue on the web and I expect it to be
>> the same for registerProtocolHandler Level 2. How do you inform web pages
>> that URLs are meant for Web Sockets or XHR communication today and avoid
>> conflicts?
>>
>>
>>> Additionally, the web+ protocols are not as descriptive as we would
>>> like, yes you might be able to embed the verb into the scheme, but
>>> then supporting a wildcard set of data-types would be hard to define
>>> at a spec level, awkward to implement at a browser level and a pain to
>>> implement at the web dev publisher level.
>>
>> If you are POSTing to a URL then you can set the Content-Type to whatever
>> you wish. Either we let the service decide if it can handle the content when
>> it receives it or we can let service providers submit a set of Content Types
>> that they support with their registerProtocolHandler registration. I'm more
>> than happy to rely on the former.
>>
>>
>>>> Drawing parallels with Web Intents, you would define the 'action verb'
>>>> within the protocol's schema, draw the content-type and other
>>>> characteristics from the full range of any POST headers. The data would
>>>> be
>>>> sent in the body of a POST.
>>>
>>> This will not work with offline apps at all.  Web Intents is designed
>>> to be able to work completely independently from the server by
>>> handling the registration, invocation, selection and resolution and
>>> subsequently the user action in the service entirely on the client.
>>
>> Web Intents opens a web page as a result of a user's intent handler
>> selection. If that web page is appcached then it will load and we will be
>> able to establish a direct communication channel according to the snippet
>> provided above. This works offline as much as Web Intents.
>>
>> POSTing to an appcached URL will deliver that POST to the cached resource. It
>> all works the same way.
>
> How?  POST is a statement and is used to create a new object that is
> sent through to a server not a client app.  If the result is cached
> how is the server supposed to process it?

If the result is cached then we would use the same mechanisms as web 
intents enables: sending the data as a transferable object to the cached 
web page.

If the web page is not cached, I would be able to do the following:

<form method="POST" action="web+photoupload:/" target="_blank"
      enctype="multipart/form-data">
   <input name="data" type="file"/>
</form>

>>
>>> You also talked earlier about Web Sockets.  Is this now included or
>>> excluded from what you are talking about?
>>
>> This is an implicit part of the mechanism, yes. Today we don't have a
>> programmatic mechanism to distinguish what HTTP URLs are exposing. This is
>> the same principle at work here.
>>
>>> Remember that in Web Intents you can register all the remote services
>>> that you offer as a site in one page.  I don't see how this is
>>> resolved.
>>
>> You can do the same with registerProtocolHandler.
>>
>>
>>>> If you wanted to set up a Web Intents messaging channel then you could do
>>>> window.open({customprotocol}), register a messaging listener against the
>>>> returned WindowProxy object and then use the WindowProxy/Window channel
>>>> for
>>>> inter-communication ala Web Intents.
>>>>
>>>> There would be no client-side API - you would invoke custom URLs via GET
>>>> or
>>>> POST to use them directly within existing Web APIs.
>>>
>>> Again this has no chance of working with offline apps or just purely
>>> on the client-side (just a side note, appcache doesn't work well with
>>> query parameters - so you could appcache a URL for GET requests, but
>>> if you want to pass it any data via the query string you need to be
>>> online to do it).
>>
>> This is true. So I'd architect my service to not accept query parameters
>> then use a pre-designed messaging channel format to pass query string
>> parameters. It's a design point.
>
> Ok - so are you saying that this would be part of the spec?  You can't
> have the client app work out whether it needs to send via query
> parameter, set up a channel to send messages, or encode the data in a
> POST.

Custom protocol URLs are invoked from web pages. You can post via a 
form or XHR with the method explicitly set to POST. The resulting POST 
request is sent to the UA, as the custom protocol handler, which then 
simply brokers whatever it receives towards the user-selected service.
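
As a rough sketch of that brokering step (not a spec proposal): the 
names registeredHandlers and brokerRequest, the example endpoints and 
the idea of selecting a handler by index are all assumptions made up 
for illustration. The only point is that method, headers and body pass 
through untouched and just the destination changes.

```javascript
// Sketch only: every name here is hypothetical, not a browser API.

// Handlers the user has previously registered, keyed by custom scheme.
const registeredHandlers = new Map([
  ['web+photoupload', [
    { title: 'Photo Service A', endpoint: 'https://a.example/upload' },
    { title: 'Photo Service B', endpoint: 'https://b.example/upload' },
  ]],
]);

// Broker a request made against a custom protocol URL. chosenIndex
// stands in for the user's selection in the UA's picker UI.
function brokerRequest(request, chosenIndex) {
  const scheme = request.url.split(':')[0];
  const candidates = registeredHandlers.get(scheme) || [];
  const handler = candidates[chosenIndex];
  if (!handler) {
    throw new Error('No handler selected for scheme ' + scheme);
  }
  // Forward the request as-is; only the target URL is rewritten.
  return {
    url: handler.endpoint,
    method: request.method,
    headers: request.headers,
    body: request.body,
  };
}
```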

>
> With register protocol handler, the Query Parameters are handled via
> %s substitution at the moment.

On the query parameter, if the protocol registration has included %s 
then send through the origin url. This is no change on the current rPH 
implementations. Going back to the cached resource, you couldn't use %s. 
You would set up a messaging channel to the resource via window.open as 
demonstrated above.
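
For reference, the %s substitution amounts to something like the 
following. This is a sketch: expandHandlerTemplate and the handler URL 
are made-up names, and encodeURIComponent is used as an approximation 
of the escaping a UA applies.

```javascript
// Sketch only: "expandHandlerTemplate" is a hypothetical helper that
// mimics how a registered handler URL has "%s" replaced by the escaped
// URL that was invoked.
function expandHandlerTemplate(template, invokedUrl) {
  if (!template.includes('%s')) {
    throw new Error('Handler template must contain %s');
  }
  const escaped = encodeURIComponent(invokedUrl);
  // A function replacement avoids special "$" patterns in the result.
  return template.replace('%s', () => escaped);
}

const expanded = expandHandlerTemplate(
  'https://service.example/handle?url=%s',
  'web+imageedit:foo.com/photo.jpg'
);
// expanded is
// 'https://service.example/handle?url=web%2Bimageedit%3Afoo.com%2Fphoto.jpg'
```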

>
> This is less of a web-developer design issue and more of a fundamental part of the spec.
>
>>
>>> On another note, there are a lot of issues with resolving where the
>>> messages come from using the window.onmessage API.  Imagine you have a
>>
>> See example above for resolution. Also, you can add the following to the
>> server-side of the transaction.
>>
>> window.onmessage = function( msg ) {
>>   if( msg.source !== window.opener ) {
>>     // trash messages not sent from the opener
>>     return;
>>   }
>>   doSomethingWithClientMessage( msg );
>> };
>>
>> This is really simple on either end of the transaction.
>
> See the conversation in [1] - this is managing for the service only,
> now add in the filtering to ensure that it is a message of the correct
> type and all your other protocol code etc it gets a lot more complex
> than the snippet.   Actually it might even be  a little more complex
> because in the current model of postMessage, in the opened window you
> have to tell the client (window.opener) that you are ready to receive
> messages because messages in the service aren't handled until post
> onload (var w = window.open("web+someaction://blah", "mywindow");
> w.postMessage({})  -- this will not work, you have to know when the w
> has loaded to get postMessage to work).

This is not an issue directly imposed by this proposal. We will need to 
fix these issues as part of current standardization activities 
regardless of whatever mechanism we end up with here.
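
Until that is fixed at the platform level, the usual workaround is a 
small "ready" handshake. The sketch below assumes a convention where 
the service posts { type: 'ready' } once it has loaded; that message 
shape and the createReadyQueue helper are assumptions, not anything 
standardized.

```javascript
// Sketch only: "createReadyQueue" and the { type: 'ready' } message are
// an assumed convention. "target" is the WindowProxy returned by
// window.open (or any object with a postMessage method).
function createReadyQueue(target) {
  let ready = false;
  const pending = [];
  return {
    // Call this from the client's message listener for every event.
    handleMessage(event) {
      const isReady = event.source === target &&
        event.data && event.data.type === 'ready';
      if (!isReady) {
        return;
      }
      ready = true;
      // Flush everything queued before the service had loaded.
      // '*' mirrors the earlier snippet; a real page should pass the
      // service's origin instead.
      while (pending.length > 0) {
        target.postMessage(pending.shift(), '*');
      }
    },
    // Send immediately once the service is ready, otherwise queue.
    send(message) {
      if (ready) {
        target.postMessage(message, '*');
      } else {
        pending.push(message);
      }
    },
  };
}
```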

>
>>
>>> share button, the user clicks it twice it opens two windows - because
>>> the RPH system will resolve the URL to open the client app has no way
>>
>> Clarification, if an in-page link is clicked or window.open is used against
>> a custom protocol then it opens a separate web page.
>
> The above code you have manages one action at a time with usage of
> window.onmessage - anyway your code is managing the service side which
> I am not talking about, I am saying the responses in window.onmessage
> on the client need to keep track of the windows that it has opened and
> the responses it has received and that is just a whole lot of pain
> and boilerplate for developers.
>
>> If a custom protocol is used in an XHR, WS object then it doesn't require
>> the opening of another web page
>>
>
> We envisage that most users will want to see a UI of the service that
> they are interacting with and that they have chosen.  This does raise
> a point that I believe Mozilla were interested in for being able to
> have a dynamically driven contact picker being rendered in the client
> from a remote data source (without a remote UI)

The way to use a service gets left entirely to the service providers. If 
they need to show a UI then clients should use window.open. If they need 
to obtain POST data and show UI then clients should submit data via HTML 
forms. If a service needs to POST data and doesn't need to show a UI 
then clients should submit data via an XHR request. If a web page needs 
to GET data but doesn't need to show a UI then it can GET that data via XHR.

The point here is that these aspects - what a URL is to be used for - 
are entirely uncoordinated by standards and we should keep that 
flexibility built in here.
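
Purely to summarize the four cases in the paragraph above, here is the 
same mapping as a tiny function. The function name, option names and 
return labels are invented for this illustration; nothing like this 
would need to exist in a UA.

```javascript
// Sketch only: "chooseInvocation" and its option names are hypothetical.
// showsUi:   the service needs to present its own UI
// sendsData: the client needs to submit data to the service
function chooseInvocation(needs) {
  if (needs.showsUi) {
    // UI plus submitted data: HTML form. UI alone: window.open.
    return needs.sendsData ? 'html-form' : 'window.open';
  }
  // No UI: talk to the service directly over XHR.
  return needs.sendsData ? 'xhr-post' : 'xhr-get';
}
```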

>
> Would you envisage that every call to
> XMLHttpRequest("web+blah://asdasd") open up a picker to resolve the
> service.

Whichever model you decide on for Web Intents could equally be applied 
here.

>
> One thing I would like to see is a consistent implementation for form
> submit - i.e a no JS implementation to send data to an app from form
> fields.

You should be able to use a custom protocol in a <form> element's 
'action' attribute (see above).

>
>>> of knowing the app/url that was opened, therefore when the remote
>>> windows postMessage back to the client how do you resolve for which
>>> user action the response was for - from my experimentation you can't
>>> easily (you have to track window objects and keep them maintained in a
>>
>> For each click I maintain the returned result of the window.open call in a
>> different parameter. I can distinguish between the two really easily.
>>
>> <a href="web+videoview:example.com/videos/test.webm">View video</a>
>> <script>
>>   var clicks = [];
>>   var anchor = document.getElementsByTagName('a')[0];
>>
>>   anchor.addEventListener( 'click', function( evt ) {
>>      evt.preventDefault();
>>
>>      var intentHandler = window.open( anchor.href );
>>
>>      clicks.push(intentHandler);
>>   }, false);
>>
>>   window.onmessage = function( msg ) {
>>     for(var i = 0; i < clicks.length; i++) {
>>       if(msg.source === clicks[i]) {
>>         console.log( "Message is for event #" + i);
>>
>>         // do something with message for clicks[i]
>>
>>         break;
>>       }
>>     }
>>   }
>> </script>
>
> Do you not think that is a lot of code and boilerplate?  This is my
> point I was trying to make.

The above is only one method of using this. You could equally 
communicate with another resource via HTML elements such as <a>, <img>, 
<form>, <script>, etc.

>
>> On the server-side each window.opener object should point to a different
>> WindowProxy object of the initiating window. I'd be much more interested in
>> solving this as a general problem for window.open than implementing a new
>> API.
>>
>>
>>> global variable).   Imagine having to deploy this code to a blog just
>>> to support share buttons?  It is pretty complex and will be a huge
>>> maintenance nightmare.
>>>
>>> Can we just clear up one point - you mention in the earlier paragraph
>>> use a message listener then in the next you say use POST and GET to
>>> send to the server and say there is no client-side API - this reads as
>>> a contradiction?  How does the message and response get back to the
>>> invoked client app - say I want to edit an image in the browser, I
>>> open it up with window.open and encode the data (via JS to pass into
>>> window.open - maybe this would be form submit instead), I then edit
>>> the image in my favourite app and then what? window.postMessage back
>>> the data?  It can't come back in the response because we can't access
>>> any data on the newly opened window reference - now we have client Js
>>> sending to the remote server data that will be sent back to the client
>>> via a mechanism such as postMessage.  This is hard and confusing.
>>
>> This is how communication to other resources on the web works today.
>>
>> A custom protocol becomes equivalent to a HTTP URL and you can do whatever
>> you like with it. The UA sits in the middle to broker access to individual
>> services at the user's request and proxies the request from the client to
>> the server and back again. There is no concept of an XHR URL or a POST URL
>> or a GET URL. It's a URL and you do what you can with it.
>
> Sure.  But how do you know what to do with it.  If you have a protocol
> web+share, me a client developer how do I know I should issue a POST
> or I use a Web Socket to it?  How do I know that I should open a
> window or interact via XHR?

The same way you know what to do with any URL on the web. For example: 
https://www.googleapis.com/latitude/v1/location/12345?key=12345 has 
well-defined semantics for usage (HTTP GET/POST/DELETE, JSON/XML data 
formats). This is one of billions of entirely uncoordinated URLs that 
work seamlessly together to create the web.

_If_ we need to differentiate Web Sockets, as a special case we have a 
whole naming system for that already in place e.g. 'ws+{servicename}'.
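
To make the naming idea concrete, a sketch of how a scheme could be 
classified. The 'web+' rule (prefix followed by lowercase ASCII 
letters) follows registerProtocolHandler as I understand it; the 'ws+' 
branch is only the hypothetical naming floated above, and classifyScheme 
is an invented helper.

```javascript
// Sketch only: "classifyScheme" is hypothetical, and "ws+" is the
// speculative naming from this thread, not an existing convention.
function classifyScheme(url) {
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(url);
  if (!match) {
    return 'invalid';
  }
  const scheme = match[1].toLowerCase();
  if (/^web\+[a-z]+$/.test(scheme)) {
    return 'web';        // HTTP-like custom protocol
  }
  if (/^ws\+[a-z]+$/.test(scheme)) {
    return 'websocket';  // hypothetical Web Sockets variant
  }
  return 'other';
}
```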

>
>>
>>>> For POSTing to custom protocols: each schema would be free to design and
>>>> share their own format that, at the simplest level, would be a JSON
>>>> structure that those services registered for the custom protocol share.
>>>
>>> ick.  seriously?  Whilst web intents allows this, it is pretty
>>> constrained in saying that for the majority of cases a
>>> request-response semantics are the preferred solution.  That is you
>>> send some data of the type defined in the registration to the service
>>> and you receive it back, all on the client side.
>>
>> Web Intents allows you to send a structured data object to the service side.
>> HTTP allows you to do that too. People who use XHR do this all the time. So
>> yes. Seriously.
>
> I am saying ick because you said that JSON is the simplest format -
> why not send the raw data?  It would fit more with what you are
> talking about - you mention you want to support POST, use www-encoding

Yeah, you could send raw data or key/value pairs or JSON. Note that I'm 
just describing URLs in general at this point.
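
For illustration, the same payload in each of those three shapes, 
together with the Content-Type a client would set on the POST. The 
encoders object and the example payload are invented here; a given 
protocol would settle on whichever format its services agree on.

```javascript
// Sketch only: "encoders" and "exampleShare" are hypothetical names.
const encoders = {
  // Raw data: just the URL being shared, as plain text.
  raw: (share) => ({ contentType: 'text/plain', body: share.url }),
  // Key/value pairs, as an HTML form would submit them.
  form: (share) => ({
    contentType: 'application/x-www-form-urlencoded',
    body: new URLSearchParams(share).toString(),
  }),
  // A JSON structure shared by services registered for the protocol.
  json: (share) => ({
    contentType: 'application/json',
    body: JSON.stringify(share),
  }),
};

const exampleShare = { url: 'https://example.com/', title: 'Hello world' };
// encoders.form(exampleShare).body is
// 'url=https%3A%2F%2Fexample.com%2F&title=Hello+world'
```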

>
>>> Do we really want to get into the process of defining a new spec for
>>> sharing a URL with associated JSON data structures?

The web is one big unorganized experiment. It works pretty well.

>>
>> No we don't. This would be a decentralized process much like how
>> communication formats and protocols are established on the web today.
>
> Ok - so the WI solution at the moment defines part of the process for
> request and response formats.  You agree on the data you are going to
> send and you send it.  More complex protocols will be managed in the
> same way you described and are being discussed on the list now.

To date Web Intents has simply not shown itself capable of handling more 
complex use cases that have been brought to this group. I hope to see 
something that moves away from the "RPC or something" approach adopted 
so far.

>
>>
>>> WI says that if you say you will send an image you send the image
>>> object, it can either be a URL to the image or a representation of
>>> that in the client (a dataURI or Blob for instance).
>>
>> You can send objects when you invoke a custom protocol via window.open and
>> then use the transfer argument of the web messaging channel returned with
>> that object.
>
> Yup, I know.  The model is horrendously hard and tedious for
> developers to implement using RPH alone, which is why we want WI.

I'm not sure I agree but if we wanted to reduce the boilerplate we could 
provide more methods on DOM. This doesn't have to be tied to the Web 
Intents methods though. The final APIs could look very similar to the 
Web Intents APIs, using the registerProtocolHandler Level 2 backend.
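
As a rough idea of how little such a helper would need to do, here is a 
sketch that folds the window tracking into one object. 
createInvocationTracker is an invented name, not a proposed API, and 
the opener function is passed in so the helper does not depend on 
window.open directly.

```javascript
// Sketch only: "createInvocationTracker" is hypothetical.
// openWindow is whatever opens the service, e.g.
//   function (url) { return window.open(url); }
function createInvocationTracker(openWindow) {
  const pending = new Map(); // opened window -> response callback
  return {
    // Invoke a custom protocol URL and remember who should get the reply.
    invoke(url, onResponse) {
      const handle = openWindow(url);
      pending.set(handle, onResponse);
      return handle;
    },
    // Route an incoming message event to the matching invocation.
    // Returns false for messages from windows we did not open.
    handleMessage(event) {
      const callback = pending.get(event.source);
      if (!callback) {
        return false;
      }
      pending.delete(event.source);
      callback(event.data);
      return true;
    },
  };
}
```

In a page this would be wired up once, with the client's message 
listener calling tracker.handleMessage(event) for every message.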

>
>>
>>>> No changes would be required on the server-side registration of a custom
>>>> protocol handler.
>>>
>>> There are no changes with WI either.  Unless you are talking about a
>>> server side change being updating a template to include the HTML (in
>>> which case we do need one change) and I think this would still happen
>>> with your suggestion as well.
>>
>> I mean no changes at all. A service uses the registerProtocolHandler method
>> in the usual way and as implemented in three browsers already.
>>
>>
>>> Just as an added FYI, part of the intent spec we are currently looking
>>> at is unifying the declaration of RPH and RCH and WI via the intent
>>> tag, so that you can declaratively register a content handler via an
>>> html element maybe of the format<intent type="image/*" />, this being
>>> less verbose than the associated JS, but also benefits from mime-type
>>> globbing etc.... We haven't specced it out yet, but it could be very
>>> powerful.
>>>
>>>> The major advantage of building on HTTP would be two-fold:
>>>>
>>>> 1. Extend the reach of the web democratization revolution. Developers
>>>> could
>>>> invoke custom protocols from e.g. a native application, select a service
>>>> handler and get forward to their chosen web app. A full HTTP-like
>>>> mechanism
>>>> embodies all of the principles of Web Intents but also has the ability to
>>>> reach far beyond the web to 'things' that can be connected to the web and
>>>> the web that can connect to native apps and networked connected devices
>>>> and
>>>> services.
>>>
>>> Are you saying that WI doesn't allow this?  Both RPH and Intents are
>>> focused on letting the user resolve the services that they wish to use
>>> for the action (or in RPH's case the protocol name).  The implication
>>> that a full HTTP model resolved by RPH will solve everything is
>>> patently false.  You want to send data to a native app on the user's
>>> machine now that App has to understand how to process HTTP Post
>>> requests and how to respond to them - I also worry how does the UA
>>> define a consistent API to call the app give it the data and then
>>> expect a response back - can we build this native API/bridge
>>> consistently so the native app developer doesn't have to build a
>>> Chrome Bridge, a FF bridge and Opera bridge etc?
>>
>> To all intents and purposes a web application can treat a custom protocol
>> url as a HTTP resource. Who _doesn't_ know how to deal with that today?
>
> Again given that the intent/protocol hides the implementation from
> client developer and service developer how are the two supposed to
> mediate POST/GET/WS/XMLHttpRequest?  This I feel is a very very very
> hard problem to define but also implement as a publisher.

I touched on this above.

>
>>
>>> Also, registerProtocolHandler is not HTTP like for anything other than
>>> encoding the data in a url, there is no real request, you don't send
>>> across HTTP headers to the native app, you never get a response back.
>>
>> This is absolutely the purpose of the proposal here: to give custom web
>> protocols all of the characteristics of HTTP URLs to enable all of the use
>> cases currently being covered by Web Intents.
>
> We haven't said that API's couldn't partly be covered by RPH, heck we
> have a JavaScript Shim in place today that orchestrates a lot of what
> we are trying to do.  We are saying that the model we are defining has
> a superior model for users and developers around action and task based
> flows than RPH and RCH alone.
>
> WI fits the gap left by RPH and RCH for defining actions on data.  RCH
> is for viewing data of a particular mime-type and RPH is for opening
> document specified by a scheme from a location (or of an ID) - WI is
> for actions on data. We strongly want to push for a declarative method
> of defining intents so that they are discoverable by external agents.

You don't want to discover custom protocols and content handlers 
though? Since the biggest web spiders speak JS at this point, or can at 
least read JS, this doesn't seem like a JS vs HTML issue.

>
> I will say we are not looking to replace RCH and RPH, but instead try
> and unify some of the thinking and if possible give developers a
> declarative method to define the protocol or the data-type in the
> intent tag.
>
> I have done a lot of experiments and apps with RPH and the code
> invariably gets very very messy very quickly for trying to do action
> based work consistently across clients and multiple services that
> don't know about each other.

As mentioned above a lot of the boilerplate can be added as a supplement 
to another proposal if it's too difficult for developers. HTML5 is about 
bringing complicated JS processes in to the browser. You could argue 
that cross-document messaging was hard before Web Messaging came out. 
Streaming data was hard before Web Sockets. I'm proposing a building 
block to more services and uses and any commonly used hacks that devs 
use on the web will sooner or later end up as web standards, built 
natively in to the browser to make their lives easier.

>
>>
>>>> 2. Reverse CORS. By invoking a custom protocol, developers can
>>>> communicate
>>>> with services residing on different origin servers. The authorization for
>>>> this comes from the user implicitly authorizing the cross-origin
>>>> communication by selecting a service provider from their UA-based custom
>>>> protocol handlers list.
>>>
>>> Yes, and Web Intents does this too (in the client), with a constrained
>>> action resolution and the added ability to also filter based on data
>>> type if required.
>>
>> We could look in to data types. This idea isn't going away. This is, in
>> effect, a candidate proposal for 'registerProtocolHandler Level 2'. We could
>> discuss if and how to specify filters based on data type in that initiative.
>> This would be a case of adding a single option to registerProtocolHandler().
>
> I would keep protocol handler just handling what it handles now.  It
> has the ability to open up a window (letting it work in an iframe
> seems a world of pain for security).  The reason why I say this is
> that protocols (really they are schemes) such as cal, tel, http,
> mailto are pointers to an external resource encoded - i.e, this uri
> points to a calendar entry.  I would not want to conflate the protocol
> and the action of "edit", "pick" for example with the protocol API.
>

I don't understand what you mean here. I think pointing towards a resource 
with a unique, structured identifier gives us an equivalent model 
without the mess of having to manage action verbs.

- Rich

>>
>> Thanks for your feedback. This is a good discussion. I feel that we may not
>> solve anything on this Web Intents list but I wanted to communicate how this
>> problem could be solved by improving on what we already have in the web
>> platform without needing to add more APIs or start any new initiatives.
>>
>> Cheers, Rich
>>
>>
>>>
>>>> br/ Rich
>>>>
>>>>> -mike
>>>>>
>>>>>
>>>>> On Jan 15, 2012, at 10:23 AM, Paul Kinlan wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> This was something that I started to document under
>>>>>> http://webintents.org/subscribe - the intents discovery mechanism in
>>>>>> the spec doesn't preclude a UA from detecting this and allowing the
>>>>>> user to invoke an action to subscribe to the feed using their
>>>>>> preferred application.
>>>>>>
>>>>>> P
>>>>>>
>>>>>> On Fri, Jan 13, 2012 at 4:48 AM, Mike Kelly<mikekelly321@gmail.com
>>>>>> <mailto:mikekelly321@gmail.com>>    wrote:
>>>>>>
>>>>>>     Hi,
>>>>>>
>>>>>>     I was wondering whether an example of 'web intent' behaviour has
>>>>>>     already existed for some time:
>>>>>>
>>>>>>     The example I am thinking of is driven by atom/rss links in the head
>>>>>>     of HTML pages, i.e. an html page containing the following link in
>>>>>> the
>>>>>>     head of the document..
>>>>>>
>>>>>>     <link rel="alternate" type="application/rss+xml" href="...." />
>>>>>>
>>>>>>     ... this causes a browser (e.g. Firefox) to present the user with
>>>>>> the
>>>>>>     option to 'Subscribe to This Page' where the user can fulfil their
>>>>>>     'subscription intent'.
>>>>>>
>>>>>>     Would this be considered an equivalent of a web intent?
>>>>>>
>>>>>>     Cheers,
>>>>>>     Mike
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Paul Kinlan
>>>>>> Developer Advocate @ Google for Chrome and HTML5
>>>>>> G+: http://plus.ly/paul.kinlan
>>>>>> t: +447730517944
>>>>>> tw: @Paul_Kinlan
>>>>>> LinkedIn: http://uk.linkedin.com/in/paulkinlan
>>>>>> Blog: http://paul.kinlan.me<http://paul.kinlan.me/>
>>>>>> Skype: paul.kinlan
>>>>>>
>>>
>>>
>
>
>
Received on Friday, 20 January 2012 20:31:20 UTC