Microsoft API Proposal

Today, we are pleased to announce Microsoft’s contribution of
the CU-RTC-Web proposal to the W3C WebRTC working group.

Thanks in no small part to the rapid improvements in broadband
infrastructure over the last few years, it is now possible to use the
digital backbone of the Internet to create experiences that, until
recently, required dedicated media and networks.

Inexpensive, real-time video conferencing is one such experience.

The Internet Engineering Task Force and the World Wide Web Consortium
created complementary working groups to bring these experiences to the most
familiar and widespread application used to access the Internet: the web
browser. The goal of this initiative is to add a new level of interactivity
for web users with real-time communications (WebRTC) in the browser.

While the overarching goal is simple to describe, there are several
critical requirements that a successful, widely adoptable WebRTC browser
API will need to meet:

* Honoring key web tenets – The Web favors stateless interactions, which do
not saddle either party to a data exchange with the responsibility to
remember what the other did or expects. Doing otherwise is a recipe for
extreme brittleness in implementations; it also raises development cost
considerably, which reduces the reach of the standard itself.

* Customizable response to changing network quality – Real-time media
applications have to run on networks with a wide range of capabilities,
varying in bandwidth, latency, and packet loss. These characteristics can
also change while an application is running. Developers should be able to
control how the user experience adapts to fluctuations in communication
quality. For example, when communication quality degrades, the developer
may prefer to favor the video channel, favor the audio channel, or suspend
the app until acceptable quality is restored. An effective protocol and API
should provide developers with the tools to tailor the application response
to the exact needs of the moment.

* Ubiquitous deployability on existing network infrastructure –
Interoperability is critical if WebRTC users are to communicate with the
rest of the world: with users on different browsers, VoIP phones, and
mobile phones, from behind firewalls and across routers and equipment that
are unlikely to be upgraded to the current state of the art anytime soon.

* Flexibility in its support of popular media formats and codecs as well as
openness to future innovation – A successful standard cannot be tied to
individual codecs, data formats, or scenarios. These may soon be supplanted
by newer versions, which would make a tightly coupled standard obsolete
just as quickly. The right approach is instead to support multiple media
formats and to bring the bulk of the logic to the application layer,
enabling developers to innovate.
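
To make the second requirement concrete, the choice between favoring
video, favoring audio, or suspending can be pictured as a small policy
function owned by the application. The following is only a minimal sketch
under stated assumptions: the NetworkSample and Policy shapes, the
thresholds, and the action names are hypothetical and are not taken from
any WebRTC proposal.

```typescript
// Hypothetical measurements an application might observe at runtime.
interface NetworkSample {
  bandwidthKbps: number;
  latencyMs: number;
  packetLossPct: number;
}

// Hypothetical application-level responses to degraded quality.
type Action = "send-both" | "favor-audio" | "favor-video" | "suspend";

// Thresholds chosen by the developer, not by the browser.
interface Policy {
  minKbpsForBoth: number; // below this, only one channel is favored
  minKbpsForAny: number;  // below this, the app suspends
  maxLossPct: number;     // above this, the app suspends
  maxLatencyMs: number;   // above this, the app suspends
  prefer: "audio" | "video";
}

function chooseAction(sample: NetworkSample, policy: Policy): Action {
  const unusable =
    sample.bandwidthKbps < policy.minKbpsForAny ||
    sample.packetLossPct > policy.maxLossPct ||
    sample.latencyMs > policy.maxLatencyMs;
  if (unusable) {
    return "suspend";
  }
  if (sample.bandwidthKbps < policy.minKbpsForBoth) {
    return policy.prefer === "audio" ? "favor-audio" : "favor-video";
  }
  return "send-both";
}
```

A voice-first application would set prefer to "audio", while a
surveillance-style application might set it to "video"; the point is that
the decision stays in application code.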

While the existing proposal is a useful start at realizing the WebRTC
vision, we feel that it falls short of meeting these requirements. In
particular:

* No ubiquitous deployability: it shows no signs of offering real-world
interoperability with existing VoIP phones and mobile phones, from behind
firewalls and across routers, and instead focuses on video communication
between web browsers under ideal conditions. It does not allow an
application to control how media is transmitted on the network. Moreover,
implementing innovative, real-world applications like security consoles,
audio streaming services, or baby monitoring through this API would be
unwieldy, assuming it could be made to work at all. A WebRTC standard must
equip developers with the ability to implement all scenarios, even those we
haven’t thought of.

* No fit with key web tenets: it is inherently not stateless, as it takes a
significant dependency on the legacy of SIP technology, which is a
suboptimal choice for use in Web APIs. In particular, the negotiation model
of the API relies on the SDP offer/answer model, which forces applications
to parse and generate SDP in order to effect a change in browser behavior.
An application can perform certain changes only when the browser is in
specific states, which further constrains options and increases complexity.
Furthermore, the set of permitted transformations to SDP is constrained in
non-obvious and undiscoverable ways, forcing applications to resort to
trial and error and/or browser-specific code. All of this added complexity
is an unnecessary burden on applications, with little or no benefit in
return.
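
As an illustration of what "parse and generate SDP" can mean in practice,
the sketch below rewrites an SDP string to cap video bandwidth by placing a
standard b=AS: line in the video media section. It is a hedged example, not
code from either proposal: the helper name setVideoBandwidth is
hypothetical, and whether a given browser honors the edited line is exactly
the kind of behavior the text above describes as hard to discover.

```typescript
// Hypothetical helper: set or replace the "b=AS:<kbps>" bandwidth line in
// the first video section of an SDP blob. Assumes CRLF line endings.
function setVideoBandwidth(sdp: string, kbps: number): string {
  const lines = sdp.split("\r\n");

  // Locate the video media section: from "m=video" to the next "m=" line.
  const start = lines.findIndex((l) => l.startsWith("m=video"));
  if (start === -1) {
    return sdp; // no video section, nothing to change
  }
  let end = lines.findIndex((l, i) => i > start && l.startsWith("m="));
  if (end === -1) {
    end = lines.length;
  }

  // Drop any existing b=AS: line so repeated calls do not stack up.
  const section = lines
    .slice(start, end)
    .filter((l) => !l.startsWith("b=AS:"));

  // Within a media section, b= comes after the optional i= and c= lines.
  let at = 1;
  while (
    at < section.length &&
    (section[at].startsWith("i=") || section[at].startsWith("c="))
  ) {
    at++;
  }
  section.splice(at, 0, `b=AS:${kbps}`);

  return [...lines.slice(0, start), ...section, ...lines.slice(end)].join(
    "\r\n"
  );
}
```

Even this small change requires the application to know SDP line ordering
and section boundaries, which is the burden the bullet above objects to.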

The Microsoft Proposal for Customizable, Ubiquitous Real Time Communication
over the Web

For these reasons, Microsoft has contributed
the CU-RTC-Web proposal that we believe does address the four key
requirements above.

* This proposal adds a real-time, peer-to-peer transport layer that gives
web developers greater flexibility and transparency, putting them directly
in control of the experience they provide to their users.

* It dispenses with the constraints imposed by unnecessary state machines
and complex SDP and provides simple, transparent objects.

* It elegantly builds on and integrates with the existing
W3C getUserMedia API, making it possible for an application to connect a
microphone or a camera in one browser to the speaker or screen of another
browser. getUserMedia is an increasingly popular API that Microsoft has
been prototyping and that is applicable to a broad set of applications with
an HTML5 client, including video authoring and voice commands.

You can find this proposal at:

http://html5labs.com/cu-rtc-web/cu-rtc-web.htm

Received on Monday, 6 August 2012 19:41:05 UTC