Action 22: Voice Browser Use Case...

Action 22:  Produce Voice Browser use case... is assigned to me. 

I wasn't involved in the discussion at the time this action was 
generated, so apologies that I didn't know I had an action and that I'm 
going in a little blind on the context for the request.  That said, let 
me lay out some of the differences I see between the voice web 
experience and a desktop browser as they pertain to security.

*Special Considerations for Web Security in a Voice Browser Context*
/Differences/

    * Interaction with a voice browser is often entirely transparent to
      the end user.  He or she has no idea whether the interaction is
      with a voice browser or any other automated phone application.

    * Voice browsers typically have no standard chrome whatsoever.  The
      entire user experience is defined by the application markup. 
      There is some standard context information provided to the
      application markup (callerid, dialed number) which can be found in
      5.1.4 of the VoiceXML 2.0 specification [1].

    * Voice browsers have no URL bar.  All content must be navigated to
      via hyperlinking.  Bookmarking would be an application-specific
      feature and is not built into the browser metaphor. 

    * A highly interconnected voice web is technically feasible, but
      does not truly exist today.  Applications live in their own space
      and do not contain links outside of their domain.

    * For latency reasons, voice browser deployments often cache
      presentation markup more aggressively and keep dynamic data more
      separate from presentation data.

    * Trust is typically established via the phone number the caller
      dialed.  That said, there is no inherent reason to trust a phone
      number.  Trust comes from the credibility of the source of the
      phone number (corporate website, phonebook, toll-free directory
      assistance).  Outbound calls are inherently less trustworthy.

    * The search engines in this space are 411 services.  411 data is
      typically maintained by telcos through their white pages and
      yellow pages, which usually involves a direct relationship with
      the business or individual.  It is more difficult to get a
      spoofed listing published.

    * The phone network tends to be more centralized, with
      substantially greater regulatory oversight and legal precedent.
      For instance, the national do-not-call list generally works where
      attempts to control email spam haven't.

    * The costs of answering a spoofed 800# or placing malicious
      outbound calls are substantially higher than the cost of
      publishing a spoof website or generating an email.
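
To make the trust point above concrete, here is a rough Python sketch
of how a caller's trust decision might be modeled.  It is only an
illustration under my own assumptions: the Call class, the
CREDIBLE_SOURCES set, and caller_can_trust are hypothetical names, and
treating every outbound call as untrusted is a simplification of "less
trustworthy."

```python
from dataclasses import dataclass

# Sources named in the list above as lending credibility to a number.
CREDIBLE_SOURCES = {
    "corporate website",
    "phonebook",
    "toll-free directory assistance",
}


@dataclass
class Call:
    number: str             # the phone number involved in the call
    number_source: str      # where the caller found the number
    outbound: bool = False  # True when the application placed the call


def caller_can_trust(call: Call) -> bool:
    """Trust follows the source of the number, never the number itself."""
    if call.outbound:
        # Simplification: the callee has no number source to check.
        return False
    return call.number_source in CREDIBLE_SOURCES
```

Note that the number itself never enters the decision; only where it
came from and who placed the call.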

/Similarities/

    * Voice browsers run in a different trust zone than the web services
      and databases.  Billions of calls a year are handled by voice
      browsers operated by one company on behalf of another company. 
      These browsers interpret many different companies' voice
      applications.  As a result a single voice browser may be able to
      access content and services that the application running on that
      voice browser is not allowed to access.  As a consequence, all the
      same sandboxing requirements apply.

    * Protecting user data and preventing cross-session data leaks is
      equally critical.

    * Voice browsers make heavy use of ECMAScript.

    * User authentication is still the responsibility of the web site
      and not the browser.  Cookies are employed.  Authentication
      techniques differ because speech recognition cannot effectively
      recognize the random strings of letters, digits, caps, and
      punctuation typically found in a typed password.  Biometric
      identification/authentication (voice prints) is more easily
      integrated into the user experience, though it is not widely
      deployed.
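
The sandboxing requirement in the first bullet above can be sketched as
a per-application fetch check inside a hosted voice browser.  This is a
minimal Python sketch under my own assumptions, not a description of
any real platform: the application IDs, hostnames, ALLOWED_HOSTS table,
and may_fetch function are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical table: hosts each hosted voice application may fetch from.
ALLOWED_HOSTS = {
    "bank-app": {"voice.bank.example"},
    "airline-app": {"ivr.airline.example"},
}


def may_fetch(app_id: str, url: str) -> bool:
    """Return True only if this application may fetch this URL.

    The browser process itself may be able to reach every host in the
    table; this check is what keeps one company's markup away from
    another company's services.
    """
    host = urlsplit(url).hostname
    return host is not None and host in ALLOWED_HOSTS.get(app_id, set())
```

The point of the design is that the decision is keyed on the
application, not on what the shared browser can reach on the network.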


--Brad

[1] VoiceXML 2.0 Standard Session Variables 
http://www.w3.org/TR/voicexml20/#dml5.1.4

Received on Thursday, 30 November 2006 15:46:37 UTC