Re: Security of cross-origin audio from Harald Alvestrand on 2013-05-31 (public-webrtc@w3.org from May 2013)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Fri, 31 May 2013 10:15:26 +0200
To: public-webrtc@w3.org
Message-ID: <51A85C1E.7050607@alvestrand.no>

When addressing deliberate tampering, I'm not sure the case for audio is 
all that different from the case for video; video is also subject to 
copying through the "analog hole" much bemoaned by Hollywood DRM 
advocates - or by pointing a camera at the screen.

Audio does have a much richer potential for accidental retransmission, 
though.

Text in security considerations seems to be appropriate; I can think of 
some technical measures one could take, but they're mostly too grotesque 
(too DRM-like) to be worth considering.


On 05/30/2013 05:06 PM, Martin Thomson wrote:
> When video tracks are combined, the output is still confined to a
> certain set of pixels.  We have a good set of rules with respect to
> cross origin sampling of video and images (any affected pixels can't
> be accessed), but cross origin audio seems to need something broader.
>
> Audio has the wonderful distinction of being very hard to place.
> Spatial effects aside, all audio, regardless of origin, goes to the
> same output.
>
> Maybe a site cannot access the microphone if the speakers are playing
> audio from another origin.  I've noticed that some WebRTC
> implementations have echo cancellation that would not be effective at
> removing audio from sampled output (and this was with headphones).
> I'm certain that this would not be sufficient for any security
> guarantees.
>
> It would be a failure on our part if a site that has access to your
> microphone, but not a remote stream, could recover that remote stream.
> As it stands, I believe this to be possible.  [1]
>
> Going further, it's going to be difficult for a user to distinguish
> between stuff Joe said and the crap that the site is pumping out the
> speakers.  In WebRTC cases where the browser is required to make
> assertions about the origin of audio and video (not data!), it would
> be bad if the veracity of these assertions could be compromised simply
> by playing other noises and tampering with output levels.  After all,
> if the site knows who you are talking to, generating random expletives
> might be easy.
>
> At a minimum, this needs some security considerations.  Though I would
> be disappointed if that is all that was done.  I would prefer to see
> some mechanisms proposed to address these issues.
>
> --Martin
>
> [1] Bugs make this a lot worse.  I have two sets of headphones
> attached to different audio devices on my computer.  In one example, I
> was using another application to play audio to a completely separate set of
> headphones while testing a WebRTC call.  I was able to detect
> phase-shifted audio on the headphones that were being used for the
> WebRTC call.
>

Received on Friday, 31 May 2013 08:15:59 UTC