Re: Enhancing Screen Capture for WebRTC

Hi Adam,

IIUC, you're proposing that a stream of events derived from keystrokes be
exposed to capturing applications. Am I right? You have probably considered
the inherent risks, such as how this could effectively be a keylogger, and
users could be tricked into exposing passwords. (Users might also expose
private information by oversight while using non-malicious applications.) I
don't think a checkbox - as you suggest - would be sufficient protection
against such risks. If you have thought of more robust protection for the
user, please let me know.

Some additional thoughts:
* For tab-capture of a tightly-coupled application, it is possible for the
captured tab to report its various events to the capturing tab, which can
in turn annotate the video with this information, record it separately in a
format of its choosing, etc.
* For window- and monitor-capture, I think this is a power-feature best
reserved for native apps - assuming the relevant OS even allows native apps
to monitor arbitrary mouse/keyboard events.
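To make the tab-capture point concrete, here is a minimal sketch of how a cooperating captured tab could forward its input events to the capturing tab using existing APIs. The BroadcastChannel name, the record shape, and the password-field redaction are all illustrative assumptions on my part, not a proposed standard:

```javascript
// Convert a DOM input event into a small serializable record.
// Redacts key values typed into password fields (an illustrative
// mitigation for the keylogger concern, not a complete one).
function toEventRecord(ev) {
  const isPassword = ev.target && ev.target.type === "password";
  return {
    kind: ev.type,                              // e.g. "keydown", "click"
    time: ev.timeStamp,                         // ms since page load
    key: isPassword ? "[redacted]" : ev.key ?? null,
    x: ev.clientX ?? null,
    y: ev.clientY ?? null,
  };
}

// In the captured tab (browser-only): publish annotated events so the
// same-origin capturing tab can listen on the same channel and, e.g.,
// overlay them on the captured video or log them in its own format.
// The channel name "capture-events" is a hypothetical convention.
function publishEvents(channelName = "capture-events") {
  const channel = new BroadcastChannel(channelName);
  for (const type of ["keydown", "click"]) {
    window.addEventListener(type, (ev) =>
      channel.postMessage(toEventRecord(ev))
    );
  }
  return channel;
}
```

The capturing tab would open a BroadcastChannel with the same name and handle incoming records in its `message` listener; no browser changes are needed for this tightly-coupled case.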

Thanks,
Elad

On Fri, Apr 15, 2022 at 5:02 AM Adam Sobieski <adamsobieski@hotmail.com>
wrote:

> WebRTC Working Group,
>
>
>
> Hello. I would like to describe and to express interest in some new
> features for WebRTC and its screen capture capabilities.
>
>
>
> WebRTC screen capture – or a mode thereof – could include user-input and
> application events, e.g., mouse, keyboard, touchscreen, and stylus events,
> applications’ menu-related events and other application events. Beyond
> streaming video of dynamic screen content, accompanying data tracks could
> include events relevant to screen-captured software applications.
>
>
>
> As envisioned, end-users would be able to opt into such an enhanced
> screen-capture mode during the initialization and configuration of
> screen-capturing, e.g., by selecting a checkbox with accompanying text
> asking the end-user whether they desire to additionally stream user-input
> and application events.
>
>
>
> I present a use case below. I would like to note that the desired
> features would enable a larger set of use cases than the one indicated.
>
>
>
> A use case is that of intelligent tutoring systems which can teach
> end-users how to better utilize software applications, e.g., office
> software or CAD software. End-users could connect to intelligent tutoring
> systems via WebRTC and perform exercises while interacting with the
> tutoring systems, receiving assessment, instruction, and task-relevant
> hints.
>
>
>
> Without the features under consideration, server-side
> computer-vision-based processing would be required to obtain the visible
> application-specific events from video streams, e.g., end-users opening
> menus and making use of application functionalities.
>
>
>
> In (Grossman, Matejka, & Fitzmaurice, 2010), the authors state that
> “storing a document's workflow history, and providing tools for its
> visualization and exploration, could make any document a powerful learning
> tool.”
>
>
>
> In (Bao, Li, Xing, Wang, & Zhou, 2015), the authors present a
> computer-vision-based video-scraping technique to “automatically
> reverse-engineer time-series interaction data from screen-captured videos.”
>
>
>
> In (Frisson, Malacria, Bailly, & Dutoit, 2016), the authors describe a
> general-purpose tool for observing application usage and analyzing users’
> behaviors, combining computer-vision-based analyses of video-recordings
> with the collection of low-level interactions.
>
>
>
> In (Sadeghi, Dargon, Rivest, & Pernot, 2016), the authors present a
> framework for fully capturing processes of computer-aided design and
> engineering.
>
>
>
> Thank you. I hope that these features for enhancing WebRTC and its
> screen-capturing capabilities are also of some interest to you.
>
>
>
>
>
> Best regards,
>
> Adam Sobieski
>
>
>
> *REFERENCES*
>
>
>
> Grossman, Tovi, Justin Matejka, and George Fitzmaurice. "Chronicle:
> capture, exploration, and playback of document workflow histories." In
> Proceedings of the 23rd annual ACM symposium on User interface software and
> technology, pp. 143-152. 2010.
>
>
>
> Bao, Lingfeng, Jing Li, Zhenchang Xing, Xinyu Wang, and Bo Zhou. "Reverse
> engineering time-series interaction data from screen-captured videos." In
> 2015 IEEE 22nd International Conference on Software Analysis, Evolution,
> and Reengineering (SANER), pp. 399-408. IEEE, 2015.
>
>
>
> Frisson, Christian, Sylvain Malacria, Gilles Bailly, and Thierry Dutoit.
> "InspectorWidget: A system to analyze users behaviors in their
> applications." In Proceedings of the 2016 CHI Conference Extended Abstracts
> on Human Factors in Computing Systems, pp. 1548-1554. 2016.
>
>
>
> Sadeghi, Samira, Thomas Dargon, Louis Rivest, and Jean-Philippe Pernot.
> "Capturing and analysing how designers use CAD software." (2016).
>
>
>

Received on Thursday, 28 April 2022 08:31:56 UTC