[INFRASTRUCTURE] Audio and Transcription upgrades

What follows are a few upgrades made to the Jitsi system that will hopefully
fix some of the audio issues that some of us have been experiencing over the
past two months.

Most of these problems have been caused by changes to how browsers manage
audio permissions when playing back audio. Others are the result of new
browser bugs introduced by those changes. When and how web pages are allowed
to call certain APIs will be an area of experimentation by the browser vendors
over the next couple of years as they try to tighten the privacy and security
profile of web browsers.

Meeting Join Screen
===================

A meeting join screen has been enabled where you can select your microphone,
speakers, and camera before joining any CCG meeting. We had this disabled
before because it was not necessary.

New upgrades to Chrome now require an individual to interact with a web page
before that web page can play sound to that person. This was done to prevent
ads from auto-playing audio commercials, but the side effect was that Jitsi is
now blocked (in some cases, depending on your browser version and OS platform)
from playing audio when you don't interact with the page first.

Placing this join screen before a call ensures that everyone will interact
with the web page, thus enabling Jitsi to get access to the audio device and
play back what other people are saying on the call.
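
For the curious, the browser rule described above can be sketched roughly
like this. This is not Jitsi's actual code: `unlockAudioOnJoin`,
`ResumableAudio`, and `Clickable` are made-up names for illustration. The only
browser behavior assumed is that an `AudioContext` created before any user
interaction starts out "suspended" and can be resumed by calling `resume()`
from inside a click handler.

```typescript
// Stand-ins for a browser AudioContext and a "Join" button element, kept
// minimal so the sketch does not depend on DOM typings.
interface ResumableAudio {
  readonly state: string; // "suspended", "running", or "closed"
  resume(): Promise<void>;
}

interface Clickable {
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean },
  ): void;
}

// Hypothetical helper: resume audio output the first time the user clicks
// the join button, so the browser treats playback as user-initiated.
function unlockAudioOnJoin(audio: ResumableAudio, joinButton: Clickable): void {
  joinButton.addEventListener(
    "click",
    () => {
      if (audio.state === "suspended") {
        // resume() returns a promise; log a failure rather than throwing.
        audio.resume().catch((err) => console.warn("audio resume failed", err));
      }
    },
    { once: true },
  );
}
```

The point of the join screen is simply to guarantee that a click like this
happens before anyone needs to hear anything.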

If this still doesn't work for you, your audio drops out half-way through the
call, or you talk but other people can't hear you -- just reload the page;
that should fix the audio in most cases. Unfortunately, since these are
largely sporadic browser bugs, a Jitsi upgrade won't help us here.

Transcription Upgrades
======================

The auto-transcription feature is working much better now that we're using a
more financially costly audio AI model. Google has a great business model here
-- "Those are some pretty words you just said... it'd be a real shame if
something were to happen to 'em, pal." :P

Err, I mean, Google is wonderful and there is nothing wrong with charging good
money for a service that provides great value.

The transcriptions seem to be good enough to replace human scribes at this
point. The quality is not as good as a /good/ human scribe, and the bots
capture every single word that is said (probably too much), but the days of
human transcription seem to be numbered.

I've optimized the transcription bot so it stops recording every single
utterance, so one- and two-word quips are not recorded now. Removing those was
most of the "clean up" work required for auto-transcribed minutes these
days... which takes less time (at least for me) than dealing w/ a human scribe
(the output from meeting to meeting is far more consistent now).
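
To give a feel for the kind of filtering involved, here is a minimal sketch.
It is an illustration only, not the bot's real code: the `Utterance` shape,
the `dropShortUtterances` name, and the whitespace-based word count are all
assumptions. The threshold of three words just mirrors "one- and two-word
quips are not recorded".

```typescript
// Hypothetical shape for one transcribed utterance.
interface Utterance {
  speaker: string;
  text: string;
}

// Assumed threshold: anything shorter than three words is treated as a quip.
const MIN_WORDS = 3;

// Count whitespace-separated words; an empty or blank string has zero words.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Keep only utterances with at least `minWords` words.
function dropShortUtterances(
  utterances: Utterance[],
  minWords: number = MIN_WORDS,
): Utterance[] {
  return utterances.filter((u) => countWords(u.text) >= minWords);
}

// Usage: only the last utterance survives the filter.
const kept = dropShortUtterances([
  { speaker: "Alice", text: "Yeah." },
  { speaker: "Bob", text: "Sounds good." },
  { speaker: "Carol", text: "I can take that action item." },
]);
```

A plain word count is crude (it would also drop a short but meaningful "I
object"), so treat the threshold as something to tune rather than a rule.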

We still have a few issues with the system... like the fact that Apple
devices don't seem to care about the Jitsi server telling them to "PLEASE STOP
FIRE-HOSING ME WITH SCREEN SHARE DATA!!!"... So, doing a dual-screen Apple 8K
desktop screen share w/ the system still might bring it to its knees. When you
screen share, please just share a reasonably sized HD-quality window instead
of both screens of your ultra-wide dual monitor 4K desktop setup at home. :P

We're regularly doing calls with 30 people on them now and the system load
seems to be stable, even with screen sharing, screen recording, and
auto-transcription enabled.

If you have additional concerns or problems with Jitsi, please do log the
issues here:

https://github.com/w3c-ccg/community/issues/229

-- manu

-- 
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
News: Digital Bazaar Announces New Case Studies (2021)
https://www.digitalbazaar.com/

Received on Saturday, 19 February 2022 20:22:14 UTC