- From: Melvin Carvalho <melvincarvalho@gmail.com>
- Date: Mon, 5 Jan 2026 17:15:17 +0100
- To: Manu Sporny <msporny@digitalbazaar.com>
- Cc: W3C Credentials CG <public-credentials@w3.org>
- Message-ID: <CAKaEYh+EA7kq+Twz=F_vh-T=ty0=8=b_9BsihAcA7uC+7eFytQ@mail.gmail.com>
On Mon, 5 Jan 2026 at 16:41, Manu Sporny <msporny@digitalbazaar.com> wrote:

> Hey folks,
>
> The purpose of this email is to document how our latest
> transcription/archival system works. This is an attempt to reduce the
> amount of tribal knowledge needed to operate the infrastructure for
> our community.
>
> Some of you might have noticed that our meeting transcription summary
> emails got stuck during the last half of December 2025. This was due
> to Google deprecating the gemini-2.0-flash-lite model causing the
> summary emails to fail. The summarizer has been updated to
> gemini-2.5-flash, so everything should be working again. The archival
> process didn't fail; meetings continued to be archived. The step that
> generates the transcript summary and sends it to the mailing list is
> the thing that failed. I expect the AI parts of the infrastructure to
> keep breaking due to the "move fast and break things" nature of the
> companies deploying LLMs. That was the fourth such breaking change
> made just last year to the APIs -- we'll continue to fix things as
> they get broken; the instability pain is worth having to not use human
> scribes.
>
> While fixing that, I also took some time to reduce some "bus factor"
> in the infrastructure. The archival process has been running on my
> personal machine for the last year (so I could debug issues as they
> arose, and because the way these Google/LLM APIs work were a total
> pain to put into Github Actions). All that said, we now have a Github
> Action that will run every weekday at 6:30pm ET to perform any meeting
> archival for the day. The process, with some successful runs, can be
> found here:
>
> https://github.com/w3c-ccg/w3c-ccg-archiver/actions/workflows/archive.yaml
>
> That uses the general CG Archival tool that can be found here:
>
> https://github.com/w3c-ccg/cg-archiver/
>
> We need better documentation on the whole setup, but it's all
> automated and running in Github Actions now. In theory, someone else
> could pick it up and improve it from here.
>
> No action is required by anyone at this point. Just providing an
> update in case others wanted to improve the current set up.
>

Thanks a lot for sharing, Manu.

I'd like to start using things like this in other groups. Do you have
thoughts on what is currently the best transcription service?

I had some quite good experience with:

https://elevenlabs.io/audio-to-text

But it was not the best at figuring out who was talking.

Would love to hear experiences on this topic, as almost every group at
the W3C needs it.

>
> -- manu
>
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> https://www.digitalbazaar.com/
>
>
Received on Monday, 5 January 2026 16:15:34 UTC