286 reads

Quarantine Day 22 Amirite? My Take on Zoombot

by Adnan AgaApril 9th, 2020

Too Long; Didn't Read

This morning I stumbled onto a Gizmodo article talking about a guy (Matt Reed) who made a digital AI twin that replaced him on a Zoom call and would participate in conversation. After doing some research and digging into Matts’ code, I made a version of his own. I detect keywords and playback time using the same Artyom library, but instead of images, I use prerecorded clips. I then added all the clips I recorded above into OBS. I created a simple webpage that would listen for audio from Google Hangouts or Zoom and would tell OBS what scene what to play based on the questions I was asked.

People Mentioned

Company Mentioned

featured image - Quarantine Day 22 Amirite? My Take on Zoombot

EDIT: Edit: I put the instructions on Github!

Added my code to This morning I stumbled onto a Gizmodo article talking about a guy (Matt Reed) who made a digital AI twin that replaced him on a Zoom call and would participate in conversation.

I thought this was awesome and wanted to take it a step further — Matt’s system works by listening to what was said on the call and parsing out the key word/phrases with this awesome Artyom Library. Then it uses Artyom’s Speech Synthesizer to speak out the responses while changing the image to a different head pose or mouth position.

Pretty damn cool.

Then I started to think of all the things I usually say on calls. Especially now that I’m working from home a lot more — stuff like

“Hey are you still there”

“Your connections’ breaking up”

“What do you think”

“Do you agree ?”

So I thought of what my responses would be to those. After doing some research and digging into Matts’ code. I made a version of my own. I detect keywords and playback time using the same Artyom library, but instead of images, I use prerecorded clips.

I started off by recording a few clips of myself, I needed a background video so I recorded a 5 second clip of myself just watching the screen. Then I recorded my answers to some of those questions above.

I then tried to find a virtual camera solution that would let me play these clips. I ultimately settled on OBS; however setting that up ended up being a little more work then I thought — luckily johnboiles had me covered with a Plugin for OBS that let me use it as a virtual camera. I then added all the clips I recorded above into OBS.

Next I created a simple webpage that would listen for audio from Google Hangouts or Zoom and would tell OBS what scene what to play based on the questions I was asked during the call using an obs websocket library

After some re-records and code suggestions from Matt himself! I called a friend (Thanks Taylor Tabb!), changed my source to the virtual camera feed and started recording!