19 December 2019 / Screen Recording

Chrome Tutorial: Recording The Screen With Both Microphone Audio AND System Sounds

A few years back, getUserMedia brought us direct access to the camera and microphone.

This year, getDisplayMedia brought us standards-compliant screen capture in Chrome 72+ and Firefox 66+.

Starting with Chrome 74 - released back in April 2019 - the user can also toggle capturing the system sounds when sharing their screen.

In this post, we will focus on how to merge system sounds with audio captured by a microphone into a single audio track, resulting in a screen recording with sound from both sources.

If you're more interested in the code for screen recording itself, Mozilla's MDN has a great page on Using the Screen Capture API, which I strongly recommend you go through.

Obviously, we will be using getUserMedia() to capture the microphone, getDisplayMedia() to capture the screen and system sounds, and the MediaStream Recording API to record everything together.

Capturing the microphone input

Getting the audio from the microphone is straightforward with getUserMedia():

navigator.mediaDevices.getUserMedia({audio:true}).then(function(micStream) {
    /* use the stream */
}).catch(function(err) {
    /* handle the error */
});
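
If the user denies microphone access, the recording can still go ahead with system sounds only. Here is a minimal sketch of that fallback. The helper name getMicStreamOrNull is my own (it is not part of any API), and it takes the mediaDevices object as a parameter so it can be tried outside the browser:

```javascript
// Hypothetical helper (not a browser API): resolves with the microphone
// stream, or with null if access is denied or no microphone is available.
function getMicStreamOrNull(mediaDevices) {
    return mediaDevices.getUserMedia({ audio: true }).catch(function (err) {
        // Typical names are NotAllowedError (denied) and NotFoundError (no device)
        console.warn('Microphone unavailable, continuing without it:', err.name);
        return null;
    });
}
```

In the browser you would call it as getMicStreamOrNull(navigator.mediaDevices).then(function(micStream) { ... }) and treat a null micStream as "no microphone".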

Capturing the screen with system sounds

This is just as easy. However, you need to be aware of how the feature works as it needs the user's action and consent.

The code is simple. To request audio as well, set audio:true in the constraints object:

var constraints = { video: true, audio: true };
navigator.mediaDevices.getDisplayMedia(constraints).then(function(screenStream){
    /* use the stream */
}).catch(function(err) {
    /* handle the error */
});

The constraints object passed into getDisplayMedia() is a DisplayMediaStreamConstraints object. Through it, you can also specify particular constraints for both video and audio, like whether or not the cursor should show up in the recording. Some of the constraints are not yet fully compatible with screen capture and will be ignored. For example, from my experience, the audio volume constraint is currently ignored. I won't go into any further details because it goes beyond the scope of this post.

With the code in place, Chrome will present the user with the option to share their audio. It is not a given: the user needs to toggle sharing system sounds on. Here's the UI:

[Image: Chrome UI when attempting to share your screen, with the option to share system audio highlighted]

Also, capturing system sounds only works in a few cases:

On Chrome on Windows, the user can share system sounds when recording the entire screen, and Chrome tab sounds when recording just a tab
On Chrome on macOS and Linux, the user can only share Chrome tab sounds, and only when recording a Chrome tab
On Firefox, capturing system sounds does not work yet
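
Because the user may leave the audio option off, or the browser may not offer it at all, it is worth checking whether the screen stream actually contains an audio track before relying on it. A minimal sketch, assuming a hypothetical helper named hasAudioTrack (my own name, not a browser API):

```javascript
// Hypothetical helper: true if the stream exists and carries at least
// one audio track, false otherwise.
function hasAudioTrack(stream) {
    return Boolean(stream) && stream.getAudioTracks().length > 0;
}
```

In the then() callback of getDisplayMedia() you could call hasAudioTrack(screenStream) and, if it returns false, tell the user that the recording will not include system sounds.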

Merging the microphone input with system sounds

We now need to combine the audio track from the screen stream that we got earlier with the audio track from the microphone stream.

The reason is that we need a single audio track in order to record it with the MediaStream Recording API.

But how can we do this? WebAudio API to the rescue! WebAudio can combine multiple audio streams and pipe them to a single destination/audio track. This is perfect for our use case.

The following code snippet demonstrates this. I've commented on each step for clarity:

//create new Audio Context
var context = new AudioContext();

//create new MediaStream destination. This is where our final stream will be.
var audioDestination = context.createMediaStreamDestination();

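//check that the user actually shared audio and only then add it
//(createMediaStreamSource() throws if the stream has no audio track)
if (screenStream && screenStream.getAudioTracks().length > 0) {
    //get the audio from the screen stream
    const systemSource = context.createMediaStreamSource(screenStream);
    //set its volume (0.0 is muted, 1.0 leaves it unchanged)
    const systemGain = context.createGain();
    systemGain.gain.value = 1.0;
    //add it to the destination
    systemSource.connect(systemGain).connect(audioDestination);
}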

//check to see if we have a microphone stream and only then add it
if (micStream && micStream.getAudioTracks().length > 0) {
    //get the audio from the microphone stream
    const micSource = context.createMediaStreamSource(micStream);
    //set its volume
    const micGain = context.createGain();
    micGain.gain.value = 1.0;
    //add it to the destination
    micSource.connect(micGain).connect(audioDestination);
}

Once this is done, we can get hold of the combined audio stream with the stream attribute of the MediaStream destination object: audioDestination.stream.
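
The same steps can be wrapped in a small reusable function. This is only a sketch: mixAudioStreams is a hypothetical name of my own, and it assumes you pass in an AudioContext and an array of streams, any of which may be null or have no audio track:

```javascript
// Hypothetical helper: mixes the audio of every given stream into one stream.
// `context` is an AudioContext; `streams` is an array of MediaStream objects.
function mixAudioStreams(context, streams) {
    var destination = context.createMediaStreamDestination();
    streams.forEach(function (stream) {
        // Skip missing streams and streams without audio:
        // createMediaStreamSource() throws on a stream with no audio track.
        if (!stream || stream.getAudioTracks().length === 0) {
            return;
        }
        var source = context.createMediaStreamSource(stream);
        var gain = context.createGain();
        gain.gain.value = 1.0; // 1.0 leaves the volume unchanged
        source.connect(gain).connect(destination);
    });
    // The returned stream holds the single mixed audio track
    return destination.stream;
}
```

In the browser this would be called as mixAudioStreams(new AudioContext(), [screenStream, micStream]).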

Putting it all together

We now combine the audio track from the mixed audio stream we created earlier with the video track from the screen capture stream into a new MediaStream, and pass that MediaStream to the MediaRecorder to be recorded.
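
As a rough sketch of that last step (not the exact code from the demo), the snippet below builds the final stream and starts recording. The function names collectRecordingTracks and startRecording, and the onFinished callback, are my own and purely illustrative:

```javascript
// Hypothetical helper: the tracks that should end up in the recording,
// i.e. the screen capture's video track(s) plus the mixed audio track(s).
function collectRecordingTracks(screenStream, mixedAudioStream) {
    return screenStream.getVideoTracks().concat(mixedAudioStream.getAudioTracks());
}

// Hypothetical helper (browser only): records the combined stream and hands
// the finished recording to onFinished as a Blob once stop() is called.
function startRecording(screenStream, mixedAudioStream, onFinished) {
    var tracks = collectRecordingTracks(screenStream, mixedAudioStream);
    var recorder = new MediaRecorder(new MediaStream(tracks));
    var chunks = [];
    recorder.ondataavailable = function (event) {
        if (event.data && event.data.size > 0) {
            chunks.push(event.data);
        }
    };
    recorder.onstop = function () {
        onFinished(new Blob(chunks, { type: recorder.mimeType }));
    };
    recorder.start();
    return recorder;
}
```

You would call startRecording(screenStream, audioDestination.stream, callback) and later call stop() on the returned recorder to end the recording.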

I know that sounds rather complex, and that is why I've put together a CodePen that makes use of the code above and implements a fully working screen recording demo that captures system sounds and microphone audio. It is also embedded below.

Later edit: we've updated our getDisplayMedia() demo and posted the code in this GitHub repo.

See the Pen Screen Recording With System Sounds Demo by Remus Negrota (@remusnegrota) on CodePen.