
Record your Canvas with Web Components

Posted on: May 30, 2023 at 07:00 PM

Am I weird? I use the <canvas> tag so much in my work. Whether it's drawing 2D graphics or using full-blown 3D engines like Three.js or Babylon.

Whatever you use the canvas for, you’re escaping the normal document object model and rendering your pixels by hand (or triangles if you’re going 3D).

Using the canvas can be a ton of fun. You can just draw and draw and draw. Or it can be infuriating, especially when your shader code can’t compile. Don’t believe me? Check out my latest attempt trying to get ChatGPT to make shaders for me. In fact, I was having ChatGPT (both successfully and unsuccessfully) make several shaders I’d use for various video interviews.

But when you have something working, the results can be downright compelling! Whether you planned every pixel perfectly, or you created some happy accidents that animate gorgeously.

Obviously, if you want to share your work, you’ll be sharing a web page, right?

Well, not anymore! I'd like to introduce you to the recordable-canvas Web Component. To install, simply:

npm i recordable-canvas

I have plenty of documentation and a few code examples in the repo, so I won't go into how to use it here. Despite how simple a component it is, it might be pretty interesting, especially given that it uses the brand-new Web Codecs API to create the video.

Wrapping the Canvas

Before we get into video encoding itself, let’s talk quickly about how this component deals with the canvas. Really, the Web Component is just a wrapper. We put a normal canvas element inside a Web Component. It’s a shame though! I remember early on in the Web Component days we were hoping to extend any element in your browser. Want to add extra features to a Video element? Just say class MyComponent extends HTMLVideoElement.

Sadly that’s not the case. Today, we can only extend HTMLElement which acts like a normal <div>. But that’s OK, we’ll just wrap that <canvas> tag.

Fortunately, canvas doesn’t have that big of an API. Most canvas usage is creating a 2D or WebGL context and then using that context to draw into. Also, the only canvas-specific attributes are width and height. So it's all super easy to wrap in a thin layer that we can bake into a Web Component API.
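Just to sketch the idea (the element name and details here are hypothetical, not the component's actual source), the wrapper pattern is basically this:

// A rough sketch of wrapping a canvas in a Web Component
class WrappedCanvas extends HTMLElement {
    static get observedAttributes() { return ['width', 'height']; }

    private canvas = document.createElement('canvas');

    constructor() {
        super();
        this.attachShadow({ mode: 'open' }).appendChild(this.canvas);
    }

    attributeChangedCallback(name: string, _oldValue: string, newValue: string) {
        // Mirror the only two canvas-specific attributes onto the inner element
        this.canvas.setAttribute(name, newValue);
    }

    // Expose drawing the same way a real canvas does
    getContext(contextId: '2d' | 'webgl' | 'webgl2') {
        return this.canvas.getContext(contextId);
    }
}

customElements.define('wrapped-canvas', WrappedCanvas);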

Now, what about the recording part?

Encoding Video with the Web Codecs API

Even though I’ll be using the new and snazzy Web Codecs API to record the video, it’s been possible to record many things, including the canvas, natively before now (meaning without extra libraries).

The catch: no MP4!

I’m thinking of the MediaRecorder API. It’s actually really confusing. I had to triple-check that I’m telling the truth as I write this post. Refer to this handy article for recording your canvas. You can specify the MIME type of the MediaRecorder and the outgoing data blob. But the problem is that just because you specify the MIME type doesn’t mean the browser will support it! This is likely due to licensing and legal issues, since MPEG formats have traditionally been patent-encumbered. Either way, the open source WebM format is generally supported, so that’s typically what folks use to record.
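For reference, recording a canvas this way looks roughly like this (the canvas variable and the 30fps capture rate are my own choices):

// Record a canvas to WebM with MediaRecorder (no extra libraries)
const stream = canvas.captureStream(30); // capture at ~30fps
const recorder = new MediaRecorder(stream, { mimeType: 'video/webm;codecs=vp9' });
const blobs: Blob[] = [];

recorder.ondataavailable = (event) => blobs.push(event.data);
recorder.onstop = () => {
    const webm = new Blob(blobs, { type: 'video/webm' });
    // e.g. hand the blob to URL.createObjectURL() for download or playback
};

recorder.start();
// ...draw to the canvas for a while, then...
recorder.stop();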

Though it is rather inconvenient! WebM isn’t supported everywhere you’d want to use video, so prepare yourself to convert it to MP4 or similar if you want to drop it in your favorite editor.

There are solutions to transcode to MP4 live in your browser, such as with FFMPEG.js. MP4 is actually just the container format, however! The codec that you record with still needs to be supported by MP4. So prepare for FFMPEG.js to do lots of work if you go this route!

Luckily, we can record our canvas with MP4 friendly codecs using the VideoEncoder object provided by the new Web Codecs API.

Let’s start by instantiating and configuring our VideoEncoder:

const myencoder = new VideoEncoder({
    output: onEncoderOutput.bind(this),
    error: onEncoderError.bind(this)
});

await myencoder.configure(cfg);

First, we create our encoder. The first parameter lets us specify functions to handle encoded output and errors. It operates like this for a reason. You’ll be adding frames to the encoder, but when it has enough frames for an EncodedVideoChunk, you’ll see these come through in the output callback. It operates fast and off the main thread of your JavaScript. Really, it’s calling out to your OS-level system encoders. It’s why you may want to use the VideoEncoder.isConfigSupported(config) method to test whether your specific encoding configuration is supported first. Otherwise, you’ll hit the error handler you specified, and it’ll give you a SUPER vague error about how the encoder failed.
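Something like this up front avoids that vague failure:

const { supported } = await VideoEncoder.isConfigSupported(cfg);
if (!supported) {
    // bail out (or fall back to a different codec) before configure() ever fails
    throw new Error('This browser cannot encode with the requested configuration');
}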

With the VideoEncoder instantiated, we need to configure it. The configuration object specifies a few things like width, height and codec. Here’s a sample:

{
    codec: 'avc1.42001E',
    width: 640,
    height: 480,
    hardwareAcceleration: 'prefer-hardware',
    avc: { format: 'avc' }
}

This lets the encoder know that I want to encode my video with AVC Level 3. I also want to use a GPU if possible. AVC is a codec that can be stuck inside an MP4 container, so we’re good to go there once we reach the point of packaging up a file.

How do we know it’s Level 3? And what does that mean? Well, AVC iterates and provides new capabilities with each iteration. I chose Level 3 because in my project I wanted to capture 1920x1080 sizes, and I believe Level 3 is the minimum version that supports this. The codec string of 42001E specifies this. Don’t worry, I don’t know that it makes much sense either. I just end up using this handy table.
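For what it's worth, the part after "avc1." is just three hex bytes:

// 'avc1.PPCCLL' — PP = profile_idc, CC = constraint flags, LL = level_idc (all hex)
const [profile, constraints, level] = ['42', '00', '1E'].map((h) => parseInt(h, 16));
console.log(profile, constraints, level); // 66 (Baseline profile), 0, 30 (i.e. Level 3.0)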

Once the encoder is configured (and no, it’s not synchronous; you do have to wait for it), you can start encoding!

Encoding the Canvas

So with our encoder ready to go, can we just start throwing canvases at it? Not quite. First we need to create a VideoFrame to pass to it.

const frame = new VideoFrame(canvas, { duration: 0, timestamp: 0 });

THIS is where we toss our canvas in! We’re taking the current state of the canvas at the time and making a VideoFrame from it.

You’ll probably want to specify a duration and timestamp while you’re at it. This init object is optional, but for sanity’s sake I think it’s best practice to use it. From what I’ve seen, you can think of a frame’s duration and timestamp as metadata: they help write the file, determine framerate, determine file duration, and things like that. Honestly, duration doesn’t seem to affect anything that I’ve seen, but it can be handy if you load up the frame again and want the duration for your own reference.

One more piece of advice: I’ve seen timestamp numbers typically expressed in microseconds. But no, I haven’t played around to see how much this matters in the actual video file. It’s probably going to come down to how you play back the frames yourself, what the library does to encode the file, or how an external player deals with those timestamps. Either way, microseconds seem pretty standard. For clarity, this means 1 second is 1,000,000 microseconds!
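As a rough sketch, for a fixed frame rate you might stamp each frame like this (frameIndex, fps, and canvas are whatever your own render loop tracks):

const MICROS_PER_SECOND = 1_000_000;
const frameDuration = MICROS_PER_SECOND / fps;          // ~33,333 microseconds at 30fps
const frame = new VideoFrame(canvas, {
    timestamp: Math.round(frameIndex * frameDuration),  // when this frame occurs
    duration: Math.round(frameDuration)                 // how long it's shown
});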

Anyway, once you’re ready to pass that frame on to the encoder, just do the following: myencoder.encode(frame, { keyFrame: true/false });

Simple, right? Maybe not when you see that keyFrame property! This object parameter is optional as well, but the problem is that your video is not going to play very well unless you use it. If you don’t pass this parameter in, keyFrame defaults to false.

You can think of keyframes as the complete image of your video. But what happens if every frame is a keyframe? This makes your file pretty darn big! Instead, if you pass false (or nothing), you’re creating a “delta” frame. Delta frames only contain information about what changed, which can save a lot of file size.

Think of a person talking against a static background. If the background doesn’t change, why save all those pixels over and over again?

I don’t specifically know what the best distance between keyframes is, but when I was exploring Web Codecs, I noticed that the infamous open source movie “Big Buck Bunny” had 1 second spacing between every keyframe, so I’ve been defaulting to that. It likely really depends on how much motion your video contains and how concerned you are about compression.
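So a sketch of that one-second default looks something like this (frameIndex and fps are again my own loop's bookkeeping):

// Force a keyframe roughly once per second, delta frames the rest of the time
const keyFrame = frameIndex % Math.round(fps) === 0;
myencoder.encode(frame, { keyFrame });
frame.close(); // the encoder has what it needs; release the frame's memory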

Handling our Chunks

Once you pass your frame to the encoder, you’re basically just trusting that it’ll come back via the handler you specified when you set up your VideoEncoder.

Here’s my recordable-canvas callback, complete with TypeScript definitions for the parameters:

protected onEncoderOutput(chunk: EncodedVideoChunk, chunkMetadata: EncodedVideoChunkMetadata) {
    this.recording.push(chunk);
    if (chunkMetadata?.decoderConfig?.description) {
        this.description = chunkMetadata.decoderConfig.description;
    }
}

Notice we get an EncodedVideoChunk and an EncodedVideoChunkMetadata. For our purposes, the metadata isn’t so useful until the very end when we want to save the file. For this we need the “description” (which ends up being the AVC decoder configuration record), and honestly, to me this is an incomprehensible data buffer that just gets passed through to our MP4 library.

All that said, the metadata DOES contain information like width, height, colorspace info, etc., if you have a use for it! We don’t, because we’re just passing this info straight through, and we already know the width and height since we specified them ourselves when we set up our canvas.

Now, the EncodedVideoChunk can contain multiple video frames all compressed and bundled as one thing. Though, in practice, I’m not sure that I’ve seen more than one frame in any given chunk.

Either way, this is our “video stream”! I’m just pushing all of these chunks into an array. Fortunately, they do come back in the order you feed them to your encoder (phew!).

What’s cool is that these are encoded and compressed video frames, but they could be decoded by using the handy VideoDecoder from the Web Codecs API. In fact, they can be decoded in any order you want! Hello video editing!
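Just to sketch that out (using stand-in variables recording and description for the chunks array and the metadata description we saved in the callback above, and assuming we're inside an async function):

// Decode the recorded chunks back into VideoFrames
const decoder = new VideoDecoder({
    output: (frame: VideoFrame) => {
        // e.g. ctx.drawImage(frame, 0, 0) to paint it somewhere, then release it
        frame.close();
    },
    error: (e: DOMException) => console.error(e)
});

decoder.configure({
    codec: 'avc1.42001E',
    description: description // the decoderConfig description from the metadata
});

for (const chunk of recording) {
    decoder.decode(chunk);
}
await decoder.flush();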

Writing our MP4 File

Yeah, using the Web Codecs API is complicated. There’s lots of stuff to know about video to understand what you’re doing here. But low level APIs like this have a lot of power to do some pretty crazy things!

Here’s another video term you might not know: Muxing. Muxing basically means you’re combining video and audio tracks and putting them into a container file (such as MP4). Likewise demuxing is doing the reverse…start with a container and get separate video and audio tracks.

Unfortunately, the browser does not provide a way to mux/demux video. That said, there is some discussion around adding it, because the Web Codecs API is amazing but you can’t really do much with it unless muxing/demuxing is available to you.

Luckily there are loads of great libraries that can do this for us. One such library is MP4Box.js. MP4Box.js is a JS port of an incredibly comprehensive toolset of MP4 utilities.

As these things usually go though, it’s a CommonJS project. Sigh. I won’t get on my soapbox about how ES modules are no longer the future but the present, and how we should all ditch CommonJS. Instead, I’ll just tell you that I pre-bundled the library with Rollup in the recordable-canvas component, so it becomes yet another source file in our project that we can use as an ES module. We did this for Tensorflow.js on Web Components in Space before. This means that end users who want to use recordable-canvas can still work with the original source files as ES modules without having to worry about front-end tooling setups themselves.
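If you're curious, pre-bundling a CommonJS library looks roughly like this (the entry and output paths here are illustrative, not necessarily what the repo uses):

// rollup.config.js — wrap the CommonJS build as an ES module we can import directly
import commonjs from '@rollup/plugin-commonjs';
import resolve from '@rollup/plugin-node-resolve';

export default {
    input: 'node_modules/mp4box/dist/mp4box.all.js', // illustrative path
    output: { file: 'src/mp4box.js', format: 'es' },
    plugins: [resolve(), commonjs()]
};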

Onto saving! I did gloss over creating the file initially with MP4Box.js when we started recording. But that’s only because it’s so easy: file = MP4Box.createFile();

But when we stop recording, here’s the recordable-canvas code for these last steps:

public async stopRecording(saveas?: string) {
    const oneSecondInMillisecond = 1000;
    const timescale = 1000;
    let durationInMillisecond = 1000;
    const fps = this.frameCount / (this._duration / 1000);
    let frameTimeInMillisecond = 1000 / fps;
    let totalFrames = Math.floor(durationInMillisecond / frameTimeInMillisecond);

    this._isRecording = false;
    this.encoder?.flush().then(() => {
        this.encoder?.close();

        this.recording.forEach((chunk: EncodedVideoChunk) => {
            // copy the chunk's encoded bytes into a raw buffer for MP4Box
            let ab = new ArrayBuffer(chunk.byteLength);
            chunk.copyTo(ab);
            if (this.track === null) {
                // a timescale of 1000 * 1000 puts the track in microseconds,
                // matching the units of chunk.timestamp
                this.track = this.file.addTrack({
                    timescale: (oneSecondInMillisecond * timescale),
                    width: this.width,
                    height: this.height,
                    nb_samples: totalFrames,
                    avcDecoderConfigRecord: this.description });
            }
            this.file.addSample(this.track, ab, {
                duration: (frameTimeInMillisecond * timescale),
                dts: chunk.timestamp,
                cts: chunk.timestamp,
                is_sync: (chunk.type === 'key') // flag keyframes so players can seek
            });
        });
        if (saveas) {
            this.saveFile(saveas);
        }
    });
}

public saveFile(saveas: string) {
    if (this.file) {
        this.file.save(`${saveas}.mp4`);
        return true;
    } else {
        console.warn('Cannot save file because no file was created/recorded');
        return false;
    }
}

The first step is to flush our encoder. Maybe we’ve stopped encoding canvas frames, but that doesn’t mean there aren’t lingering frames in our encoder waiting for more to arrive so the set can be encoded into an EncodedVideoChunk. So we wait for a flush, and while we’re waiting, more chunks might be appended to our array as they come through via our encoder callback.

The next steps…well MP4Box has a bit of a complicated API. We’re adding samples for each EncodedVideoChunk in our array. We’re also making sure a video track is present to add those samples to. Basically our chunk is being converted into a format that’s compatible with MP4Box (keyframes, timestamps and all). It can be a little foreign to look at if you don’t know the underlying specifications of MP4 files (I certainly don’t).

After that, MP4Box has a nice little method to save the file. Just call file.save('myfilename.mp4') and you’re done!

Using recordable-canvas

I won’t get into much about using this component. I mentioned the documentation and demos in the repo, but the most basic usage is to set it up like this:

<recordable-canvas width="300" height="300"></recordable-canvas>
<script type="module">
    import 'recordable-canvas';
    const canvas = document.body.querySelector('recordable-canvas');
    canvas.addEventListener('ready', () => { ...
</script>

Once the component is “ready”, start doing whatever canvas operations you like with it! But to start/stop recording, simply call startRecording and stopRecording on the element, and call encodeFrame whenever you want to take a snapshot to encode the video frame.
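Roughly, a render loop that records might look like this. The drawing and the getContext call are my own assumptions (I'm assuming the component proxies getContext like the wrapped canvas does); check the repo docs for the exact API:

const canvas = document.body.querySelector('recordable-canvas');
canvas.addEventListener('ready', () => {
    const ctx = canvas.getContext('2d'); // assumes the component proxies getContext
    canvas.startRecording();

    let frameIndex = 0;
    const tick = () => {
        ctx.fillStyle = `hsl(${frameIndex % 360}, 80%, 60%)`;
        ctx.fillRect(0, 0, 300, 300);
        canvas.encodeFrame();                     // snapshot this frame into the encoder
        if (++frameIndex < 150) {
            requestAnimationFrame(tick);
        } else {
            canvas.stopRecording('my-recording'); // flushes and saves my-recording.mp4
        }
    };
    requestAnimationFrame(tick);
});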

I should call out that this was a small and not so comprehensive project on my part. Again, I did it because I wanted to record some WebGL shaders that I asked ChatGPT to make and use them for my video series at webcomponents.space. That was a whole ordeal you can see here on YouTube :)

When I was browsing NPM for similar projects, I did see one that looks wayyyyy better! It’s not a Web Component, but it’s similar in what it does. canvas-record seems to offer several different ways to encode your recordings, including a WASM one, which should be super speedy!

Whichever one you record with, my main goal here was to talk a little bit about Web Codecs and Web Components. Hope you found this useful!