Benchmarking Problems

What could go wrong when benchmarking so many codecs?

Containers

Containers are like boxes for packaging different audio, video, subtitle and metadata streams. They encapsulate the stream data and make it easy to seek within the streams and keep them synchronized.

In this benchmark I tried to use the container most commonly used with each codec, while avoiding the mkv container because it is not supported on the web.
(I used webm instead, which is a subset of mkv and is supported by major browsers.)

There are also encoders that don't support outputting to a container.
For example, xeve and vvenc are only able to output the raw codec bitstream.

The main problem with containers is that they add some overhead. A bare codec bitstream will always be smaller than the same bitstream packaged into a container with all of its metadata. The difference might not be big, but it is something to take into consideration.
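To put a number on that overhead, a minimal Python sketch could compare the size of a raw bitstream with the size of the same stream after muxing. The function name and the file sizes below are made up for illustration; they are not measurements from this benchmark.

```python
def container_overhead_percent(bitstream_bytes: int, container_bytes: int) -> float:
    """Return the container overhead as a percentage of the raw bitstream size."""
    if bitstream_bytes <= 0:
        raise ValueError("bitstream size must be positive")
    return (container_bytes - bitstream_bytes) / bitstream_bytes * 100.0


# Hypothetical sizes in bytes, chosen only to show the calculation.
raw_size = 10_000_000
muxed_size = 10_050_000
print(f"{container_overhead_percent(raw_size, muxed_size):.2f}% overhead")
```

For short clips at low bitrates the same fixed overhead makes up a larger share of the file, which is why it matters more at the low end of the quality range.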

Quality

The quality value passed to the encoder can mean different things depending on the rate control mode used. I always tried to give each encoder the best possible scenario to work with.

This means that ideally I would use the crf mode. It reduces the quality of scenes that the encoder assumes to be less demanding, which lowers the overall file size. With a constant quantizer mode like qp, the same quality value is applied to every frame. (an article about it)

Some encoders, however, either don't support crf mode or I was simply unable to get it working properly.
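As a rough illustration of the two modes, the sketch below builds an FFmpeg command line for libx264, which accepts both `-crf` and `-qp`. The helper name, the file names and the quality value are assumptions made for the example. Other encoders expose these modes through different options, so this is not the exact command used for every codec in the benchmark.

```python
def x264_command(src: str, dst: str, quality: int, mode: str = "crf") -> list[str]:
    """Build an ffmpeg argument list for libx264 in crf or qp rate control mode."""
    if mode not in ("crf", "qp"):
        raise ValueError("mode must be 'crf' or 'qp'")
    return [
        "ffmpeg",
        "-i", src,            # input clip
        "-c:v", "libx264",    # encoder under test
        f"-{mode}", str(quality),  # quality value, interpreted according to the mode
        dst,
    ]


# Hypothetical file names; the same quality value means different things per mode.
print(" ".join(x264_command("clip.y4m", "clip_crf.mp4", 23, mode="crf")))
print(" ".join(x264_command("clip.y4m", "clip_qp.mp4", 23, mode="qp")))
```

The resulting lists can be passed to `subprocess.run` if FFmpeg is installed.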

Why not bitrate?

As I said above, I'm currently running all of the encoders in either crf or qp rate control mode.

There are, however, some pretty common modes (cbr or abr) that allow targeting a certain bitrate.
It seems like that would be the easiest way to benchmark all of the codecs equally. However, I wanted the benchmark to be more like real-life usage.

Normally, when encoding clips on your own, using the same constant bitrate for different video clips is not a good idea. More demanding clips require more bitrate, while for others the additional bitrate would be a waste of bits. That's why it's common to use quality-based presets when encoding.
Looking at the benchmark, the user should also be able to see which quality values are best to use for encoding different video clips.

Of course, you can also see the resulting bitrate of each quality preset on the graphs.
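The resulting bitrate is simply the file size divided by the clip duration. A minimal sketch of that arithmetic, with a hypothetical function name and made-up numbers:

```python
def average_bitrate_kbps(size_bytes: int, duration_s: float) -> float:
    """Average bitrate in kilobits per second for a file of the given size and duration."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_s / 1000


# Hypothetical example: a 10 second clip that takes 1.25 MB on disk.
print(average_bitrate_kbps(1_250_000, 10.0), "kbps")
```

Note that if the file size includes a container, the container overhead is counted in this figure as well.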

Visual comparison not usable

This site allows you to compare video codecs at different quality presets using a modified version of vivict. It is a great tool that enables detailed subjective video comparison inside a browser.

However, not all codecs are supported by browsers.
I decided that the best way to overcome this and still provide authentic video quality would be to re-encode videos of unsupported codecs using lossless vp9 (a so-called lossless proxy).
It seems to be the best way to deliver lossless video on the web (lossless h264 does not play in Firefox, av1 uses a bit more cpu with no real benefit, and ffv1 is not supported by any browser).
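A lossless proxy of this kind can be produced with FFmpeg's libvpx-vp9 encoder and its `-lossless 1` option. The sketch below only builds the command line. The helper name and file names are hypothetical, and it assumes your FFmpeg build can decode the source codec, which may not hold for every codec in the benchmark.

```python
def lossless_vp9_proxy_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg argument list that re-encodes src to lossless vp9 in webm."""
    if not dst.endswith(".webm"):
        raise ValueError("the proxy is expected to be a .webm file")
    return [
        "ffmpeg",
        "-i", src,               # clip encoded with a codec browsers cannot play
        "-c:v", "libvpx-vp9",    # vp9 encoder
        "-lossless", "1",        # lossless mode, so no further quality loss is added
        dst,
    ]


# Hypothetical file names.
print(" ".join(lossless_vp9_proxy_command("clip_vvc.mp4", "clip_vvc_proxy.webm")))
```

Because the proxy is lossless, what you see in the browser should match the decoded output of the original codec, at the cost of a much larger file.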

Distributing such videos over the web can be quite challenging. Lossless 1080p video can have a bitrate ranging from 100 to 300 Mbps. If you don't have a very fast internet connection, I would suggest downloading the lossless proxy clip, either via torrent or directly. Then you can choose the downloaded file as the source for one side of the comparison, and the clip will be played back from your local disk. It might not be as convenient, but it's certainly an option.

You can also try to play the clip slowed down using the vivict player speed controls.
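As a rough guide to how much slowing down helps, the sketch below estimates the highest playback speed at which a connection can keep up with a clip. It looks only at network bandwidth and ignores buffering and decoding cost, so treat the result as an upper bound. The function name and the numbers are assumptions for illustration.

```python
def max_streaming_speed(connection_mbps: float, clip_mbps: float) -> float:
    """Highest playback speed (1.0 = real time) the connection can sustain."""
    if connection_mbps <= 0 or clip_mbps <= 0:
        raise ValueError("bitrates must be positive")
    # Playing at speed s consumes clip_mbps * s of bandwidth, capped at real time.
    return min(1.0, connection_mbps / clip_mbps)


# Hypothetical example: a 50 Mbps connection and a 200 Mbps lossless clip.
print(max_streaming_speed(50, 200))
```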

If you still struggle to play back the clip, you can re-encode it in high quality using FFmpeg so that your cpu/gpu can play it in real time.
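One possible way to do such a re-encode is h264 at a low crf value, since h264 decoding is widely hardware accelerated. The codec choice, the crf default and the file names in this sketch are my assumptions, not settings used by this site. The result is no longer lossless, so small differences from the original are possible.

```python
def playable_reencode_command(src: str, dst: str, crf: int = 14) -> list[str]:
    """Build an ffmpeg argument list for a high quality h264 re-encode of a proxy clip."""
    if not 0 <= crf <= 51:
        raise ValueError("crf for 8-bit libx264 is expected to be in the range 0..51")
    return [
        "ffmpeg",
        "-i", src,             # downloaded lossless proxy
        "-c:v", "libx264",     # widely supported, usually hardware decodable
        "-crf", str(crf),      # lower value means higher quality and a larger file
        dst,
    ]


# Hypothetical file names.
print(" ".join(playable_reencode_command("clip_proxy.webm", "clip_playable.mp4")))
```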