Video compression algorithms

Discussions about serious topics, for serious people
Bird on a Fire
Princess POW
Posts: 10137
Joined: Fri Oct 11, 2019 5:05 pm
Location: Portugal

Video compression algorithms

Post by Bird on a Fire » Wed Nov 25, 2020 8:06 pm

So, I've just been recording a presentation for an online conference.

Did the usual: start a zoom call by myself (#foreveralone), share screen, record. All fine to the last minute, where I flubbed a line so re-did the conclusions slide.

File size: 25 mb.

Opened it in OpenShot, excised 30 seconds of garbage and re-exported to the same format. File size: 484 mb. And it's worse quality.

WTAF? I had similar results last time I edited a talk together on a fancy Mac program. Is zoom just magic or something?
We have the right to a clean, healthy, sustainable environment.

yoss
Fleury White
Posts: 8
Joined: Sat Dec 14, 2019 9:54 pm

Re: Video compression algorithms

Post by yoss » Thu Nov 26, 2020 7:02 pm

Hi, I waited a day just in case someone more knowledgeable than me turned up. Failing that, my twopence worth is this:

Zoom uses an extremely efficient compression format (codec), resulting in smallish files, but the act of editing requires the file to be decompressed (probably the entire file, if it's relatively short) and then recompressed again.

I suspect your editing package uses a far less efficient codec, but this wouldn't be apparent to you because the file is packaged up in the same 'wrapper' (container), so it appears to be the same format. Recompressing the file would also introduce generation loss.

Again, I speculate that one of the ways Zoom gets its efficiency is by using far fewer keyframes than a traditional format, as Zoom calls are much less likely to have to deal with sudden changes of scene than most videos. They'll also be throwing away much more audio than most, and audio takes up a surprisingly large portion of the stream.

Millennie Al
After Pie
Posts: 1621
Joined: Mon Mar 16, 2020 4:02 am

Re: Video compression algorithms

Post by Millennie Al » Fri Nov 27, 2020 5:56 am

Zoom isn't magic, but it probably knows more about the stream than a generic package.

First, check what codec is used before and after your change. I think Zoom uses VP8, but you may have accidentally transcoded it into a less efficient codec. That could make a big difference. I believe Zoom uses Opus for audio, which is very efficient.
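If you have ffprobe (it ships with ffmpeg) to hand, it will tell you exactly what's in each file. A rough sketch in Python, with placeholder filenames, that runs it and pulls out the codec names:

```python
import json
import subprocess

def parse_streams(ffprobe_json):
    """Extract (codec_type, codec_name) pairs from ffprobe's JSON output."""
    info = json.loads(ffprobe_json)
    return [(s.get("codec_type"), s.get("codec_name")) for s in info.get("streams", [])]

def stream_codecs(path):
    """Run ffprobe on a media file and return its stream types and codecs."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json", "-show_streams", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_streams(out)

# Compare the Zoom original with the re-export, e.g.:
# stream_codecs("zoom_recording.mp4") vs stream_codecs("openshot_export.mp4")
```

If the video codec name differs between the two files (say, VP8 in one and MPEG-4 Part 2 in the other), that alone could explain a large size difference.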

Second, check how long it takes to save the modified file. Video coding is very CPU intensive, and Zoom is probably optimised for a talking head against a static background, so it will exploit the unchanged areas of the picture to save bits (this is just the standard way video codecs work).

Note that if you are in a Zoom call for an hour, it can spend an hour of CPU time encoding your picture, while if you save an hour-long video file from a video editor, it would be totally unacceptable for it to take an hour! That means that the original coding is potentially far superior to a re-encoding done later. The software may have some option to spend more time to get better compression.
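For example, if you re-encode manually with ffmpeg and the libx264 encoder, its -preset and -crf options control exactly this time-for-bits trade-off. A sketch of building such a command (filenames are placeholders; your editor may expose equivalent settings under different names):

```python
def reencode_command(src, dst, crf=20, preset="slow"):
    """Build an ffmpeg command that trades encoding time for compression.

    libx264's -preset controls how much CPU time the encoder spends
    hunting for savings (slower preset = smaller file at the same quality);
    -crf sets a constant quality target (lower = better quality, bigger file).
    """
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264", "-preset", preset, "-crf", str(crf),
        "-c:a", "copy",   # pass the audio stream through untouched
        dst,
    ]

# e.g. reencode_command("talk_edited.mp4", "talk_small.mp4", crf=23, preset="veryslow")
```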


Zoom probably needs quite a high rate of keyframes, but that depends on several factors. Decoding can only start at a keyframe, as other frames are encoded as differences with respect to some other frames. For offline media (e.g. a file that you can play at any time) a simple decoder can only start at a keyframe, so you put enough of them in to ensure that when you skip forwards or back the decode can start reasonably close to the target point. A more sophisticated decoder can start at any arbitrary point by looking backwards for the previous keyframe and decoding from there, discarding the result until it gets to the desired point. In principle, offline media can be better compressed than online (where online media is such that you need to decode the data as it arrives), because keyframes can be in the future as well as the past. I'm not sure that this is exploited very much, and it's the wrong direction for what you are seeing.

Another need for keyframes is to allow resuming decode after packet loss has broken the chain of dependencies. There are three ways to cope with packet loss, and I don't know which Zoom uses:
  • Wait for the next keyframe - requires frequent keyframes as the picture is bad or freezes until one arrives
  • Request a keyframe - costs a round-trip to the source, which must be able to generate a keyframe on demand
  • Use an underlying protocol which guarantees delivery through re-transmission
I believe Zoom used to use TCP, which is the third option, so in principle only needed a keyframe at the very start of the stream. However, I have read that it now uses WebRTC, so would use the second option. Obviously a file does not suffer packet loss, so doesn't need any of them, so in principle a recording could eliminate keyframes that were needed during a call - though that would require re-encoding the stream which is expensive, so very unlikely.
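If you want to check the keyframe spacing in the two files, ffprobe can dump a per-frame keyframe flag. A few lines of Python (a sketch, assuming the two-field CSV output shown in the docstring) will pull out the keyframe timestamps:

```python
def keyframe_times(csv_lines):
    """Return the timestamps (seconds) of keyframes, given lines of
    'key_frame,pts_time' as produced by:

        ffprobe -v error -select_streams v:0 \
                -show_entries frame=key_frame,pts_time -of csv=p=0 FILE

    A leading '1' marks a keyframe; '0' is a predicted frame.
    """
    times = []
    for line in csv_lines:
        parts = line.strip().split(",")
        if len(parts) >= 2 and parts[0] == "1":
            times.append(float(parts[1]))
    return times
```

Widely spaced keyframes in the Zoom file versus one every second or two in the re-export would go a long way towards explaining the size gap.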

I would be astonished if the audio content was bigger than the video, unless the picture is very small or blocky. Opus might take 80 kb/s for very good quality audio, while 80 kb/s would be a terrible picture in any codec.
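To put rough numbers on that, using the file sizes from the original post (the talk's duration wasn't given, so the 15 minutes below is purely an assumed figure):

```python
def bitrate_kbps(size_mb, duration_s):
    """Average total bitrate in kilobits per second, using decimal megabytes."""
    return size_mb * 8 * 1000 / duration_s

duration = 15 * 60                    # assumed 15-minute talk (not stated in the post)
zoom_rate = bitrate_kbps(25, duration)     # Zoom original, ~222 kb/s total
export_rate = bitrate_kbps(484, duration)  # OpenShot re-export, ~4300 kb/s total
audio_rate = 80                            # kb/s, a generous Opus allowance
```

On those assumptions the audio would be roughly a third of the Zoom file's bitrate but a rounding error in the re-export, so the blow-up is almost certainly on the video side.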

individualmember
Catbabel
Posts: 654
Joined: Mon Nov 11, 2019 4:26 pm

Re: Video compression algorithms

Post by individualmember » Fri Nov 27, 2020 8:47 am

In my world there are two kinds of video codecs, I-frame and GOP. I-frame encodes each frame individually in the video file, GOP codecs encode a Group Of Pictures very cleverly.
I-frame codecs are typically an order of magnitude bigger than GOP codecs. Ten times the size is typical, and it could easily be twice that.
We only edit in I-frame because, regardless of what your software is capable of, decoding and (re)encoding GOP footage will make your hardware grind to a halt quickly in the context of doing a couple of hundred edits in a day. (This isn't entirely true; some people absolutely refuse to edit any GOP codec, but in my experience it's usually OK when you are in a codec where you know that the maximum number of frames in any GOP is 12 or 16.)
Unpacking one kind of codec and encoding in another codec introduces errors which make the picture quality worse (we call it concatenation).
And different encoding software can give very different subjective quality to your pictures for the same data rate, even in the same apparent codec.

So add this to what those above have said. My guess is that the software you're using to trim the video file is processing the video codec badly. Does it have any settings/preferences that can be adjusted? Generally there's a trade-off between speed and quality, and in what I use you can (mostly) adjust it to shift that balance.
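One more thought: if all you need is to lop a chunk off the end, you may not need to re-encode at all. ffmpeg can copy the compressed streams unchanged (-c copy), which keeps the original quality and file size; the catch is that cuts snap to keyframes. A sketch, with placeholder filenames:

```python
def lossless_trim_command(src, dst, keep_seconds):
    """Build an ffmpeg command that keeps the first keep_seconds of a file
    without re-encoding. -c copy passes the compressed audio and video
    through as-is, so there is no generation loss and no size blow-up;
    the cut point will land on the nearest keyframe boundary."""
    return ["ffmpeg", "-i", src, "-t", str(keep_seconds), "-c", "copy", dst]

# e.g. drop the flubbed last 30 seconds of a 15-minute recording:
# lossless_trim_command("zoom_recording.mp4", "trimmed.mp4", 15 * 60 - 30)
```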
