Kai Ole Hartwig — Blog
16 min read
Criticality: High
By Kai Ole Hartwig

FFmpeg: an autonomous AI agent finds 21 zero-days (CVE-2026-39210–39218) — a 183-byte RTSP packet is enough for RCE

6 June 2026. Security vendor depthfirst has reported that its autonomous agent found 21 previously unknown zero-days in FFmpeg — in roughly 1.5M lines of C that, after two decades of fuzzing and audits, were considered heavily hardened; some already carry CVE numbers (CVE-2026-39210 to CVE-2026-39218), and individual bugs lay latent for 15 to 23 years. The sharpest finding is a heap overflow in the AV1-RTP depacketizer where a single 183-byte packet, during an ordinary ffmpeg -i rtsp://… call, takes over the instruction pointer — no auth, no special flags. FFmpeg ships not only in the system package but in container images, Python wheels, media pipelines and appliances: if you process untrusted RTSP/AV1, check today — and patch the embedded copies too.

TL;DR — 90 seconds

Affected?

Any environment that runs FFmpeg against attacker-influenced input — especially media ingest that pulls user-supplied stream URLs, transcoding services, CCTV/surveillance clients over RTSP, and the countless embedded FFmpeg copies in container images, Python wheels and appliances (not just the system package).

Risk?

Heap/stack overflows in parsers and demuxers (TS demuxer through VP9 decoder). At least one finding (AV1-RTP depacketizer) has been developed by depthfirst into an RCE primitive with a controlled instruction pointer — network-reachable, no auth, a 183-byte packet.

Immediate action?

Apply the upstream fix or your distribution's security update as soon as it is available; update the embedded FFmpeg copies in images, wheels and appliances as well; prioritise anything that ingests untrusted RTSP or AV1-over-RTP.

Recommendation?

Mid-market and enterprise: treat media processing as an untrusted parsing surface — sandbox it, restrict it with demuxer/protocol whitelists, search the SBOM for ffmpeg/libav*, and treat CVE-bearing dependency bumps as security work, not routine maintenance.

Criticality?

High (matches the criticality badge at the top of the post): documented RCE primitive with PoC, network-reachable; review within 48 hours.

What is the problem?

FFmpeg is one of the most widely deployed pieces of software in the world: from the browser to streaming infrastructure, it quietly processes media everywhere. That is exactly what makes it a target for zero-click attacks: it routinely parses complex, untrusted media. And it is large: roughly 1.5M lines of heavily optimised C for hundreds of formats, with more than two decades of fuzzing and manual audits behind it. Google's Big Sleep team had recently reported 13 flaws, then Anthropic's Mythos model pulled out more. The assumption was that the low-hanging fruit had been picked.

And yet depthfirst's autonomous security agent produced 21 confirmed zero-days in a single run — each with a reproducible PoC input, for roughly USD 1,000 in compute (per the report, about 10% of what Anthropic spent with Mythos). The difference from a coding agent is methodological: a security agent first threat-models the codebase, identifies exposed parsers and protocol handlers, follows data flows to the sink — and verifies by execution rather than warning theoretically. That is exactly what separates a real finding from “CVE slop”.

Some of the findings already carry CVE numbers: depthfirst lists the consecutive identifiers CVE-2026-39210 to CVE-2026-39218 (the source's prose says “eight” assigned CVEs while the enumeration shows nine consecutive IDs — a small inconsistency in the source that we flag here rather than resolve), the rest are fixed but not yet numbered (depthfirst tracks them as DFVULN-1xx). The age range is the genuinely uncomfortable part: a stack overflow in the SDT code (CVE-2026-39214) dates to the original 2003 implementation and sat undetected for 23 years; an MPEG-4-AAC-RTP finding (DFVULN-122) reaches back to 2005. Heavily audited does not mean bug-free — it only means the remaining bugs sit deeper than human reviews and classic fuzzing reach in reasonable time.

Who is affected?

| Affected | Not affected | Conditions / aggravating |
| --- | --- | --- |
| Services that run FFmpeg against attacker-influenced media/URLs (media ingest, transcoding, thumbnailing of user-supplied files) | Environments that use FFmpeg exclusively against fully trusted, own media | The vulnerable paths sit in demuxers/decoders/depacketizers; it is enough for attacker-controlled input to reach the sink |
| RTSP clients and anything pulling AV1-over-RTP (CCTV/surveillance, stream relays) | Builds without RTSP/RTP protocol support, or with the affected demuxers disabled | CVE-2026-39210 / DFVULN-127: reachable via ffmpeg -i rtsp://attacker/stream, a 183-byte packet, no auth, no interaction beyond opening |
| Embedded FFmpeg copies: container images, Python wheels (pyav, imageio-ffmpeg), Node bindings, desktop/mobile apps, appliances | Hosts whose only FFmpeg instance is the freshly patched system package | apt upgrade does not patch the embedded copies; every bundled binary needs its own patch/rebuild |
| Pipelines that allow older format/protocol paths | Pipelines with a tight demuxer/protocol whitelist limited to exactly the needed format | Bugs span the TS demuxer, swscale, yuv4mpeg, VP9, DASH, RTP (AV1/JPEG/LATM/MPEG-4), RTMP, RTSP server, AVI/CAF/AVIF |

Important for severity assessment: depthfirst developed one of the 21 findings into a working RCE primitive and published the PoC. The others are documented and fixed as heap/stack overflows or integer over-/underflows; whether they lead to RCE has not been worked out for every one of them. The operating assumption still holds: memory corruption in a network-reachable parser must be taken seriously, not played down.

Impact

The core is the reach of the AV1-RTP finding (libavformat/rtpdec_av1.c). AV1 organises its bitstream into OBUs (Open Bitstream Units); the RTP payload format splits them across packets and the depacketizer reassembles them. One special OBU type, the Temporal Delimiter (TD), is per the spec to be “ignored and removed”. That is exactly where it breaks: when skipping a TD, the code advances the write cursor pktpos by the attacker-declared obu_size without allocating matching memory, and it does not advance the input pointer buf_ptr. Result: the write cursor points past the allocation, and because the input pointer stays put, the TD bytes are re-interpreted as a fresh OBU on the next iteration, with offset and content fully controlled.

This is a heap overflow with a controlled offset and controlled content, one of the strongest primitives memory corruption offers. depthfirst aims it at the free function pointer in the AVBuffer struct that FFmpeg's own allocator places right after the data buffer: with obu_size = 148 the writes land at pkt->data[148], the free pointer sits at offset 152, and the refcount at bytes 144–147 stays untouched at 1. A third embedded, fabricated OBU forces a reallocation; the old (corrupted) buffer is released, and releasing it calls the overwritten free pointer. On a release build a single 183-byte RTP packet is enough to set the instruction pointer to the attacker's value.
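The offset arithmetic can be replayed in a toy model. The sketch below is not FFmpeg code: it only mirrors the offsets quoted above (refcount at 144–147, free pointer at 152) inside a flat byte array. The 128-byte data capacity, the pointer values and the function names are assumptions made for illustration. It covers the write-cursor half of the bug only, not the re-read of the input bytes.

```python
import struct

# Offsets as quoted in the write-up; everything else here is hypothetical.
REFCOUNT_OFF = 144    # refcount occupies bytes 144..147
FREE_PTR_OFF = 152    # free function pointer occupies bytes 152..159
DATA_CAPACITY = 128   # assumed size of the data region (illustration only)


def make_heap():
    """Flat model: data region followed by the bookkeeping fields."""
    heap = bytearray(160)
    struct.pack_into("<I", heap, REFCOUNT_OFF, 1)
    # Stand-in value for the legitimate free callback.
    struct.pack_into("<Q", heap, FREE_PTR_OFF, 0x1111111111111111)
    return heap


def depacketize(heap, obus, enforce_invariant):
    """obus is a list of ("td", declared_size) or ("data", payload_bytes)."""
    pktpos = 0
    for kind, value in obus:
        if kind == "td":
            # Behaviour as described: the cursor moves by the declared size,
            # although nothing was written and nothing was allocated.
            pktpos += value
            continue
        if enforce_invariant and pktpos + len(value) > DATA_CAPACITY:
            raise ValueError("write cursor no longer matches the allocation")
        heap[pktpos:pktpos + len(value)] = value
        pktpos += len(value)
    return pktpos


# 4 filler bytes cover 148..151, the next 8 bytes land on the free pointer.
attack = [("td", 148), ("data", b"\x00" * 4 + struct.pack("<Q", 0x4141414141414141))]
heap = make_heap()
depacketize(heap, attack, enforce_invariant=False)
print(hex(struct.unpack_from("<Q", heap, FREE_PTR_OFF)[0]))  # attacker value
print(struct.unpack_from("<I", heap, REFCOUNT_OFF)[0])       # refcount still 1
```

With `enforce_invariant=True` the same input is rejected before the write. That is the generic "cursor must stay inside the allocation" check, not a description of the upstream patch.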

This is reachable in the ordinary RTSP PLAY flow: media ingest with user-supplied stream URLs, CCTV/surveillance, transcoding of remote AV1-over-RTP sources. No auth, no user interaction beyond opening the stream, no unusual flags. The source gives no CVSS number for this finding; the operational severity (“remote, unauthenticated, controlled instruction pointer”) justifies the high rating regardless of the score. The other findings broaden the attack surface across practically all common demuxers/protocols.

Mitigation / immediate steps

Note: the following steps are my operational recommendation based on the findings documented by depthfirst and general FFmpeg hardening practice — not a vendor-certified guide. The authoritative patch list/versions are to be checked upstream or with your distribution.

Operational Decision Block

Step 1 — Patch, including embedded copies

 

# update the system package as soon as the distribution ships the fix
# Debian/Ubuntu; the patterns are quoted so the shell does not expand them against local files
apt-get update && apt-get install --only-upgrade ffmpeg 'libavformat*' 'libavcodec*'
# Wolfi/Alpine/container:
apk upgrade ffmpeg

# IMPORTANT: find the embedded copies apt/apk does NOT reach
pip show av imageio-ffmpeg 2>/dev/null
find / -name 'ffmpeg' -type f 2>/dev/null
find / -name 'libavcodec*' -o -name 'libavformat*' 2>/dev/null
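
An exact-name `find` can miss bundled binaries that ship under a different file name. The following is a minimal sketch of a broader inventory walk. The name patterns, including the versioned `ffmpeg-*` form, are assumptions for illustration and should be extended for your own stack.

```python
import fnmatch
import os

# Hypothetical pattern list; adjust to what your images and wheels contain.
ARTIFACT_PATTERNS = (
    "ffmpeg", "ffmpeg.exe", "ffmpeg-*",   # ffmpeg-* for versioned bundled binaries
    "ffprobe", "ffprobe.exe",
    "libavcodec*", "libavformat*", "libavutil*",
)


def is_ffmpeg_artifact(filename):
    """True if a bare file name looks like an FFmpeg binary or libav* library."""
    return any(fnmatch.fnmatchcase(filename, p) for p in ARTIFACT_PATTERNS)


def find_ffmpeg_artifacts(root):
    """Walk root (host path, unpacked image, virtualenv) and list candidate files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        hits.extend(
            os.path.join(dirpath, name)
            for name in filenames
            if is_ffmpeg_artifact(name)
        )
    return sorted(hits)
```

Calling `find_ffmpeg_artifacts("/usr/lib/python3/dist-packages")` or pointing it at an unpacked image layer yields the paths that need their own patch or rebuild. It is a name-based heuristic, so statically linked copies inside other binaries still require an SBOM scan.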

 

Step 2 — Shrink the attack surface (until the fix is everywhere)

 

# allow only the protocols/demuxers you need (example: hard-restrict RTSP/RTP)
ffmpeg -protocol_whitelist file,crypto,data -i input.mp4 ...   # no rtsp/rtp/http

# where AV1-over-RTP is not needed: block RTSP/RTP ingest at the edge
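
Where FFmpeg is launched from application code, the whitelist can be fixed in one place instead of being repeated per call. A minimal sketch, assuming a pure file-to-file transcode; the function name and defaults are hypothetical. It uses only documented FFmpeg options: `-protocol_whitelist`, `-nostdin`, and the `file:` URL prefix.

```python
# Protocols a pure file transcode needs; same list as in the command above.
FILE_ONLY_PROTOCOLS = ("file", "crypto", "data")


def build_transcode_argv(src, dst, protocols=FILE_ONLY_PROTOCOLS):
    """Build an ffmpeg argument list that only permits the given protocols.

    The explicit "file:" prefix forces both paths to be treated as local
    files, so a value like "rtsp://host/x" or one starting with "-" is not
    interpreted as a network URL or as an option.
    """
    return [
        "ffmpeg",
        "-nostdin",                                  # never read from stdin
        "-protocol_whitelist", ",".join(protocols),  # input option, must precede -i
        "-i", "file:" + src,
        "file:" + dst,
    ]


print(build_transcode_argv("upload.mp4", "out.mkv"))
```

Passing the list to `subprocess.run` without `shell=True` also keeps user-supplied names away from shell parsing. The whitelist still matters with the prefix in place, because playlists and similar container formats can reference further URLs from inside the file.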

 

Step 3 — Isolate untrusted processing

 

Treat media parsing as untrusted code:
- FFmpeg worker in its own, minimally privileged container/namespace
- seccomp/AppArmor profile, no network access for pure file transcodes
- resource limits (memory/time) against DoS findings (e.g. AVI LIST underflow)
- for RTSP ingest: only allowed, validated source hosts; no free-form user-supplied URLs
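
The resource-limit bullet can be approximated at process level even before a container sandbox is in place. A minimal sketch for Linux/Unix, using Python's `resource` module; the function names and the default limits are assumptions, and it does not replace seccomp/AppArmor or network isolation.

```python
import resource
import subprocess


def clamp_to_hard_limit(requested, hard):
    """Soft limit to apply: never above an existing hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def run_limited(argv, mem_bytes=2 * 1024**3, cpu_seconds=60, wall_seconds=120):
    """Run a worker command with address-space, CPU-time and wall-clock limits."""

    def apply_limits():
        # Executed in the child between fork() and exec(): only the worker
        # is restricted, not the calling service.
        for limit, value in (
            (resource.RLIMIT_AS, mem_bytes),
            (resource.RLIMIT_CPU, cpu_seconds),
        ):
            _soft, hard = resource.getrlimit(limit)
            resource.setrlimit(limit, (clamp_to_hard_limit(value, hard), hard))

    return subprocess.run(
        argv,
        preexec_fn=apply_limits,
        timeout=wall_seconds,        # raises subprocess.TimeoutExpired on overrun
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        check=False,
    )
```

A call such as `run_limited(["ffmpeg", "-nostdin", "-i", "file:in.mp4", "file:out.mkv"])` then fails fast on inputs that try to exhaust memory or CPU time, instead of taking the host down with it.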

Detection / verification

Check points derived directly from the documented findings and common FFmpeg inventory practice.

FFmpeg inventory & version (incl. embedded)

 

# collect the versions of all reachable FFmpeg instances
ffmpeg -version | head -n1
# search the SBOM/image for ffmpeg/libav (example with syft)
syft dir:/ -o table 2>/dev/null | grep -iE 'ffmpeg|libav(codec|format|util)'
# scan container images, not just the host
grype <image-ref> 2>/dev/null | grep -iE 'ffmpeg|libav'

 

Find exposed protocol paths

 

# calls that pass untrusted network protocols to FFmpeg (RTSP/RTP/HTTP)
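# single quotes keep \$ literal; in double quotes the shell reduces it to a bare $ (an end-of-line anchor)
grep -RInE 'rtsp://|rtp://|-i\s+\$\{?[A-Za-z_].*url' --include=*.{sh,py,js,ts,go,php} .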

# missing a protocol whitelist? (no -protocol_whitelist set)
grep -RInE "ffmpeg" --include=*.{sh,py,js,ts,go,php} . | grep -v "protocol_whitelist"
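
The line-based grep misreports commands that are split across lines with a trailing backslash, because the whitelist may sit on a different line than the word ffmpeg. The sketch below joins continuation lines before checking. It is still a heuristic, and the function name and regular expression are assumptions.

```python
import re

# "ffmpeg" followed somewhere by a standalone -i option (also matches "-i" in argv lists).
FFMPEG_INPUT_CALL = re.compile(r"\bffmpeg\b.*(?<![\w-])-i\b")


def find_unrestricted_calls(text):
    """Return the starting line numbers of ffmpeg -i calls without -protocol_whitelist."""
    findings = []
    logical, start = "", None
    for lineno, line in enumerate(text.splitlines(), start=1):
        if start is None:
            start = lineno
        stripped = line.rstrip()
        if stripped.endswith("\\"):
            # Shell-style continuation: keep collecting the same command.
            logical += stripped[:-1] + " "
            continue
        logical += stripped
        if FFMPEG_INPUT_CALL.search(logical) and "protocol_whitelist" not in logical:
            findings.append(start)
        logical, start = "", None
    return findings


sample = (
    "ffmpeg -i in.mp4 out.mkv\n"
    "ffmpeg -protocol_whitelist file,crypto,data \\\n"
    "  -i in.mp4 out.mkv\n"
    "echo done\n"
)
print(find_unrestricted_calls(sample))  # [1]
```

Applied per file (`find_unrestricted_calls(path.read_text())`), this narrows the review list to calls that need a closer look; wrappers that add the whitelist elsewhere will still show up and have to be checked by hand.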

 

Runtime indicators

 

- crashes/restarts of FFmpeg workers on RTSP/AV1 ingest (possible overflow triggers)
- segfaults in libavformat/libavcodec in dmesg/coredumpctl
- unusual child processes of an FFmpeg worker (hint of post-RCE activity)
- memory/time spikes on manipulated containers (DoS findings like AVI LIST/CAF)
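
The segfault indicator can be pulled out of kernel log output mechanically. A minimal sketch, assuming the usual x86 Linux message shape ("name[pid]: segfault at … in library.so…"); the regular expression and the sample line are illustrative, not taken from a real incident.

```python
import re

# Assumed kernel log shape for user-space segfaults on x86 Linux.
SEGFAULT = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at \S+ ip \S+ sp \S+ "
    r"error \d+ in (?P<lib>libav\w+)\.so"
)


def libav_segfaults(log_lines):
    """Return (process name, pid, library) for segfaults inside libav* libraries."""
    hits = []
    for line in log_lines:
        match = SEGFAULT.search(line)
        if match:
            hits.append((match.group("comm"), int(match.group("pid")), match.group("lib")))
    return hits


# Illustrative sample lines, not real log data.
sample_log = [
    "[12345.678901] ffmpeg[4242]: segfault at 7f00deadbeef ip 00007f3a1c2b4d10 "
    "sp 00007ffd5e1a2b30 error 4 in libavformat.so.60.16.100[7f3a1c200000+1c0000]",
    "[12346.000001] nginx[17]: segfault at 0 ip 00007f0000001000 "
    "sp 00007ffd00000000 error 4 in libc.so.6[7f0000000000+195000]",
]
print(libav_segfaults(sample_log))
```

Fed with the output of `dmesg` or the journal, repeated hits for the same worker on RTSP/AV1 ingest are a reason to pull the offending input and the core dump, not just to restart the worker.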

Operator guidance

Mid-market

Inventory first, then patch, and not just the system package. Most mid-market stacks have FFmpeg in at least three places: as a distribution package, as a Python wheel in a background worker, and bundled in some app or appliance. apt upgrade only catches the first. Where a service thumbnails file uploads or transcodes media, treat it as untrusted parsing: isolate the worker, limit network and resources, and set a protocol whitelist until the fix is everywhere. If you run no RTSP/AV1 ingest, you have some breathing room on the sharpest finding, but the demuxer findings also affect perfectly ordinary file formats.

Enterprise

Additionally: SBOM-driven search for ffmpeg/libav* across all images and artifacts, not just hosts. Treat CVE-bearing dependency bumps as security work with their own SLA, not routine maintenance — the source puts it well: finding bugs has become cheap, triaging, fixing and rolling out has not. For network-reachable media-ingest paths, dedicated detection pays off (worker-crash telemetry, child-process anomalies) along with a seccomp/AppArmor sandbox so a successful overflow does not turn directly into lateral movement.

Kubernetes / containers

Rebuild images as soon as the fix is in the base — and check whether FFmpeg arrives via the base image or via an app layer (wheel/Node binding/static binary); often both. FFmpeg transcoding pods belong in their own minimally privileged namespace with seccompProfile, without outbound network access for pure file jobs and with hard memory/CPU limits against the DoS findings. Where RTSP ingest runs, restrict the allowed source hosts via NetworkPolicy.
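
NetworkPolicy restricts sources at the network layer; the same allowlist can also be enforced in the application before a URL ever reaches FFmpeg. A minimal sketch using the standard library's URL parser. The host names and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist of camera/relay hosts.
ALLOWED_RTSP_HOSTS = frozenset({"cam01.internal.example", "cam02.internal.example"})


def is_allowed_stream_url(url, allowed_hosts=ALLOWED_RTSP_HOSTS):
    """Accept only rtsp/rtsps URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname   # lower-cased, userinfo and port stripped
        _port = parts.port      # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme not in ("rtsp", "rtsps"):
        return False
    return host is not None and host in allowed_hosts


print(is_allowed_stream_url("rtsp://cam01.internal.example:554/stream"))
print(is_allowed_stream_url("rtsp://cam01.internal.example@attacker.example/stream"))
```

The second call shows why the parsed host is compared rather than a substring of the URL: the allowed name appears only in the userinfo part, the real target is a different host. The check does not cover DNS tricks, so it complements the NetworkPolicy and does not replace it.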

Declarative stacks (NixOS / Talos / Flatcar)

The advantage here is provability: which FFmpeg version with which enabled demuxers/protocols went into which build is documented — and the rebuild onto the patched version is reproducible rather than manual. Anyone who builds FFmpeg with a minimised feature matrix anyway (only the needed demuxers/protocols) already has a structurally smaller attack surface. That is the lever here: a build without RTSP/RTP support is simply not reachable for the sharpest finding.

What I actually did

I treat FFmpeg in managed stacks as what it is: a network-/file-reachable parsing surface for untrusted input. On this occasion that meant: pulled an FFmpeg inventory across all images, wheels and appliances (not just system packages), checked call paths for untrusted RTSP/RTP/HTTP, added missing -protocol_whitelist, reviewed transcoding workers for sandbox/seccomp and resource limits, and scheduled the distribution/upstream fixes as a prioritised bump as soon as they ship. Where no AV1/RTSP ingest is needed, the corresponding protocol path is off anyway.

The honest lesson sits one level higher and connects to my ongoing AI-security line. In my IronWorm post, “Claude” was the attacker's disguise (forged commit authors); here autonomous agents deliver on the defender's side — Big Sleep, Anthropic's Mythos / Project Glasswing, and now depthfirst — reproducible findings in exactly the code two decades of fuzzing did not exhaust. The same technology, both sides of the wall. For operators the bottleneck shifts visibly: finding has dropped to ~USD 1,000 per run, the backlog now sits in deploying. Anyone still running patch cycles, auto-update and CVE bumps as “routine maintenance” falls behind a discovery pace that is no longer humanly timed.

A closer look: why an “ignored” packet becomes RCE

The AV1-RTP finding is instructive because the bug does not sit in “malicious” code but in a harmless-sounding spec instruction: “ignore and remove the Temporal Delimiter.” The implementation does this half-heartedly — it advances the write cursor by the declared size (as if it had written something) but allocates nothing and does not consume the input bytes. From this single continue two defects fall out: a poisoned write cursor and a non-advanced read pointer that reads the same bytes again as a length field. The elegance of the PoC lies in the arithmetic: the values are chosen so that the overflow hits exactly the free function pointer of the adjacent AVBuffer struct, while the refcount just below stays untouched at 1 — otherwise the free would never fire. This is no lucky hit but engineering precisely tuned to FFmpeg's own allocation layout. Lesson: for untrusted parsers it is not only “what gets written” that matters, but also “does the cursor still match the allocation” — an invariant a human review across 1.5M lines easily misses and a targeted agent checks deliberately.

Frequently asked questions about the FFmpeg zero-days

Is apt upgrade / apk upgrade enough to close the FFmpeg zero-days?

Only for the system package. FFmpeg ships in countless embedded copies — container images, Python wheels (av, imageio-ffmpeg), Node bindings, statically linked app binaries and appliances. The package manager does not reach these; they need their own patch or rebuild. Practical check: find / -name 'libavcodec*' -o -name 'ffmpeg' plus an SBOM scan of the images.

Are my FFmpeg containers network-attackable via CVE-2026-39210 / the AV1-RTP bug?

Only if a path passes untrusted RTSP or AV1-over-RTP to FFmpeg — the documented RCE PoC runs via ffmpeg -i rtsp://attacker/stream, a single 183-byte packet, no auth. Pure file transcodes without a network protocol are not reachable here. Until the fix lands: set -protocol_whitelist and let no attacker-controlled stream URLs reach the worker.

How many of the 21 findings really lead to remote code execution?

depthfirst developed one finding (the AV1-RTP depacketizer) into a working RCE primitive with a controlled instruction pointer and published the PoC. The others are documented and fixed as heap/stack overflows or integer over-/underflows; whether they lead to RCE has not been worked out for every one of them. Operationally: take memory corruption in a network-reachable parser seriously and do not play it down.

Are there CVE numbers and a patch for the FFmpeg findings?

Some of the findings already carry numbers: depthfirst lists the consecutive identifiers CVE-2026-39210 to CVE-2026-39218 (the prose says “eight” assigned CVEs while the enumeration shows nine consecutive IDs — a small inconsistency in the source). The rest are fixed but not yet numbered (depthfirst-internal IDs DFVULN-1xx). The authoritative versions come via FFmpeg upstream or your distribution — reconcile the patch list there, don't derive a fixed version number from this post.

Why does an AI find bugs that 20 years of fuzzing missed?

A security agent works differently from fuzzing: it threat-models the codebase, follows data flows from attacker-controlled input to the sink, and verifies via an executed PoC instead of blindly mutating inputs. This reaches paths fuzzing rarely hits (e.g. rare protocol states). Several agents (Big Sleep, Anthropic Mythos / Project Glasswing, depthfirst) delivered in FFmpeg independently — the findings are reproducible, not “CVE slop”.

Do Kubernetes transcoding pods need rebuilding because of the FFmpeg zero-days?

Yes, as soon as the fix is in the base — and check whether FFmpeg arrives via the base image or an app layer (wheel/binding/static binary), often both. Additionally put transcoding pods in their own minimally privileged namespace with seccompProfile, without outbound network access for file jobs, with hard memory/CPU limits against the DoS findings, and restrict RTSP sources via NetworkPolicy.

Conclusion

In the bug class the FFmpeg findings are nothing new — heap/stack overflows in demuxers are familiar to anyone parsing untrusted media. Three things are new: that a third autonomous agent delivers independently in heavily audited code, that the cost has dropped to ~USD 1,000, and that at least one finding is a clean, network-reachable RCE primitive. For operators the message is unspectacular and precisely for that reason important: FFmpeg sits not only in the system package but in images, wheels and appliances — patch the embedded copies too, treat media parsing as an untrusted surface, and prioritise anything with untrusted RTSP/AV1. The authoritative version list belongs to upstream/distribution, not a snapshot. Don't dramatise — but inventory today, and take the patch backlog more seriously than the finding, because finding has just become cheap.

About the author

Kai Ole Hartwig

Freelance DevSecOps consultant · OnlyOle Consulting

Programming since 2002 – self-taught, set up my own business with KO-Web in 2012. Over 100 projects, with a focus on security, performance, automation and quality. Today freelance: DevSecOps consulting, training and software development.