7 May 2026 · 11 min read

Kai Ole Hartwig

When the AI agent becomes a privileged insider: three current CVEs in LiteLLM, Flowise and MS-Agent

Three different agent frameworks, three different attack paths, one shared pattern. In spring 2026, CVE-2026-42208, CVE-2026-41264 and CVE-2026-2256 show that the security architecture around production AI agents lags years behind that of classical applications.

[Image: A small wooden marionette without strings on concrete, surrounded by brass keys, wax-sealed letters and small vials; a thin oxblood-coloured thread of ink runs from one letter to the marionette's hand.]

TL;DR — the 90-second summary

CVE-2026-42208 (CVSS 9.3): SQL injection in the LiteLLM gateway via API-key verification, in a system that holds privileged access to every provider token
CVE-2026-41264: Indirect prompt injection in the Flowise CSV agent — a maliciously formatted CSV leads to RCE
CVE-2026-2256: ModelScope MS-Agent executes arbitrary OS commands — triggered by consumed content, not by direct user input
Lethal trifecta: Tool access + external content consumption + privileged execution. An agent that combines all three without a policy layer is exploitable
Architecture bar: Four principles: policy engine between model and tool, minimal permissions with short-lived tokens, structural pre-processing of passive sources, audit log with provenance tags
Who is on the hook: any Mittelstand company running production agents; tool inventory and token-lifetime discipline are not optional

What is the problem?

Anyone who has seriously begun integrating AI agents into business processes in recent months knows the uneasy feeling: in the security model, an LLM that autonomously calls tools, reads data sources and makes decisions sits much closer to a highly privileged service account than to a chatbot. Three CVEs from spring 2026 make this painfully concrete.

CVE-2026-42208 — SQL injection in LiteLLM

CVE-2026-42208 (CVSS 9.3) hits LiteLLM, the popular open-source gateway that routes requests across multiple LLM providers. The flaw sits in API-key verification: a user-supplied key string was concatenated directly into an SQL query instead of being parametrised. It is a classic SQL injection, in a system that itself holds privileged access to every configured provider key.
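The difference between concatenation and parametrisation can be shown in a minimal sketch. This is not LiteLLM's code: the table api_keys, its columns and both functions are hypothetical, and an in-memory SQLite database stands in for whatever store a gateway actually uses.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical key table for illustration; not LiteLLM's actual schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE api_keys (key TEXT PRIMARY KEY, owner TEXT)")
    conn.execute("INSERT INTO api_keys VALUES ('sk-valid', 'team-a')")
    return conn


def verify_key_unsafe(conn: sqlite3.Connection, key: str) -> bool:
    # Vulnerable pattern: the caller-supplied key becomes part of the SQL text.
    query = f"SELECT owner FROM api_keys WHERE key = '{key}'"
    return conn.execute(query).fetchone() is not None


def verify_key_safe(conn: sqlite3.Connection, key: str) -> bool:
    # Parametrised pattern: the key is bound as a value and never parsed as SQL.
    query = "SELECT owner FROM api_keys WHERE key = ?"
    return conn.execute(query, (key,)).fetchone() is not None


if __name__ == "__main__":
    conn = make_db()
    payload = "' OR '1'='1"
    print(verify_key_unsafe(conn, payload))  # True: the payload rewrote the WHERE clause
    print(verify_key_safe(conn, payload))    # False: the payload is just an unknown key
```

The only change between the two functions is where the key travels: inside the query text, or next to it as a bound parameter.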

CVE-2026-41264 — indirect prompt injection in Flowise

CVE-2026-41264 affects the CSV agent in FlowiseAI. A maliciously formatted CSV file triggers an indirect prompt injection that ends in remote code execution. The treacherous part: the malicious payload does not come from direct user input but from a supposedly passive document the agent processes.

CVE-2026-2256 — shell-game in MS-Agent

CVE-2026-2256 in ModelScope MS-Agent goes one step further. An attacker can manipulate the agent so that it executes arbitrary operating-system commands — triggered not by direct user access, but by content the agent processes in regular operation. A constellation security researchers now call the lethal trifecta: an agent that has tool access, consumes external content and runs privileged.

I build AI agents for Mittelstand companies that would otherwise fall behind within the next 24 months, and I carry one assumption through every project setup workshop: every agent response is potentially influenced from the outside. The architecture principles I describe in this post follow from that.

Impact: why these three CVEs belong together structurally

Viewed individually, the flaws look like typical bugs in young frameworks. Taken together they paint a clear picture of where agent architectures are systematically brittle today.

The trust boundary has shifted

In classical web applications it is clear: input from outside is untrusted. For agents the same applies to every document, every email, every CSV, every Slack thread the agent loads as context. Content becomes instruction the moment it flows into the model. The OWASP Top 10 for LLM Applications has accordingly listed prompt injection as LLM01 since its first edition in 2023.

Tool access amplifies the leverage

A successful prompt injection no longer just manipulates the response; it activates tools that touch databases, APIs, the filesystem or external services. What appeared at OpenClaw as a Slack integration becomes OS command execution in MS-Agent.

Identity is blurred

If a gateway like LiteLLM is compromised, not only your own data is affected but also every downstream provider key. A compromised agent component can steal tokens that can in turn be used to trigger further actions in other systems.

Who is affected?

Three points I probe in every pre-audit for agent stacks.

Privileges accumulate unnoticed

An agent typically starts small: one data source, one tool. Every sprint adds tools, data sources and permissions. Six months later, a friendly helper has become an over-privileged service whose permissions inventory nobody can keep in their head. This is exactly where the three CVEs hit with full force.

Indirect inputs are rarely checked in audit

CSV, PDF, email attachment, web page, configuration file — everything the agent loads as context is attack surface. Without structured pre-processing and sandboxing this surface stays invisible until an incident exposes it.

Identities are not segmented

If the agent uses the same long-lived token that other services also use, a successful exploit is no longer a local problem. It can reach across the entire customer landscape. A single compromised PAT in a CI runner spreads, as documented in my npm EVM cluster post, across every workstation that has read that token.

Mitigation: four architecture principles

I carry four architecture principles through every agent project. They are deliberately unspectacular, and they work in combination.

1. Separation of model and tool access via a policy engine

An agent that calls tools does not do so in the same process where it processes the prompt. Between model and tool sits a small policy engine — even if it is only an allowlist filter — that prevents a successfully injected command from turning directly into action. Practically, this is a dedicated service that accepts tool calls, checks them against an allowlist, validates parameters and only then executes.
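The allowlist-then-validate-then-execute sequence described above can be sketched roughly as follows. The sketch runs in one process for brevity, whereas the text describes a dedicated service; the names PolicyEngine, ToolRule and lookup_order are hypothetical and only illustrate the order of the checks.

```python
from dataclasses import dataclass
from typing import Any, Callable


class PolicyViolation(Exception):
    """Raised when a requested tool call is not permitted."""


@dataclass(frozen=True)
class ToolRule:
    handler: Callable[..., Any]                  # the code that actually runs
    validate: Callable[[dict[str, Any]], bool]   # parameter check for this tool


class PolicyEngine:
    def __init__(self, rules: dict[str, ToolRule]) -> None:
        self._rules = rules

    def execute(self, tool: str, params: dict[str, Any]) -> Any:
        rule = self._rules.get(tool)
        if rule is None:
            # Step 1: anything not on the allowlist is rejected outright.
            raise PolicyViolation(f"tool not on allowlist: {tool}")
        if not rule.validate(params):
            # Step 2: parameters are validated before anything runs.
            raise PolicyViolation(f"parameters rejected for tool: {tool}")
        # Step 3: only now is the tool executed.
        return rule.handler(**params)


def lookup_order(order_id: int) -> str:
    # Hypothetical read-only tool.
    return f"order {order_id}: shipped"


def valid_order_params(params: dict[str, Any]) -> bool:
    # Exactly one parameter, a positive integer (bool is deliberately excluded).
    return (
        set(params) == {"order_id"}
        and type(params["order_id"]) is int
        and params["order_id"] > 0
    )


engine = PolicyEngine({"lookup_order": ToolRule(lookup_order, valid_order_params)})
```

A model output that asks for an unlisted tool, or for a listed tool with unexpected parameters, ends in a PolicyViolation instead of an action.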

2. Minimal permissions per agent, short-lived tokens

Service accounts with read-only access to narrowly defined data sources instead of an "omniscient" API key. If the agent only needs six tools, it must not have seven. Tokens are short-lived, ideally OIDC-based, no long-lived PATs. This is exactly the discipline I describe in my MCP server post for tool loaders.
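As a rough illustration of what "short-lived and narrowly scoped" means at the point of use, the following sketch checks expiry and scope before an action is allowed. In a real setup, issuance and verification would be handled by the identity provider, for example via OIDC; the AgentToken class, the scope names and the 15-minute lifetime are assumptions made for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentToken:
    subject: str
    scopes: frozenset[str]
    issued_at: float    # Unix seconds
    ttl_seconds: int    # assumed lifetime, e.g. 900 s = 15 minutes

    def is_valid(self, now: float) -> bool:
        # Valid from issuance until (not including) the expiry instant.
        return self.issued_at <= now < self.issued_at + self.ttl_seconds


def authorize(token: AgentToken, required_scope: str, now: float) -> bool:
    # Both conditions must hold: the token is still alive and carries the scope.
    return token.is_valid(now) and required_scope in token.scopes


token = AgentToken(
    subject="invoice-agent",
    scopes=frozenset({"orders:read"}),
    issued_at=1_000.0,
    ttl_seconds=900,
)
```

With this token, authorize(token, "orders:read", now=1_100.0) succeeds, while a request for "orders:write" or any request at or after second 1900 is refused. A stolen token of this kind is useful for minutes and for one scope, not for months and for everything.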

3. Verification of input from passive sources

CSV, PDF, email attachment, web page, configuration file: everything is structurally checked before the model processes it. That means pre-processing with schema checks, sandboxing and, when in doubt, a second model whose only job is to detect suspicious instructions. For CSV input, concretely: check column types, filter special characters and enforce length limits before the content reaches the model.
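A minimal sketch of such a CSV pre-processor, using only the standard library, might look like this. The column names, the patterns and the 64-character limit are assumptions for illustration. A character allowlist narrows the attack surface but does not detect instructions written in plain words, which is why the second-model check mentioned above remains a separate step.

```python
import csv
import io
import re

MAX_CELL_LENGTH = 64  # assumed limit for this example

# Assumed schema: column name -> pattern that every cell must match completely.
SCHEMA = {
    "order_id": re.compile(r"[0-9]{1,10}"),
    "customer": re.compile(r"[A-Za-z0-9 .,'-]+"),
    "amount": re.compile(r"[0-9]{1,9}(\.[0-9]{1,2})?"),
}


class RejectedInput(ValueError):
    """Raised when a CSV file fails structural validation."""


def sanitize_csv(text: str) -> list[dict[str, str]]:
    reader = csv.DictReader(io.StringIO(text))
    # The header must match the expected columns exactly, in order.
    if reader.fieldnames != list(SCHEMA):
        raise RejectedInput("unexpected header")
    rows = []
    for row in reader:
        # DictReader stores surplus cells under the key None.
        if None in row:
            raise RejectedInput(f"line {reader.line_num}: too many cells")
        for column, pattern in SCHEMA.items():
            cell = row[column]  # None if the row has too few cells
            if (
                cell is None
                or len(cell) > MAX_CELL_LENGTH
                or not pattern.fullmatch(cell)
            ):
                raise RejectedInput(
                    f"line {reader.line_num}: column {column!r} failed validation"
                )
        rows.append(dict(row))
    return rows
```

The function rejects the whole file on the first violation rather than silently dropping rows, so a rejected upload is visible to the operator instead of reaching the model in part.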

4. Audit log with provenance tag per context

Every context that reaches the model carries a provenance tag. The audit log makes it traceable which source an instruction came from and which tool was called as a result. This is the precondition for reconstructing an incident at all. Without provenance tags, an agent audit log is a list of responses without causality.
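One possible shape for such a log entry is sketched below. The field names, the ContextItem class and the truncated SHA-256 digest are assumptions, not a fixed format. The record states which sources were in the model's context when a tool call was requested; that is the raw material for reconstructing causality, not proof of it. A production record would also carry a timestamp and a request ID, omitted here to keep the output deterministic.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ContextItem:
    source: str     # provenance tag, e.g. "user_prompt" or "csv:orders.csv"
    trusted: bool   # False for anything that is not the direct user prompt
    content: str

    @property
    def digest(self) -> str:
        # Short content hash so the exact input can be matched later.
        return hashlib.sha256(self.content.encode("utf-8")).hexdigest()[:16]


def audit_record(context: list[ContextItem], tool: str, params: dict[str, Any]) -> str:
    record = {
        "tool": tool,
        "params": params,
        "context": [
            {"source": c.source, "trusted": c.trusted, "sha256_prefix": c.digest}
            for c in context
        ],
        # Flag every tool call that happened with untrusted content in context.
        "untrusted_in_context": any(not c.trusted for c in context),
    }
    return json.dumps(record, sort_keys=True)
```

Filtering the log for entries with untrusted_in_context set to true gives the list of tool calls that a passive source could have influenced.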

Detection and verification — how to spot a lethal trifecta

An agent is low-risk as long as at most two of the three trifecta conditions are met. I ask five core questions in every pre-audit to find the third condition before a CVE exposes it:

Which tools is each production agent allowed to call, and is that list documented or merely assumed? If the list does not appear in any code review or on any whiteboard, it is effectively unbounded.
Which passive sources does the agent read? Mailbox, Zendesk ticket, Confluence page, SharePoint folder, S3 bucket — everything that is not the direct user prompt.
Which tokens does the agent hold at runtime — long-lived PAT, OIDC service account, AWS role, database credential? For each token: how old, how many scopes, who else can read it?
Is there an audit log that makes visible which source produced an instruction and which tool was called in response? If not, no incident can be reconstructed after the fact.
When was the agent last checked against newer CVEs in its frameworks? npm audit, pip-audit and a manual comparison of pinned versions with OSV.dev are not optional.
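The answers to the first three questions can be turned into a simple inventory check. The sketch below is one way to do that; AgentProfile and its fields are hypothetical, and the privileged flag stands for whatever a pre-audit counts as privileged execution, such as a long-lived PAT, write access or host-level command execution.

```python
from dataclasses import dataclass, field


@dataclass
class AgentProfile:
    name: str
    tools: list[str] = field(default_factory=list)
    passive_sources: list[str] = field(default_factory=list)
    privileged: bool = False  # assumption: set by the auditor, not derived


def trifecta_conditions(agent: AgentProfile) -> dict[str, bool]:
    # One entry per trifecta condition as defined in this post.
    return {
        "tool_access": bool(agent.tools),
        "external_content": bool(agent.passive_sources),
        "privileged_execution": agent.privileged,
    }


def is_lethal_trifecta(agent: AgentProfile) -> bool:
    # Only the combination of all three conditions is flagged.
    return all(trifecta_conditions(agent).values())


support_bot = AgentProfile(
    name="support-bot",
    tools=["create_ticket", "run_script"],
    passive_sources=["mailbox", "zendesk"],
    privileged=True,
)
report_bot = AgentProfile(
    name="report-bot",
    tools=["render_pdf"],
    passive_sources=[],
    privileged=True,
)
```

Here support-bot meets all three conditions and is flagged, while report-bot reads no passive sources and is not. The value of the exercise lies less in the function than in being forced to fill in the three fields for every agent.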

A quick-check sequence I run before every architecture review:

# List dependency manifests below the current directory that mention an agent framework.
grep -rE "(litellm|flowise|ms-agent|crewai|langgraph|autogen)" \
  --include=requirements.txt --include=package.json \
  --include=pyproject.toml --include=Pipfile .

Every framework that surfaces gets a token inventory, a tool inventory and a source list. That is the basis for the operator recommendation that follows.

Operator recommendation

What should be in place operationally for which agent stack, depending on its maturity today.

If your agent uses long-lived PATs today and consumes passive sources unchecked — then short-lived tokens via OIDC and a CSV/PDF pre-processor are the next two sprints. Before that, no further tools should be wired to the agent.
If you use LiteLLM as a central gateway, then updating to 1.83.10-stable this week and an internal re-audit of API-key verification are mandatory. LiteLLM is a single point of failure for all provider keys.
If you run Flowise or another no-code agent builder — then the discipline of running every CSV or document input through a sanitiser is more important than patching the individual CVE. The next indirect injection will arrive.
If your agent stack runs in the MS-Agent space or similarly close to the OS — then container isolation with minimal capabilities and a read-only root filesystem is the default, not an option. Sandboxing is the last line of defence.
If your audit log does not carry provenance tags today — then rebuilding that capability before onboarding the next tool is the biggest lever. Without provenance, no incident reconstruction; without reconstruction, no improvement.

Cross-references: the MCP server post for tool-loader discipline, the Semantic Kernel post for framework comparison, and the AI security audits post for binding governance to release discipline.

Conclusion

Most agent stacks I see in review have individual pieces. Rarely all of them. Either there is a policy engine but long-lived tokens. Or token management is sound but CSVs flow unchecked into the model. Or both are clean but the audit log ends at the model layer and says nothing about which external source influenced a given response.

The question is not whether LiteLLM in your stack is updated to 1.83.10-stable. The question is whether a comparable incident in any other component of your agent stack would be noticed — and how many tools, data sources and tokens would be affected in a single compromised hour.

A longer write-up with architecture sketches of my policy engine, an example of provenance tags in the audit log and the reasoning why I consistently work without long-lived PATs in production agents is available (in German) at ole-hartwig.eu.

Frequently asked questions

How does this map onto the EU AI Act?

Directly. From August 2026, high-risk AI systems require documented risk management, technical documentation and human oversight. A policy engine in front of tool access is exactly the operational answer to the human-oversight obligation — and the audit log with provenance tags delivers the technical documentation. Anyone who introduces this before August has covered two obligations in one step.

How much effort is it to introduce these four principles?

For an agent still being built, very manageable: the principles become part of the initial architecture. For production agents the order matters: first short-lived tokens and permission reduction (one to two weeks), then the policy engine in front of tool access (one to two weeks), then input verification and the audit log with provenance tags (two weeks). For a clean overall picture I plan for four to six weeks.

What does "policy engine between model and tool" mean in practice?

A narrow component that checks every tool call requested by the model against an allowlist filter and context-sensitive rules before executing it. Example: the model may call the database tool, but only for SELECT on a tightly defined view. Write access or other tables are rejected at engine level. A successful prompt injection therefore loses its direct path to action.
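The SELECT-on-one-view example can be sketched with a structured request instead of free-form SQL, so that the engine never has to parse a query string produced by the model. The view name v_open_orders, the column names and the DbRequest class are hypothetical; in practice the same restriction would also be enforced through database grants on the agent's service account.

```python
from dataclasses import dataclass

ALLOWED_OPERATIONS = {"select"}
ALLOWED_RELATIONS = {"v_open_orders"}                 # hypothetical view
ALLOWED_COLUMNS = {"order_id", "status", "due_date"}  # hypothetical columns


@dataclass(frozen=True)
class DbRequest:
    operation: str
    relation: str
    columns: tuple[str, ...]


def check_db_request(req: DbRequest) -> tuple[bool, str]:
    # Each check compares against a fixed allowlist; nothing is pattern-matched.
    if req.operation.lower() not in ALLOWED_OPERATIONS:
        return False, f"operation not allowed: {req.operation}"
    if req.relation not in ALLOWED_RELATIONS:
        return False, f"relation not allowed: {req.relation}"
    if not req.columns or not set(req.columns) <= ALLOWED_COLUMNS:
        return False, "column list empty or not allowed"
    return True, "ok"


def build_query(req: DbRequest) -> str:
    allowed, reason = check_db_request(req)
    if not allowed:
        raise PermissionError(reason)
    # Safe to interpolate: every identifier was matched against an allowlist.
    return f"SELECT {', '.join(req.columns)} FROM {req.relation}"
```

A request such as DbRequest("select", "v_open_orders", ("order_id", "status")) yields a query, while an UPDATE or a read from any other relation raises PermissionError before the database is touched.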

How can a passive document lead to code execution?

By being loaded as context by the agent and interpreted by the model as instruction. A CSV with innocuous-looking content can carry the instruction to call a tool that touches the file system. If that tool call is not checked by a policy engine, the agent executes it. That exact pattern sits behind CVE-2026-41264.

We use neither LiteLLM nor Flowise nor MS-Agent. Does this concern us?

The specific CVEs do not. The structural problem does: every agent framework that calls tools and consumes external content is exposed to the same attack patterns. The question is not which framework you use, but whether your architecture brings a policy engine, segmented identities and an audit log with provenance tags.