Securing AI Agents: Redpanda’s Agentic Data Plane Solution

Trusting an AI agent to interpret its own access rights is like asking a fox to guard the hen house after reading it the sentry duty manual. As Tyler Akidau, Tyler Rockwood, Johannes Brüdderl, and Marc Millstone from Redpanda point out, modern business has hit a fundamental architectural dead end. We are trying to deploy "digital employees" that are thousands of times faster than humans, yet remain pathologically prone to hallucinations and susceptible to prompt injections.

The core of the problem lies in traditional systems based on Large Language Models (LLMs), where security policies are often baked directly into the context window or data stream. For an agent, these instructions are merely advisory: it can ignore them, misinterpret them, or erase them entirely under the pressure of a clever prompt. To transform agents from a security liability into a controlled tool, Redpanda proposes the concept of the Agentic Data Plane (ADP). The key idea is the use of out-of-band metadata—infrastructure channels that transmit security signals and context parallel to the main stream, making them completely invisible to the model itself.

Technological Breakthrough: Decoupling the Control Plane

Separating control logic from execution logic moves security from the shaky ground of "prompt engineering" into the realm of deterministic control. In such an architecture, an agent is physically incapable of seeing unauthorized data or expanding its privileges because it does not control the metadata layer. In practice, this turns the system into a rigid framework where AI actions are restricted at the transport protocol level, rather than by a polite request to "not do anything bad."

In the new architecture, security is enforced not by instructions given to the model, but by the very structure of data transmission to which the AI has no access.

The developers demonstrate the viability of this approach through a portfolio rebalancing system where groups of agents trade across isolated accounts:

Transaction limits are hard-coded into the transport channels. Specific client balance viewing rights cannot be modified by the agent. The agent can neither read the technical metadata nor bypass it.

This is the only way to prevent the cascading damage that AI can inflict at machine speeds before a human operator can even reach for the "kill switch." We are finally moving away from trying to "persuade" neural networks to be honest and toward creating an environment where they simply cannot be otherwise.

Source: arXiv cs.AI →

Rate this material

★ ★ ★ ★ ★

AI AgentsCybersecurityAI SafetyAI in BusinessRedpanda

Beyond Prompt Engineering: Redpanda’s Plan to Hard-Code AI Agent Security