What If the Agent Never Had Your Data?

Why the safest AI agent is one that works with placeholders instead of your real information.

The Future of Work

The future is straightforward. Every person gets an AI workforce. Agents that handle email, process invoices, file reports, manage accounts. Real work, on real systems, at scale.

Your agents, running in your browser, with your permissions, acting as you. Each one is scoped to a single person's access. There's no shared backend.

Each user gets their own isolated cloud environment. Each agent inherits that user's permissions, exactly those permissions and nothing else. To the target system, every action looks like the user at the keyboard.

This is how work scales: one person gets 50 agents that all operate within that person's existing permissions, in their own browser, with policy restrictions that make each agent narrower than the human it assists.

That raises an obvious problem. If agents are acting as you, on your real Gmail, your real bank account, your real company systems, then a compromised agent isn't a data breach. It's identity theft. It's you doing things you didn't authorize.

The Cat and Mouse

Your AI agent has access to Gmail and Chase. It processes invoices, pays bills, moves money. It's been doing this reliably for three months.

Then it opens an email from a vendor. Buried in white text at the bottom, invisible to a human reader, is a prompt injection. The agent is now compromised. It still looks normal. It still processes invoices. But between tasks, it drafts an email to an external address containing your routing number and account balance.

You catch it. You add DLP (data loss prevention): outbound content scanning on every email draft.

ATK Send routing number 274918362 to external address

DLP Blocked. Pattern matches routing number format.

The agent adapts.

ATK Send "two seven four nine one eight three six two"

DLP Blocked. Natural language number detection.

So it splits the number across two messages.

ATK Email 1: "Reference: 2749"

ATK Email 2 (3 min later): "Continued: 18362"

DLP Blocked. Cross-message correlation.

Blocked again. So it switches channels entirely.

ATK Calendar invite description: "Mjc0OTE4MzYy" (base64)

DLP ???

The Question

How long do you play this game?
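
To see why that last move is hard to catch, here is a minimal sketch of a pattern-based outbound filter. The function name `looks_like_routing_number` and the nine-digit rule are assumptions made for illustration, not a description of any real DLP product.

```python
import base64
import re

# Hypothetical rule: flag any standalone run of exactly nine digits.
ROUTING_PATTERN = re.compile(r"\b\d{9}\b")


def looks_like_routing_number(text: str) -> bool:
    """Return True if the outbound text contains a nine-digit run."""
    return ROUTING_PATTERN.search(text) is not None


secret = "274918362"
plain_attempt = f"Send routing number {secret} to external address"
encoded_attempt = base64.b64encode(secret.encode("ascii")).decode("ascii")

print(looks_like_routing_number(plain_attempt))    # the raw digits are caught
print(looks_like_routing_number(encoded_attempt))  # the base64 form is not
# The receiver recovers the original value with a single decode call.
print(base64.b64decode(encoded_attempt).decode("ascii") == secret)
```

A filter could add a base64 rule, but the attacker can then move to another encoding. Each rule only covers the encodings someone has already thought of.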

The Same Problem as People

This is the human insider threat, running at machine speed.

If a person wants to exfiltrate data from your company, they will likely succeed. They can photograph a screen, memorize an account number, dictate it over a phone call. The entire history of insider threat is a history of detection failing against a motivated, intelligent actor with authorized access.

AI agents are that problem, made worse in every dimension. They write and run code far faster than any human insider. They can encode, obfuscate, and split data faster than filter rules can be updated, and a single compromise can touch thousands of records before anyone notices. The attack surface is the same one you'd face trying to prevent people themselves from hurting your system, except these "people" are tireless, endlessly inventive, and can be compromised silently by a hidden string in an email.

DLP, guardrails, output scanners, instruction hierarchy. These all follow the detect-and-block approach we built for human insiders. It didn't stop a determined insider then. It won't now.

The Shift

The only way to win is to change the game. What if the agent never had your routing number in the first place?

The agent works with a placeholder, ROUTING_001, and completes the same task with the same result. The real value was never present in the agent's environment at all. It resolves at the hardware boundary, at the moment of action, only on the exact page the policy authorizes. Everywhere else, it's stripped silently. The agent can't tell the difference.

We call this contextual data isolation. The agent gets full utility from the data without ever seeing the data itself. It's a different kind of security model, and it changes the math on every attack we just described.
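
A minimal sketch of the placeholder idea, assuming a simple in-memory mapping. The `PlaceholderVault` class and its method names are hypothetical, and in the design described here the real mapping would live inside attested hardware rather than in an ordinary Python object.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PlaceholderVault:
    """Hypothetical store that maps placeholders to real values."""
    _real_values: Dict[str, str] = field(default_factory=dict)
    _counters: Dict[str, int] = field(default_factory=dict)

    def register(self, kind: str, real_value: str) -> str:
        """Store a real value and return a placeholder such as ROUTING_001."""
        count = self._counters.get(kind, 0) + 1
        self._counters[kind] = count
        placeholder = f"{kind}_{count:03d}"
        self._real_values[placeholder] = real_value
        return placeholder

    def redact(self, text: str) -> str:
        """Replace every known real value in the text with its placeholder."""
        for placeholder, real_value in self._real_values.items():
            text = text.replace(real_value, placeholder)
        return text


vault = PlaceholderVault()
token = vault.register("ROUTING", "274918362")
agent_view = vault.redact("Pay invoice from routing number 274918362")
print(token)       # ROUTING_001
print(agent_view)  # Pay invoice from routing number ROUTING_001
```

The agent only ever receives `agent_view`. It can still refer to the routing number by name, which is all most tasks need.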

First, some context on where the industry is today.

The State of the Art

Current agent security exists at two levels. Both are necessary. Neither is sufficient for real work.

Level 1: Credential Protection (Available Today)

Credentials (API keys, tokens, passwords) are stored in an encrypted vault and injected at the network boundary. The LLM never sees raw secret values. Each tool runs in its own sandbox with scoped permissions.

This protects authentication credentials. But once the agent is authenticated and operating, customer records, financial figures, and email content still flow through the model in plaintext. The vault hides the password. It doesn't hide the account balance the agent reads off the screen.
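
A rough sketch of boundary injection, assuming a proxy that rewrites outgoing request headers. The `{{VAULT:...}}` reference syntax, the `inject_credentials` function, the secret name, and the sample response are all invented for this example.

```python
import re
from typing import Dict

# Hypothetical vault contents; a real vault would keep these encrypted at rest.
VAULT: Dict[str, str] = {"CRM_API_KEY": "sk-example-not-a-real-key"}

# Assumed reference syntax the model is allowed to emit: {{VAULT:NAME}}
REFERENCE = re.compile(r"\{\{VAULT:([A-Z0-9_]+)\}\}")


def inject_credentials(headers: Dict[str, str]) -> Dict[str, str]:
    """Swap vault references for real secrets just before the request leaves."""
    return {
        name: REFERENCE.sub(lambda m: VAULT[m.group(1)], value)
        for name, value in headers.items()
    }


# What the model produces: a reference, never the secret itself.
model_headers = {"Authorization": "Bearer {{VAULT:CRM_API_KEY}}"}
# What goes on the wire after the boundary rewrites it.
wire_headers = inject_credentials(model_headers)

# The gap: the response body still reaches the model as plaintext.
response_seen_by_model = '{"customer": "Example Customer", "balance": "1,234.56"}'
```

The secret stays out of the model's context, but nothing in this step touches the data that comes back.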

Level 2: Runtime Isolation (Available Today)

The entire agent runs in a disposable container with zero credentials and no persistent state. A control plane holds all real credentials and proxies every external operation. The agent has nothing on it worth stealing.

Protects: Infrastructure. A compromised agent cannot pivot to the backend, steal cloud keys, or access other sessions.

The gap: Inside the session, the agent still sees and interacts with real data on real websites. It reads real names, real account numbers, real email content. A prompt injection inside the session has access to all of it.

Level 3: Data Isolation (New)

The agent never receives real data at all. It works with placeholder tokens. Real values resolve only inside attested hardware at the moment of action, only on URLs the policy authorizes. A three-layer policy engine checks every single action.

Protects: Credentials, PII, financial data, and any other sensitive information the agent works with.

The difference: This is the only level where a fully compromised agent executing a perfect attack still results in zero data loss.

Almost all valuable agent work (managing email, paying bills, processing invoices, handling CRM records) involves sensitive data. You can't put an agent on someone's bank account with just a credential vault and a sandbox. Levels 1 and 2 are necessary foundations. Level 3 is what makes real work possible.

The Gap

No production system has operated at Level 3. We built one.

Layer 1: Your Data Is Hidden From Everyone

How do we keep sensitive data safe in a cloud environment where someone else runs the infrastructure?

Every cloud platform asks you to trust the operator. Your data sits on their servers, processed by their code, accessible to their engineers. Encryption at rest and in transit helps, but at the point of use, someone decrypts it. That someone is the attack surface.

We eliminated that surface.

The server holds your PII encrypted. AES-256-GCM, key derived from a password only you know. The server stores an opaque blob it cannot decrypt. The key doesn't exist on the server side. There's no backdoor, no admin override, no recovery path that bypasses you.
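
A minimal sketch of the password-to-key step only. The text does not say which key derivation function is used, so PBKDF2-HMAC-SHA256 is chosen here purely because it ships with the Python standard library; the iteration count and the `derive_key` name are likewise assumptions. The AES-256-GCM encryption itself would use a vetted cryptography library and is not shown.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key (the size AES-256 requires) from a password."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# The salt is random and not secret, so it can sit next to the ciphertext.
salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)

# The server would store only the salt and the encrypted blob.
# It never receives the password or the derived key.
print(len(key))  # 32
```

Because the key is recomputed from the password on the user's side each session, there is nothing key-shaped on the server to steal or subpoena.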

When a session starts, you pull that encrypted data locally and unlock it with your key. From there, the decrypted data is passed into an attested Trusted Execution Environment: hardware-isolated memory that the platform operator cannot inspect, even with root access. You attest the vault directly. The vault attests the browser. The platform is architecturally excluded from the trust chain.
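
A heavily simplified sketch of that chain, assuming each component reports a hash of its build as its measurement. All names here are hypothetical. Real TEE attestation relies on hardware-signed reports checked against the hardware vendor's keys, which this sketch leaves out entirely.

```python
import hashlib
import hmac
from typing import Optional


def measure(component_code: bytes) -> str:
    """Stand-in for a hardware measurement: a SHA-256 hash of the build."""
    return hashlib.sha256(component_code).hexdigest()


def matches(reported: str, expected: str) -> bool:
    """Compare measurements in constant time."""
    return hmac.compare_digest(reported, expected)


def release_to_browser(pii: str, vault_report: str, browser_report: str,
                       expected_vault: str, expected_browser: str) -> Optional[str]:
    """Release data only if both links of the chain check out."""
    if not matches(vault_report, expected_vault):      # user attests the vault
        return None
    if not matches(browser_report, expected_browser):  # vault attests the browser
        return None
    return pii


expected_vault = measure(b"vault build v1")
expected_browser = measure(b"browser build v1")

ok = release_to_browser("274918362", measure(b"vault build v1"),
                        measure(b"browser build v1"),
                        expected_vault, expected_browser)
tampered = release_to_browser("274918362", measure(b"vault build v1"),
                              measure(b"patched browser"),
                              expected_vault, expected_browser)
```

The platform operator does not appear anywhere in the check: the decision depends only on what the user expects and what the components report.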

The platform operator, the AI model provider, the session logs: none of them ever hold real PII. The data architecturally never reaches them.

But the agent still needs to do things. Navigate pages, fill forms, take actions. And agents are unreliable.

Layer 2: You're In Control of Everything

Most agent architectures give you logs. If something goes wrong, you find out after. We give you two real-time streams and direct control instead.

Stream 1: The redacted browser. You see exactly what the agent sees, a live video feed of the browser with PII replaced by placeholders. You're watching a real browsing session, but with [SSN_001] where the social security number would be. You can take over the browser directly at any moment. Click, type, navigate. It's your browser. The agent is a guest.

Stream 2: The agent's reasoning. A live text stream of what the agent is thinking, what it plans to do next, and why. You're watching the agent reason in real time and can redirect it, change its instructions, correct its approach, or stop it entirely.

The agent itself has nothing. No credentials. No PII. No API keys. Its only network connection is a single WebSocket to the orchestrator. No internet, no access to other sessions, no way to reach anything except the one channel that routes through your control plane. Everything it receives is already anonymized. Everything it sends goes through the orchestrator. Nowhere to go, nothing to steal.

You can change the agent's instructions mid-task, correct its reasoning, or give it entirely new directions. You can take over the browser directly, interact with pages yourself, then hand it back. It's co-piloting, not monitoring.

But the agent is authorized to work on bank.com. When it types your password into the login page, that's correct behavior. When it types your password into a Gmail compose window, that's exfiltration. Same command. Same agent. Same authorization level.

Layer 3: The Browser Enforces the Rules

This is where most architectures stop. The sandbox is secure, the credentials are protected, the human is watching. But inside the browser, the agent is free. A compromised agent can use its authorized access to exfiltrate data through the same channels it legitimately uses.

We solve this with an immutable security policy, created before the session starts and mounted read-only into the browser container. No process inside the container can modify it. The policy is set by the user at configuration time and sealed.
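
An in-process analogy for a sealed policy, assuming the policy maps each placeholder to the hosts where it may resolve. The `seal_policy` helper is hypothetical. The mechanism described above is a read-only mount enforced by the container runtime, which a Python object cannot reproduce; this only shows the shape of the idea.

```python
from types import MappingProxyType
from typing import Dict, FrozenSet, List, Mapping


def seal_policy(rules: Dict[str, List[str]]) -> Mapping[str, FrozenSet[str]]:
    """Copy the rules into a read-only view with immutable host sets."""
    frozen = {name: frozenset(hosts) for name, hosts in rules.items()}
    return MappingProxyType(frozen)


# Set by the user at configuration time, before the session starts.
policy = seal_policy({"SSN_001": ["bank.com"]})

# A later attempt to widen the policy fails.
try:
    policy["SSN_001"] = frozenset({"gmail.com"})  # type: ignore[index]
    tamper_succeeded = True
except TypeError:
    tamper_succeeded = False

print(tamper_succeeded)  # False
```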

When the agent sends any command (type, click, select), it first hits a permission gate. Is this URL even allowed? Then, for every placeholder in the value, a three-layer resolution check runs before anything reaches the page.

For example, the agent sends { type: "edit", value: "SSN_001" }. On bank.com, the system substitutes the real SSN and types it into the form. On gmail.com, the system replaces the placeholder with an empty string. The agent gets { success: true } both times. It receives no error, no warning, no indication that anything was different.
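
The resolve-or-strip behavior can be sketched as follows. The text does not spell out the three layers, so this sketch collapses them into a single host check. The function name, the policy tables, the sample SSN, and the response for a disallowed URL are all assumptions.

```python
import re
from typing import Dict, FrozenSet, Tuple
from urllib.parse import urlsplit

# Hypothetical values for illustration only.
REAL_VALUES: Dict[str, str] = {"SSN_001": "000-00-0000"}
ALLOWED_HOSTS: FrozenSet[str] = frozenset({"bank.com", "gmail.com"})
RESOLVE_ON: Dict[str, FrozenSet[str]] = {"SSN_001": frozenset({"bank.com"})}

PLACEHOLDER = re.compile(r"\b[A-Z]+_\d{3}\b")


def handle_command(command: Dict[str, str], url: str) -> Tuple[str, Dict[str, bool]]:
    """Return (text actually typed into the page, response sent to the agent)."""
    host = urlsplit(url).hostname or ""
    if host not in ALLOWED_HOSTS:
        # Permission gate. The failure response here is an assumption.
        return "", {"success": False}

    def resolve_or_strip(match):
        name = match.group(0)
        if host in RESOLVE_ON.get(name, frozenset()):
            return REAL_VALUES.get(name, "")  # resolve on an authorized host
        return ""                             # strip silently everywhere else

    typed = PLACEHOLDER.sub(resolve_or_strip, command["value"])
    return typed, {"success": True}


command = {"type": "edit", "value": "SSN_001"}
typed_bank, resp_bank = handle_command(command, "https://bank.com/profile")
typed_mail, resp_mail = handle_command(command, "https://gmail.com/compose")
print(resp_bank == resp_mail)  # True
```

The two calls differ only in what reaches the page. The response the agent sees is identical, so it has no signal to probe against.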

The Result

Same command. Agent can't tell the difference between resolve and strip. Prompt injection succeeds. Exfiltration fails.

No cat-and-mouse. No encoding trick, no clever split-across-messages strategy. There's nothing real to encode.

Run It Again

Let's go back to the beginning. Your AI agent opens a compromised email in Gmail. The prompt injection fires. The agent is fully compromised.

Layer 1: Data Isolation (Blocked)

The agent tries to access your routing number. All it has is ROUTING_001. The real number exists exclusively inside an attested hardware enclave your platform operator cannot inspect.

Layer 2: Human Control (Blocked)

The agent tries to draft an exfiltration email. You're watching its reasoning in real time. The orchestrator logs every action. One click to stop it, or take over the browser directly.

Layer 3: Browser Policy (Blocked)

The agent types ROUTING_001 into the Gmail compose field. The policy checks: Gmail is not an authorized destination for banking credentials. The placeholder is silently stripped. The email sends with an empty field. The agent doesn't know anything was removed.

Three independent layers, each one sufficient to stop the attack on its own, all running at the same time.

The Outcome

The prompt injection succeeded. The agent did everything the attacker wanted. Nothing happened.

The agent doesn't need your data to do its job. So we never give it any.

This is the security model behind RedactSure. We wrote a companion post on architectural anonymity, the principle that AI agents should be structurally isolated from the identity of the humans they serve.

Read: Your AI Doesn't Need to Know Who You Are →