What is AI agent memory poisoning?

Memory poisoning occurs when malicious or corrupted data is injected into an agent's persistent memory, causing it to make wrong decisions in future sessions. It's the agent equivalent of SQL injection — but for context.

How do I defend against agent memory poisoning?

Novyx provides four layers of defense: cryptographic chain hashing detects tampering, anomaly detection flags suspicious patterns, circuit breaker stops runaway loops, and instant rollback restores to a verified clean state.

How do I detect if my AI agent's memory has been tampered with?

Novyx creates a SHA-256 hash chain for every memory operation. The audit/verify endpoint checks chain integrity and flags any tampering. If corruption is detected, rollback restores the agent to a verified clean state in one API call.

Can AI agents be attacked through their memory?

Yes. Any agent with persistent memory is vulnerable to poisoning attacks — injecting false facts, manipulating importance scores, or corrupting context. Novyx is the only agent memory platform with built-in cryptographic tamper detection and automatic rollback.

← All guides

Your Agent's Memory Can Be Poisoned — Here's the Fix

AI agents that remember things across sessions are powerful. They're also a new attack surface.

The threat

In February 2026, Microsoft published research on AI Recommendation Poisoning — a class of attacks where malicious inputs embed persistence instructions that manipulate an agent's long-term memory. The agent stores the poisoned instruction, recalls it in future sessions, and acts on it as if it were legitimate context.

This isn't theoretical. If your agent uses persistent memory — and it should — you need to think about this.

How memory poisoning works

An attacker crafts input that looks like normal context but contains a hidden instruction
Your agent stores it as a memory (because that's what agents do)
Next session, the agent recalls that memory and treats it as trusted context
The poisoned memory influences decisions, recommendations, or actions

The attack is subtle because the memory looks legitimate. There's no malware, no exploit — just a carefully worded string that your agent trusts because it came from its own memory store.

Why traditional defenses don't help

✗ Input validation catches the first write — but what about memories created by the agent itself during normal operation?
✗ Rate limiting doesn't help — one poisoned memory is enough
✗ Access control doesn't help — the agent has legitimate write access to its own memory

The problem is that once something is in memory, most systems treat it as ground truth. There's no audit trail showing how it got there, no way to see what changed, and no way to undo it.

What actually works: defense in depth

1. Cryptographic audit trail

Every memory operation — create, update, delete — should be logged in an append-only chain with cryptographic hashes. Not just "what" changed, but "when" and "in what sequence."

This means if a memory is poisoned, you can trace exactly when it was written and what the agent's state looked like before and after.

python

from novyx import Novyx

nx = Novyx(api_key="nram_...")

# Every operation is automatically logged to the audit chain
nx.remember("User prefers dark mode", tags=["preferences"])

# Later: verify the full chain integrity
health = nx.audit_health()
# → {"chain_valid": true, "total_entries": 847, "gaps": 0}

2. Time-travel rollback

If you detect poisoned memories, you need to undo them. Not just delete the bad memory — restore your agent's entire memory state to a point before the poisoning occurred.

This is the difference between "delete the bad file" and "restore from backup." You need the backup.

python

# See what your agent's memory looked like 2 hours ago
preview = nx.rollback_preview(target="2 hours ago")
# → {"memories_to_restore": 3, "memories_to_remove": 12}

# Actually roll back
result = nx.rollback(target="2 hours ago")
# → {"restored": 3, "removed": 12, "audit_entry": "..."}

3. Policy engine for high-risk actions

Memory poisoning is dangerous because it leads to bad actions. The last line of defense is a policy engine that evaluates what your agent wants to do before it does it.

If a poisoned memory convinces your agent to exfiltrate data or make an unauthorized API call, the policy engine catches it regardless of how the agent got there.

python

# Agent wants to send data to an external endpoint
result = nx.action_submit(
    action_type="http_request",
    description="POST customer data to webhook",
    parameters={"url": "https://external-api.com/data", "method": "POST"}
)

# Policy engine evaluates in real-time
# → {"status": "blocked", "policy": "DataExfiltrationPolicy",
#    "reason": "External HTTP POST with sensitive data requires approval"}

The full defense stack

Layer	What it does	Novyx feature
Detect	Know when memory changed and how	Cryptographic audit trail
Investigate	See your agent's memory at any point in time	Replay timeline
Recover	Restore to a known-good state	Time-travel rollback
Prevent	Block dangerous actions regardless of memory state	Policy engine
Verify	Prove the chain of custody for every memory	Audit chain verification

No single layer is enough. A poisoned memory can bypass input validation, survive access controls, and look completely normal. You need the full stack: detect, investigate, recover, prevent, verify.

Get started

Novyx Core ships all five layers. Free tier includes audit trails and rollback preview. Pro includes full rollback, replay, and the policy engine.

bash

pip install novyx

python

from novyx import Novyx

nx = Novyx(api_key="nram_...")

# Store memories with automatic audit chain
nx.remember("Project deadline is March 30", tags=["project"], importance=8)

# Search with semantic recall
results = nx.recall("when is the deadline?")

# Full audit trail
audit = nx.audit(limit=50)

# Rollback if something goes wrong
nx.rollback(target="1 hour ago")

Read the full API docs or try the interactive demo.

References: Microsoft Security Blog, "AI Recommendation Poisoning," February 2026. NIST, "AI Agent Security Standards" (public comment period open).

Protect Your Agent's Memory

Audit trails, rollback, and policy enforcement. Free tier available.

Start Free

Start Free

Your Agent's Memory Can Be Poisoned — Here's the Fix

The threat

How memory poisoning works

Why traditional defenses don't help

What actually works: defense in depth

1. Cryptographic audit trail

2. Time-travel rollback

3. Policy engine for high-risk actions

The full defense stack

Get started

Protect Your Agent's Memory

Start Free

Contact Us