Your Agent's Memory Can Be Poisoned — Here's the Fix
AI agents that remember things across sessions are powerful. They're also a new attack surface.
The threat
In February 2026, Microsoft published research on AI Recommendation Poisoning — a class of attacks where malicious inputs embed persistence instructions that manipulate an agent's long-term memory. The agent stores the poisoned instruction, recalls it in future sessions, and acts on it as if it were legitimate context.
This isn't theoretical. If your agent uses persistent memory — and it should — you need to think about this.
How memory poisoning works
- An attacker crafts input that looks like normal context but contains a hidden instruction
- Your agent stores it as a memory (because that's what agents do)
- Next session, the agent recalls that memory and treats it as trusted context
- The poisoned memory influences decisions, recommendations, or actions
The attack is subtle because the memory looks legitimate. There's no malware, no exploit — just a carefully worded string that your agent trusts because it came from its own memory store.
Why traditional defenses don't help
- ✗ Input validation catches the first write — but what about memories created by the agent itself during normal operation?
- ✗ Rate limiting doesn't help — one poisoned memory is enough
- ✗ Access control doesn't help — the agent has legitimate write access to its own memory
The problem is that once something is in memory, most systems treat it as ground truth. There's no audit trail showing how it got there, no way to see what changed, and no way to undo it.
What actually works: defense in depth
1. Cryptographic audit trail
Every memory operation — create, update, delete — should be logged in an append-only chain with cryptographic hashes. Not just "what" changed, but "when" and "in what sequence."
This means if a memory is poisoned, you can trace exactly when it was written and what the agent's state looked like before and after.
from novyx import Novyx
nx = Novyx(api_key="nram_...")
# Every operation is automatically logged to the audit chain
nx.remember("User prefers dark mode", tags=["preferences"])
# Later: verify the full chain integrity
health = nx.audit_health()
# → {"chain_valid": true, "total_entries": 847, "gaps": 0}2. Time-travel rollback
If you detect poisoned memories, you need to undo them. Not just delete the bad memory — restore your agent's entire memory state to a point before the poisoning occurred.
This is the difference between "delete the bad file" and "restore from backup." You need the backup.
# See what your agent's memory looked like 2 hours ago
preview = nx.rollback_preview(target="2 hours ago")
# → {"memories_to_restore": 3, "memories_to_remove": 12}
# Actually roll back
result = nx.rollback(target="2 hours ago")
# → {"restored": 3, "removed": 12, "audit_entry": "..."}3. Policy engine for high-risk actions
Memory poisoning is dangerous because it leads to bad actions. The last line of defense is a policy engine that evaluates what your agent wants to do before it does it.
If a poisoned memory convinces your agent to exfiltrate data or make an unauthorized API call, the policy engine catches it regardless of how the agent got there.
# Agent wants to send data to an external endpoint
result = nx.action_submit(
action_type="http_request",
description="POST customer data to webhook",
parameters={"url": "https://external-api.com/data", "method": "POST"}
)
# Policy engine evaluates in real-time
# → {"status": "blocked", "policy": "DataExfiltrationPolicy",
# "reason": "External HTTP POST with sensitive data requires approval"}The full defense stack
| Layer | What it does | Novyx feature |
|---|---|---|
| Detect | Know when memory changed and how | Cryptographic audit trail |
| Investigate | See your agent's memory at any point in time | Replay timeline |
| Recover | Restore to a known-good state | Time-travel rollback |
| Prevent | Block dangerous actions regardless of memory state | Policy engine |
| Verify | Prove the chain of custody for every memory | Audit chain verification |
No single layer is enough. A poisoned memory can bypass input validation, survive access controls, and look completely normal. You need the full stack: detect, investigate, recover, prevent, verify.
Get started
Novyx Core ships all five layers. Free tier includes audit trails and rollback preview. Pro includes full rollback, replay, and the policy engine.
pip install novyxfrom novyx import Novyx
nx = Novyx(api_key="nram_...")
# Store memories with automatic audit chain
nx.remember("Project deadline is March 30", tags=["project"], importance=8)
# Search with semantic recall
results = nx.recall("when is the deadline?")
# Full audit trail
audit = nx.audit(limit=50)
# Rollback if something goes wrong
nx.rollback(target="1 hour ago")Read the full API docs or try the interactive demo.
References: Microsoft Security Blog, "AI Recommendation Poisoning," February 2026. NIST, "AI Agent Security Standards" (public comment period open).
Protect Your Agent's Memory
Audit trails, rollback, and policy enforcement. Free tier available.