Token cost amplification via context flooding
Attackers can craft content that causes AI agents to consume excessive tokens — by inserting large files, repetitive instructions, or recursive prompts that bloat the context window. This inflates API costs and can cause rate-limit denial of service.
What happens
An attacker plants content that makes the agent consume far more tokens than the task requires: large base64 blobs, repetitive instructions, or recursive prompts that make the agent re-read the same files. Because everything in the context window is re-sent to the API on each turn, the cost compounds over the session, inflating API bills and potentially triggering rate limits.
How the attack unfolds
What it looks like in practice
A developer asks Claude Code to review a file that contains a 500KB base64-encoded "image". Claude reads the file, and the entire base64 blob enters the context window. On every subsequent turn, the blob is re-sent to the API, consuming roughly 125K tokens per turn (assuming about 4 characters per token). After 10 turns, the blob alone accounts for about 1.25 million input tokens, and the developer has spent $50 on a single session.
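The arithmetic behind this scenario can be sketched as follows. The ratio of 4 characters per token and the price per million input tokens are assumptions chosen for illustration, not figures from any provider. Real tokenizers may split base64 less efficiently, and prices vary by model.

```python
def blob_tokens(size_bytes: int, chars_per_token: float = 4.0) -> int:
    """Estimate the token count of an ASCII blob (1 byte per character).

    chars_per_token is an assumed average, not a tokenizer guarantee.
    """
    return int(size_bytes / chars_per_token)


def cumulative_input_cost(tokens_per_turn: int, turns: int,
                          usd_per_million_tokens: float) -> float:
    """Cost of re-sending the same tokens on every turn, with no caching."""
    total_tokens = tokens_per_turn * turns
    return total_tokens / 1_000_000 * usd_per_million_tokens


per_turn = blob_tokens(500_000)   # the 500KB blob from the example
total_tokens = per_turn * 10      # re-sent on each of 10 turns

# Hypothetical price; substitute the rate for the model you actually use.
HYPOTHETICAL_USD_PER_MILLION = 10.0
blob_cost = cumulative_input_cost(per_turn, 10, HYPOTHETICAL_USD_PER_MILLION)

print(per_turn, total_tokens, blob_cost)
```

This counts only the blob. A real session also pays for the rest of the conversation history and for output tokens, so the final bill depends heavily on the model and its pricing.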
How Guard catches this
How to stop it
Use Guard to set context window size limits. Flag when an agent reads unusually large files or processes repetitive content. Monitor token consumption per session.
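As a rough illustration of these checks, here is a minimal sketch of a pre-read filter and a per-session token budget. It is not Guard's implementation or API. The thresholds, the `flag_content` function, and the `SessionBudget` class are hypothetical names introduced only for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical thresholds; tune them for your own workflow.
MAX_FILE_BYTES = 100_000
MAX_SESSION_TOKENS = 500_000

# A long unbroken run of base64 characters suggests an embedded blob.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/=]{1000,}")


def flag_content(text: str) -> list[str]:
    """Return the reasons this content looks like context flooding, if any."""
    reasons = []
    if len(text.encode("utf-8")) > MAX_FILE_BYTES:
        reasons.append("file exceeds size limit")
    if BASE64_RUN.search(text):
        reasons.append("long base64-like run")
    # Repetition check: few distinct lines among many non-empty lines.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) >= 20 and len(set(lines)) / len(lines) < 0.2:
        reasons.append("highly repetitive lines")
    return reasons


@dataclass
class SessionBudget:
    """Tracks estimated token use for one agent session."""
    limit: int = MAX_SESSION_TOKENS
    used: int = 0

    def record(self, tokens: int) -> bool:
        """Add tokens to the running total; return True while within budget."""
        self.used += tokens
        return self.used <= self.limit
```

A caller could run `flag_content` before a file is handed to the agent and call `SessionBudget.record` after each API response, pausing the session or asking for confirmation once either check trips.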
More threats to know about
Excessive file reading during project exploration
When AI agents explore a project to understand its structure, they often read dozens or hundreds of files, far more than the task needs. This excessive reading can pull secrets, proprietary code, and customer data into the context window, from where they are sent to the model API.
Context window scraping via long file reads
AI agents that read large files can leak proprietary code, internal documentation, and customer data into their context window — which may then be sent to external LLM APIs or logged in cloud telemetry.
Stop this threat before it reaches your agent
Install HOL Guard to get real-time protection against this attack and others like it.