Context window scraping via long file reads
AI agents that read large files can pull proprietary code, internal documentation, and customer data into their context window, which may then be sent to external LLM APIs or logged in cloud telemetry.
What happens
An agent reads files beyond its task scope, such as proprietary code, internal docs, customer data, or credentials, and that data enters the LLM context window. If the model API logs prompts or the agent sends data to external tools, the data may be exfiltrated.
How the attack unfolds
What it looks like in practice
A developer asks Claude Code to fix a bug in src/api/handlers.ts. Claude reads that file, but also reads src/api/handlers.test.ts, src/api/middleware.ts, and src/config/database.ts, which contains the production database connection string. All of this enters the context window and is sent to the model API.
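The connection string in this scenario is the kind of value a pre-send scan could catch. The following is a minimal sketch, not Guard's implementation: the function name redactSecrets, the pattern names, and the two regular expressions are all hypothetical and chosen only for illustration. It assumes file contents pass through a hook before they are appended to the context window.

```typescript
// Hypothetical sketch: scan file contents for credential-like strings
// before they are added to an agent's context window.
// The pattern list is illustrative and far from exhaustive.

interface ScanResult {
  redacted: string;   // text with matches replaced by placeholders
  findings: string[]; // name of each pattern that matched, one entry per match
}

const PATTERNS: { name: string; regex: RegExp }[] = [
  {
    // URI of the form scheme://user:password@host
    name: "connection-string-with-credentials",
    regex: /\b[a-z][a-z0-9+.-]*:\/\/[^\s:@\/]+:[^\s@\/]+@[^\s\/"']+/gi,
  },
  {
    // First line of a PEM-encoded private key
    name: "private-key-header",
    regex: /-----BEGIN [A-Z ]*PRIVATE KEY-----/g,
  },
];

function redactSecrets(text: string): ScanResult {
  const findings: string[] = [];
  let redacted = text;
  for (const { name, regex } of PATTERNS) {
    // The replacer callback runs once per match, so findings counts matches.
    redacted = redacted.replace(regex, () => {
      findings.push(name);
      return `[REDACTED:${name}]`;
    });
  }
  return { redacted, findings };
}

// Example: a line resembling what src/config/database.ts might contain.
const sample = 'const url = "postgres://admin:s3cret@db.internal:5432/prod";';
const result = redactSecrets(sample);
console.log(result.findings.length > 0 ? "flagged" : "clean", result.redacted);
```

Pattern matching of this kind only catches secrets with a recognizable shape. It does nothing for proprietary source code or customer records, which is why restricting what the agent can read in the first place matters more.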
How Guard catches this
How to stop it
Limit the files and directories agents can read. Use Guard to flag reads of large files or directories outside the project scope. Review what data is sent to external LLM APIs.
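One way to picture a read limit is a small policy check that runs before each file read. The sketch below is an assumption-laden illustration, not Guard's API or configuration format: ReadPolicy, checkRead, and every field name are hypothetical. It assumes the agent passes project-relative paths and that the caller already knows the file size. It does not resolve symlinks, which a real check would need to do.

```typescript
// Hypothetical sketch of a read-scope policy for an agent's file reads.
// Paths are treated as project-relative and compared segment by segment.

interface ReadPolicy {
  allowedDirs: string[];  // project-relative directories the task may read
  deniedNames: RegExp[];  // file names that are never readable
  maxBytes: number;       // size above which a read is refused
}

type Decision = { allowed: true } | { allowed: false; reason: string };

// Split a relative path into segments, resolving "." and "..".
// Returns null for absolute paths or paths that climb above the project root.
function toSegments(p: string): string[] | null {
  const unified = p.replace(/\\/g, "/");
  if (unified.startsWith("/") || /^[A-Za-z]:/.test(unified)) return null;
  const out: string[] = [];
  for (const seg of unified.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      if (out.length === 0) return null;
      out.pop();
    } else {
      out.push(seg);
    }
  }
  return out;
}

function checkRead(policy: ReadPolicy, requestedPath: string, sizeBytes: number): Decision {
  const target = toSegments(requestedPath);
  if (target === null || target.length === 0) {
    return { allowed: false, reason: "outside project root" };
  }
  const fileName = target[target.length - 1];
  if (policy.deniedNames.some((re) => re.test(fileName))) {
    return { allowed: false, reason: "denied file name" };
  }
  // The file must sit strictly inside one of the allowed directories.
  const inScope = policy.allowedDirs.some((dir) => {
    const d = toSegments(dir);
    return d !== null && d.length < target.length && d.every((seg, i) => seg === target[i]);
  });
  if (!inScope) {
    return { allowed: false, reason: "outside allowed directories" };
  }
  if (sizeBytes > policy.maxBytes) {
    return { allowed: false, reason: "file too large" };
  }
  return { allowed: true };
}

// Example policy for a task scoped to src/api.
const policy: ReadPolicy = {
  allowedDirs: ["src/api"],
  deniedNames: [/^\.env(\..*)?$/, /\.pem$/],
  maxBytes: 200_000,
};
console.log(checkRead(policy, "src/api/handlers.ts", 4_000));
console.log(checkRead(policy, "src/config/database.ts", 1_000));
```

Under this example policy, a task scoped to src/api can read handlers.ts and its neighbors, while the read of src/config/database.ts is refused because it falls outside the allowed directories. The denied-name and size checks are independent of the directory scope, so an .env file or an oversized file inside src/api is still refused.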
Common questions
More threats to know about
Environment file exfiltration via webhook
AI agents can be tricked into reading .env files and sending their contents to external endpoints through tool calls, webhook integrations, or HTTP requests that appear legitimate.
Shadow MCP server discovery and persistent access
MCP servers added to a project during development can persist in configuration files and maintain access to the agent’s context window long after they are forgotten. These "shadow" servers continue receiving tool calls and may be modified by attackers who compromise the original server.
Stop this threat before it reaches your agent
Install HOL Guard to get real-time protection against this attack and others like it.