Prompt Injection and AI Agent Attacks Emerge as Major Security Threat

According to MIT Technology Review, artificial intelligence agents have become a new attack vector for hackers, with prompt injection techniques targeting both human-supervised and autonomous AI workflows.

The article references two specific incidents as examples of this emerging threat pattern. The first involves a “Gemini Calendar prompt-injection attack of 2026,” though this appears to be a future-dated reference that requires clarification. The second incident cited occurred in September 2025, described as “a state-sponsored hack using Anthropic’s Claude code as an automated intrusion engine.”

MIT Technology Review characterizes these attacks as targeting “the coercion of human-in-the-loop agentic actions and fully autonomous agentic workflows,” suggesting that both supervised and unsupervised AI systems present security vulnerabilities.

The article’s headline, “Rules fail at the prompt, succeed at the boundary,” implies that security measures may be more effective when applied at system boundaries rather than at the prompt level, though the excerpt provided does not elaborate on this distinction.

Note: The source material appears incomplete, ending mid-sentence with “In the A.” Some details, particularly the 2026 date reference, may require verification.