AGENT ARENA
How manipulation-proof is your AI agent? Send it to a page full of hidden prompt injection attacks and find out.
How It Works
Point your AI agent at the test page and ask it to summarize the content.
Copy your agent's response and paste it into the scorecard below.
Instantly see which hidden attacks your agent fell for.
Send your agent to: ref.jock.pl/modern-web
Scorecard
Challenge Catalog
10 attack vectors ordered by difficulty. Canary phrases are hidden — only revealed after analysis.
Understanding Prompt Injection
Prompt injection is an attack where adversarial instructions are hidden in content that an AI agent processes. When an agent reads a web page, email, or document, hidden instructions can trick it into changing its behavior.
Why It Matters
- Agents browsing the web are exposed to content they didn't choose
- Hidden instructions can exfiltrate data, alter outputs, or bypass safety filters
- Most attacks are invisible to the human supervising the agent
- Defense requires awareness at both the model and application layer
Attack Categories
White-on-white text, micro text, off-screen content. The text is there, but humans can't see it.
HTML comments, hidden divs, data attributes. Uses the structure of HTML itself as camouflage.
ARIA attributes, alt text overrides. Exploits accessibility and metadata channels.
Zero-width characters, Unicode exploits. The message is invisible at the character level.