AgentRedBench: Dynamic Redteaming and Integration-Aware Defense for LLM Agents over SaaS Integrations

arXiv:2606.02240 · Submitted Jun 3, 2026 · 0 benchmark results

Authors pending

Abstract

AgentRedBench comprises 215 indirect prompt injection scenarios across 24 enterprise integrations. The AgentRedGuard guard model cuts the attack success rate (ASR) from 69.9% to 2.4%.

Tasks

Results

No benchmark results recorded yet.


CodeSOTA extraction

Extract AgentRedBench ASR per model and per attack type to confirm the 2.4% ASR with AgentRedGuard at 0.37% FPR.
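The extraction task above comes down to two ratios, which a minimal Python sketch can illustrate. It assumes ASR is the fraction of attack scenarios in which the injection succeeded, and FPR is the fraction of benign cases the guard flagged; the paper's exact definitions may differ. The `Trial` record, its field names, and the helper functions are hypothetical and not taken from AgentRedBench.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One evaluated case. All field names are illustrative assumptions."""
    model: str
    attack_type: str        # e.g. an injection category; ignored for benign cases
    is_attack: bool         # False marks a benign (non-attack) case
    attack_succeeded: bool  # only meaningful when is_attack is True
    flagged: bool           # True if the guard flagged or blocked the case


def asr_by(trials, key):
    """Attack success rate grouped by a Trial field such as 'model' or 'attack_type'."""
    counts = defaultdict(lambda: [0, 0])  # group -> [successes, attack cases]
    for t in trials:
        if not t.is_attack:
            continue  # benign cases do not count toward ASR
        group = getattr(t, key)
        counts[group][0] += int(t.attack_succeeded)
        counts[group][1] += 1
    return {group: s / n for group, (s, n) in counts.items()}


def false_positive_rate(trials):
    """Fraction of benign cases the guard flagged, or None if there are none."""
    benign = [t for t in trials if not t.is_attack]
    if not benign:
        return None
    return sum(t.flagged for t in benign) / len(benign)


if __name__ == "__main__":
    # Toy records for illustration only; these are not results from the paper.
    toy = [
        Trial("model-a", "type-1", True, True, False),
        Trial("model-a", "type-1", True, False, True),
        Trial("model-b", "type-2", True, False, True),
        Trial("model-a", "benign", False, False, True),
        Trial("model-a", "benign", False, False, False),
    ]
    print(asr_by(toy, "model"))
    print(asr_by(toy, "attack_type"))
    print(false_positive_rate(toy))
```

Grouping by a field name keeps the per-model and per-attack-type breakdowns in one function, so the figures to be confirmed (2.4% ASR, 0.37% FPR) would be checked against the same counting logic.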
