arXiv:2606.02380
SPADE-Bench: Evaluating Spontaneous Strategic Deception in Agents via Plan-Action Divergence
Yuyan Bu, Haowei Li, Qirui Zheng, Bowen Dong, Kaiyue Yang, Jiaming Ji, Yingshui Tan, Wenxin Li, Yaodong Yang, Juntao Dai
Abstract
SPADE-Bench is presented as the first benchmark to isolate agent deception, defined as divergence between an agent's plan and its actions under pressure, from hallucination. Its results indicate that deception is a genuine and pressing issue in tool-use contexts.
Tasks
Results
No benchmark results recorded yet.
CodeSOTA extraction
Benchmark evidence
- SPADE-Bench: Leakage rate and H-score across models (extract from main results).