https://arxiv.org/abs/2506.22777
[2506.22777] Teaching Models to Verbalize Reward Hacking in Chain-of-Thought Reasoning