https://openreview.net/forum?id=b0qRSUcQP7&referrer=%5Bthe%20profile%20of%20Xianpei%20Han%5D(%2Fprofile%3Fid%3D~Xianpei_Han1)
Multimodal Reward Models (MM-RMs) are crucial for aligning Large Language Models (LLMs) with human preferences, particularly as LLMs increasingly interact with...
the devil, spurious correlations, details, tackling, unimodal
https://www.arxiv.org/abs/2601.06424
Abstract page for arXiv paper 2601.06424: Can a Unimodal Language Agent Provide Preferences to Tune a Multimodal Vision-Language Model?
unimodal, language, agent, provide, preferences