Robuta

https://arxiv.org/abs/2412.16339 [2412.16339] Deliberative Alignment: Reasoning Enables Safer Language Models