Robuta

https://www.anthropic.com/research/constitutional-classifiers
Constitutional Classifiers: Defending against universal jailbreaks \ Anthropic
A paper from Anthropic describing a new way to guard LLMs against jailbreaking.