Robuta

https://huggingface.co/blog/NormalUhr/grpo-to-dapo-and-gspo
A Blog post by Yihua Zhang on Hugging Face
grpo
https://huggingface.co/spaces/Writer/GRPO-Any-Model
Train a language model using the GRPO technique with custom prompts and generate text responses. Users provide prompts and training parameters, and get...
hugging face, grpo, model, space, writer
https://ghost.oxen.ai/how-deepseek-r1-grpo-and-previous-deepseek-models-work/
In January 2025, DeepSeek took a shot directly at OpenAI by releasing a suite of models that “Rival OpenAI’s o1.” From their website: In the spirit of...
deepseek, grpo, previous, models, work
https://ghost.oxen.ai/why-grpo-is-important-and-how-it-works/
Last week on Arxiv Dives we dug into research behind DeepSeek-R1, and uncovered that one of the techniques they use in their training pipeline is called...
grpo, important, works
https://huggingface.co/blog/liger-grpo
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
liger, grpo, meets, trl