Robuta

https://openreview.net/forum?id=mhLh7rXt7g&referrer=%5Bthe%20profile%20of%20Zhe%20Jin%5D(%2Fprofile%3Fid%3D~Zhe_Jin4)
Pipeline parallelism is a cornerstone of large-scale model training, yet its efficiency is fundamentally limited by straggler-induced pipeline bubbles. This...
pipeline parallelism, conductor, multigranularity, control