Robuta

https://docs.pytorch.org/docs/stable/fsdp.html FullyShardedDataParallel — PyTorch 2.11 documentation