https://arxiv.org/abs/2212.03597
[2212.03597] DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling...
Abstract page for arXiv paper 2212.03597: DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling...
https://arxiv.org/abs/2201.05596
[2201.05596] DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Abstract page for arXiv paper 2201.05596: DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
https://deepspeed.readthedocs.io/en/latest/
DeepSpeed — DeepSpeed 0.18.10 documentation
https://arxiv.org/abs/2207.00032
[2207.00032] DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
Abstract page for arXiv paper 2207.00032: DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
https://huggingface.co/docs/transformers/deepspeed
DeepSpeed ZeRO · Hugging Face
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
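The Hugging Face ZeRO guide above drives DeepSpeed through a JSON config file. As a minimal sketch (key names follow the DeepSpeed configuration reference; the specific values here are illustrative assumptions, not recommendations), a ZeRO stage-2 setup with optimizer-state offload to CPU might look like:

```json
{
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "fp16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "offload_optimizer": { "device": "cpu" },
    "overlap_comm": true,
    "contiguous_gradients": true
  }
}
```

With the Hugging Face Trainer integration, a file like this is typically passed via `--deepspeed ds_config.json`, and `"auto"` values are filled in from the Trainer's own arguments so the two configs cannot drift apart.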
https://parlance-labs.com/education/fine_tuning/zach.html
FSDP, DeepSpeed and Accelerate – Parlance
Advanced techniques and practical considerations for fine-tuning large language models, comparing tools, discussing model precision and optimization, and...
https://arxiv.org/abs/2308.01320
[2308.01320] DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
Abstract page for arXiv paper 2308.01320: DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales
https://www.deepspeed.ai/tutorials/pytorch-profiler/
Using PyTorch Profiler with DeepSpeed for performance debugging - DeepSpeed
This tutorial describes how to use PyTorch Profiler with DeepSpeed.
https://www.microsoft.com/en-us/research/blog/using-deepspeed-and-megatron-to-train-megatron-turing-nlg-530b-the-worlds-largest-and-most-powerful-generative-language-model/
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model
We are excited to introduce the DeepSpeed- and Megatron-powered Megatron-Turing Natural Language Generation model (MT-NLG), the largest and the most powerful...
https://www.tutorialspoint.com/deepspeed/index.htm
DeepSpeed Tutorial
DeepSpeed is a powerful deep learning optimization library that makes it possible to overcome many challenges while training large-scale models. It allows us...
https://pytorch.org/projects/deepspeed/
DeepSpeed – PyTorch
https://huggingface.co/docs/accelerate/usage_guides/deepspeed
DeepSpeed · Hugging Face
https://www.deepspeed.ai/
Latest News - DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
https://www.microsoft.com/en-us/research/blog/deepspeed-zero-a-leap-in-speed-for-llm-and-chat-model-training-with-4x-less-communication/
DeepSpeed ZeRO++: A leap in speed for LLM and chat model training with 4X less communication
Jun 22, 2023 - A new system of communication optimization strategies built on top of ZeRO offers unmatched efficiency for large model training, regardless of batch size...
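The ZeRO++ post above layers communication reductions (quantized weights and gradients, hierarchical partitioning) on top of ZeRO stage 3. As a hedged sketch, assuming the key names from the DeepSpeed ZeRO++ documentation, these features are toggled inside the same `zero_optimization` block as ordinary ZeRO settings:

```json
{
  "zero_optimization": {
    "stage": 3,
    "zero_quantized_weights": true,
    "zero_quantized_gradients": true,
    "zero_hpz_partition_size": 8
  }
}
```

Here `zero_hpz_partition_size` controls the hierarchical secondary partition; the value 8 is illustrative only, and is commonly set to the number of GPUs per node so that secondary weight shards stay within a node's fast interconnect.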