https://docs.vllm.ai/en/latest/api/vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors_moe/compressed_tensors_moe_w8a8_mxfp8/
compressed_tensors_moe_w8a8_mxfp8 - vLLM
https://docs.vllm.ai/en/latest/api/vllm/model_executor/layers/quantization/online/mxfp8/
mxfp8 - vLLM
https://pytorch.org/blog/faster-diffusion-on-blackwell-mxfp8-and-nvfp4-with-diffusers-and-torchao/
Faster Diffusion on Blackwell: MXFP8 and NVFP4 with Diffusers and TorchAO – PyTorch
https://huggingface.co/Intel/FLUX.1-dev-MXFP8-AutoRound-Recipe
Intel/FLUX.1-dev-MXFP8-AutoRound-Recipe · Hugging Face
https://pytorch.org/blog/enabling-up-to-41-faster-pre-training-mxfp8-and-deepep-for-deepseek-v3-on-b200-with-torchtitan/
Enabling Up to 41% Faster Pre-training: MXFP8 and DeepEP for DeepSeek-V3 on B200 with TorchTitan –...
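All of the links above concern MXFP8, the OCP microscaling FP8 format: each block of 32 values shares one power-of-two (E8M0) scale, and the elements themselves are stored as FP8 E4M3. As a rough illustration of the idea (not the hardware-exact algorithm used by TorchAO, vLLM, or TorchTitan), here is a minimal NumPy sketch; the block size of 32 and the E4M3 max of 448 follow the OCP MX spec, while the mantissa rounding is a deliberate simplification:

```python
import numpy as np

def quantize_mxfp8_block(x, block_size=32):
    """Toy MXFP8-style quantization sketch.

    Each block of `block_size` values shares one power-of-two scale
    (E8M0-style), and elements are snapped to a 3-bit-mantissa grid
    approximating FP8 E4M3. Illustrative only, not bit-exact.
    """
    x = np.asarray(x, dtype=np.float32)
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)

    E4M3_MAX = 448.0  # largest finite E4M3 value
    amax = np.abs(blocks).max(axis=1, keepdims=True)
    # Shared scale: smallest power of two mapping amax into the E4M3 range.
    exp = np.ceil(np.log2(np.maximum(amax, 1e-38) / E4M3_MAX))
    scale = 2.0 ** exp

    scaled = blocks / scale
    # Crude E4M3-style rounding: keep ~3 mantissa bits per element.
    m, e = np.frexp(scaled)            # m in (-1, -0.5] U [0.5, 1) or 0
    q = np.ldexp(np.round(m * 16) / 16, e)
    q = np.clip(q, -E4M3_MAX, E4M3_MAX)
    return (q * scale).reshape(-1)[: len(x)], scale
```

Dequantization is just `element * shared_scale`, which is why MX formats keep matmul kernels simple: the per-block scale folds into the accumulation loop once per 32 elements.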