https://docs.vllm.ai/en/latest/api/vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors_moe/compressed_tensors_moe_w8a8_mxfp8/
compressed_tensors_moe_w8a8_mxfp8 - vLLM
https://docs.vllm.ai/en/latest/api/vllm/model_executor/layers/quantization/online/mxfp8/
mxfp8 - vLLM
https://pytorch.org/blog/faster-diffusion-on-blackwell-mxfp8-and-nvfp4-with-diffusers-and-torchao/
Faster Diffusion on Blackwell: MXFP8 and NVFP4 with Diffusers and TorchAO – PyTorch
https://huggingface.co/Intel/FLUX.1-dev-MXFP8-AutoRound-Recipe
Intel/FLUX.1-dev-MXFP8-AutoRound-Recipe · Hugging Face
https://pytorch.org/blog/enabling-up-to-41-faster-pre-training-mxfp8-and-deepep-for-deepseek-v3-on-b200-with-torchtitan/
Enabling Up to 41% Faster Pre-training: MXFP8 and DeepEP for DeepSeek-V3 on B200 with TorchTitan –...
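All of the links above concern MXFP8, the OCP microscaling FP8 format: each block of 32 values shares one power-of-two (E8M0) scale, and the elements themselves are stored as FP8 E4M3. As a rough illustration of the idea (not the hardware-exact algorithm used by TorchAO, vLLM, or TorchTitan), here is a minimal NumPy sketch; the block size of 32 and the E4M3 max of 448 follow the OCP MX spec, while the mantissa rounding is a deliberate simplification:

```python
import numpy as np

def quantize_mxfp8_block(x, block_size=32):
    """Toy MXFP8-style quantization sketch.

    Each block of `block_size` values shares one power-of-two scale
    (E8M0-style), and elements are snapped to a 3-bit-mantissa grid
    approximating FP8 E4M3. Illustrative only, not bit-exact.
    """
    x = np.asarray(x, dtype=np.float32)
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)

    E4M3_MAX = 448.0  # largest finite E4M3 value
    amax = np.abs(blocks).max(axis=1, keepdims=True)
    # Shared scale: smallest power of two mapping amax into the E4M3 range.
    exp = np.ceil(np.log2(np.maximum(amax, 1e-38) / E4M3_MAX))
    scale = 2.0 ** exp

    scaled = blocks / scale
    # Crude E4M3-style rounding: keep ~3 mantissa bits per element.
    m, e = np.frexp(scaled)            # m in (-1, -0.5] U [0.5, 1) or 0
    q = np.ldexp(np.round(m * 16) / 16, e)
    q = np.clip(q, -E4M3_MAX, E4M3_MAX)
    return (q * scale).reshape(-1)[: len(x)], scale
```

Dequantization is just `element * shared_scale`, which is why MX formats keep matmul kernels simple: the per-block scale folds into the accumulation loop once per 32 elements.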