Robuta

https://www.modular.com/blog/software-pipelining-for-gpu-kernels-part-1-the-pipeline-problem
Modular: Software Pipelining for GPU Kernels: Part 1 - The Pipeline Problem
Mar 31, 2026 - Flash Attention is a simple algorithm: tiled back-to-back matmuls with an online softmax algorithm in between. The algorithm fits in a few dozen lines of...

https://www.rightnowai.co/editor
RightNow AI - The Best All-in-One AI Code Editor for GPU Kernels | 2025 | RightNow AI
Jan 15, 2025 - RightNow AI is the best and only all-in-one AI-powered code editor for NVIDIA GPU kernel developers. Features custom agents, skills and MCP support, clear...

https://docs.jax.dev/en/latest/notebooks/cute_dsl_jax.html
Writing High-Performance GPU Kernels with CuTe DSL and JAX — JAX documentation

https://www.modular.com/blog/tiletensor-part-1-safer-more-efficient-gpu-kernels
Modular: TileTensor Part 1 - Safer, More Efficient GPU Kernels
Apr 17, 2026 - Suppose you want to load a 2D tile of a matrix, where the tile is stored in shared memory in a specific interleaved layout to avoid bank conflicts. This...