Robuta

https://www.modular.com/blog/the-five-eras-of-kvcache
Modular: The Five Eras of KVCache
Feb 18, 2026 - vLLM, SGLang, TensorRT-LLM, and MAX Serve are all built on top of increasingly sophisticated KV cache management. This blog explores the evolution and role of...

https://www.blocksandfiles.com/data-management/2022/04/10/kvcache/1594619
KVCache
Oct 9, 2025 - KVCache – Key-Value Cache is a mechanism used to store past Gen AI large language model (LLM) layers’ activations (keys and values) during the inference phase....

https://www.alphaxiv.org/abs/2604.15039
Prefill-as-a-Service: KVCache of Next-Generation Models Could Go Cross-Datacenter | alphaXiv
Abstract: Prefill-decode (PD) disaggregation has become the standard architecture for large-scale LLM serving, but in practice its...

https://www.usenix.org/conference/fast25/presentation/qin
Mooncake: Trading More Storage for Less Computation — A KVCache-centric Architecture for Serving...