https://www.modular.com/blog/the-five-eras-of-kvcache
Modular: The Five Eras of KVCache
Feb 18, 2026 - vLLM, SGLang, TensorRT-LLM, and MAX Serve are all built on top of increasingly sophisticated KV cache management. This blog explores the evolution and role of...
https://www.blocksandfiles.com/data-management/2022/04/10/kvcache/1594619
KVCache
Oct 9, 2025 - KVCache – Key-Value Cache is a mechanism used to store past Gen AI large language model (LLM) layers’ activations (keys and values) during the inference phase....
https://www.alphaxiv.org/abs/2604.15039
Prefill-as-a-Service: KVCache of Next-Generation Models Could Go Cross-Datacenter | alphaXiv
View recent discussion. Abstract: Prefill-decode (PD) disaggregation has become the standard architecture for large-scale LLM serving, but in practice its...
https://www.usenix.org/conference/fast25/presentation/qin
Mooncake: Trading More Storage for Less Computation — A KVCache-centric Architecture for Serving...