
https://github.com/brianwang00001/sparse-vggt
Code for Faster VGGT with Block-Sparse Global Attention - brianwang00001/sparse-vggt
github, sparse, code, faster, block
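
The repo above speeds up VGGT with block-sparse global attention. As a rough illustration of the general pattern only (this is not the repo's actual kernel; the function name, block size, and block-selection rule below are invented for the example), each query block attends to a chosen subset of key blocks instead of the full sequence:

```python
import numpy as np

def block_sparse_attention(q, k, v, block_size, keep_mask):
    """Attention computed only over selected key blocks.

    q, k, v: (seq_len, d) arrays; seq_len must be a multiple of block_size.
    keep_mask: (n_blocks, n_blocks) boolean; keep_mask[i, j] = True means
    query block i may attend to key block j.
    """
    n, d = q.shape
    nb = n // block_size
    out = np.zeros_like(v)
    for i in range(nb):
        qi = q[i * block_size:(i + 1) * block_size]           # (B, d)
        cols = np.flatnonzero(keep_mask[i])                   # kept key blocks
        ki = np.concatenate([k[j * block_size:(j + 1) * block_size] for j in cols])
        vi = np.concatenate([v[j * block_size:(j + 1) * block_size] for j in cols])
        scores = qi @ ki.T / np.sqrt(d)                       # only kept pairs scored
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)        # softmax over kept keys
        out[i * block_size:(i + 1) * block_size] = weights @ vi
    return out

# Toy usage: 8 blocks of 16 tokens; each query block attends to itself,
# its left neighbour, and block 0 acting as a "global" block.
rng = np.random.default_rng(0)
n, d, B = 128, 32, 16
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
nb = n // B
keep = np.eye(nb, dtype=bool) | np.eye(nb, k=-1, dtype=bool)
keep[:, 0] = True
print(block_sparse_attention(q, k, v, B, keep).shape)  # (128, 32)
```

Only the kept query-key block pairs are ever scored, which is where the speedup over dense attention comes from.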
https://www.digitimes.com/newsshow/emailnews.asp?datePublish=2025/02/20&pages=PD&seq=217
DeepSeek and Moonshot AI advance sparse attention tech research
moonshot ai, sparse attention, tech research, deepseek, advance
https://openreview.net/forum?id=Hjk1tWIdvL
Pre-filling Large Language Models (LLMs) with long-context inputs is computationally expensive due to the quadratic complexity of full attention. While global...
sparse attention, hierarchy-aided, fast, llms
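
The snippet above points at why sparse prefill helps: full attention scores every query-key pair, so cost grows quadratically with context length, while a fixed per-query block budget grows only linearly. A minimal sketch of that scaling argument; the block size and budget below are illustrative, not taken from the paper:

```python
def full_entries(n: int) -> int:
    """Full attention: every query scores every key -> n**2 entries."""
    return n * n

def fixed_budget_entries(n: int, block: int = 64, blocks_kept: int = 16) -> int:
    """Sparse prefill with a fixed per-query block budget: each query
    scores at most blocks_kept * block keys -> O(n) entries."""
    return n * min(n, blocks_kept * block)

for n in (1_024, 16_384, 262_144):
    print(f"{n:>7} tokens: full={full_entries(n):.2e}  sparse={fixed_budget_entries(n):.2e}")
```

At 262k tokens the full score matrix is ~6.9e10 entries versus ~2.7e8 with the fixed budget, a gap that widens linearly as the context grows.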
https://huggingface.co/papers/2502.14866
LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
paper, efficient, long sequence, llm
https://openreview.net/forum?id=zJSZupQ889
Large Language Models (LLMs) capable of handling extended contexts are in high demand, yet their inference remains challenging due to substantial Key-Value...
sparse attention, latent space, sals, kv cache
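
The KV-cache pressure this abstract refers to is easy to estimate: the cache stores one K and one V tensor per layer, each of shape (batch, seq_len, n_kv_heads, head_dim). A back-of-the-envelope sketch; the model shape is Llama-2-7B-like and the 128k-token context is purely illustrative:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, batch: int = 1, bytes_per_elem: int = 2) -> int:
    """KV cache size: 2 tensors (K and V) per layer, each of shape
    (batch, seq_len, n_kv_heads, head_dim), stored e.g. in fp16 (2 bytes)."""
    return 2 * n_layers * batch * seq_len * n_kv_heads * head_dim * bytes_per_elem

# 32 layers, 32 KV heads, head_dim 128, single 128k-token sequence:
gib = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=128_000) / 2**30
print(f"{gib:.1f} GiB")  # 62.5 GiB for one sequence
```

At these numbers a single long sequence already exceeds most accelerators' memory, which is why KV-cache compression and sparse-attention methods like the ones linked here target it.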