https://sebastianraschka.com/llm-architecture-gallery/gqa/
Grouped-Query Attention (GQA) | Sebastian Raschka, PhD
Mar 21, 2026 - A gallery-local explainer for Grouped-Query Attention, based on the architecture comparison articles and LLMs-from-scratch.
grouped query attention, sebastian raschka, gqa, phd
https://cookllm.com/docs/fundamentals/systems/flash-attention/05-grouped-query-attention
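Grouped Query Attention | CookLLM | Full-Stack LLM Engineering: Architecture, Training, and Applications
Implements GQA/MQA support so that multiple query heads share the same KV heads, reducing KV cache memory usage.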
grouped query attention, cookllm
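
The KV-sharing idea described in the entry above can be illustrated with a minimal sketch. It is not taken from either linked page: the function names, the contiguous head grouping, and the model configuration (32 layers, 32 query heads, head dimension 128, 4096 tokens, 2 bytes per element) are all assumptions chosen for illustration.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Map a query head index to the KV head it shares.

    Assumes contiguous grouping: consecutive query heads share one KV head.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be divisible by n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int = 1,
                   bytes_per_elem: int = 2) -> int:
    """Bytes needed to cache keys and values across all layers.

    The leading factor of 2 accounts for storing both K and V.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


# Hypothetical configuration, for illustration only.
N_LAYERS, N_Q_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 128, 4096

# With 8 query heads and 2 KV heads, each KV head serves 4 query heads.
mapping = [kv_head_for_query_head(q, n_q_heads=8, n_kv_heads=2) for q in range(8)]

mha_bytes = kv_cache_bytes(N_LAYERS, n_kv_heads=N_Q_HEADS, head_dim=HEAD_DIM, seq_len=SEQ_LEN)
gqa_bytes = kv_cache_bytes(N_LAYERS, n_kv_heads=8, head_dim=HEAD_DIM, seq_len=SEQ_LEN)
mqa_bytes = kv_cache_bytes(N_LAYERS, n_kv_heads=1, head_dim=HEAD_DIM, seq_len=SEQ_LEN)

if __name__ == "__main__":
    print("query head -> KV head:", mapping)
    for name, size in [("MHA", mha_bytes), ("GQA", gqa_bytes), ("MQA", mqa_bytes)]:
        print(f"{name}: {size / 2**20:.0f} MiB")
```

Under these assumptions the cache size scales linearly with the number of KV heads, so going from 32 KV heads (standard multi-head attention) to 8 (GQA) cuts the cache by a factor of 4, and MQA (a single KV head) is the limiting case.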