Robuta

https://www.snowflake.com/en/engineering-blog/arctic-long-sequence-training-multi-million-token-ai/
Arctic Long Sequence Training (ALST): Scalable Training for Multi-Million Token AI Models
Snowflake's ALST enables scalable training of long-context models with up to 15 million tokens using Hugging Face and DeepSpeed, all without custom modeling...
long sequence, multi million

https://www.deisenroth.cc/publication/cunningham-2024/
Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling | Marc Deisenroth
multi resolution convolutions