https://arxiv.org/abs/2406.18820
[2406.18820] Universal Checkpointing: A Flexible and Efficient Distributed Checkpointing System for...
Abstract page for arXiv paper 2406.18820: Universal Checkpointing: A Flexible and Efficient Distributed Checkpointing System for Large-Scale DNN Training with...
universalcheckpointingflexibleefficientdistributed
https://keras.io/guides/orbax_checkpoint/
Orbax Checkpointing in Keras
Keras documentation: Orbax Checkpointing in Keras
checkpointingkeras
https://sparkco.ai/blog/mastering-langgraph-checkpointing-best-practices-for-2025
Mastering LangGraph Checkpointing: Best Practices for 2025
Apr 23, 2026 - Explore advanced LangGraph checkpointing techniques for durability, safety, and scalability in 2025. A must-read for developers.
best practicesmasteringlanggraphcheckpointing
https://developers.googleblog.com/boost-training-goodput-how-continuous-checkpointing-optimizes-reliability-in-orbax-and-maxtext/
Boost Training Goodput: How Continuous Checkpointing Optimizes Reliability in Orbax and MaxText -...
Optimize AI model training reliability and performance using continuous checkpointing in Orbax and MaxText. Maximize I/O bandwidth and minimize resource waste...
boosttrainingcontinuouscheckpointingreliability
https://www.deepspeed.ai/tutorials/universal-checkpointing/
Universal Checkpointing with DeepSpeed: A Practical Guide - DeepSpeed
DeepSpeed Universal Checkpointing feature is a powerful tool for saving and loading model checkpoints in a way that is both efficient and flexible, enabling...
practical guideuniversalcheckpointingdeepspeed
https://docs.ray.io/en/latest/rllib/checkpoints.html
Checkpointing — Ray 2.55.1
checkpointingray
https://geminicli.com/docs/cli/checkpointing/
Checkpointing | Gemini CLI
gemini clicheckpointing