Robuta

https://bentoml.com/llm/
A practical handbook for engineers building, optimizing, scaling and operating LLM inference systems in production.
LLM inference handbook