Robuta

https://blog.llamafactory.net/en/posts/ktransformers_dpo/
RL-DPO Training with KTransformers and LLaMA-Factory | LlamaFactory Blog
Dec 23, 2025 - This tutorial demonstrates how to fine-tune a language model using the LLaMA-Factory framework with Direct Preference Optimization (DPO). DPO is a training...
dpo training, llama factory, rl, ktransformers, blog

https://ktransformers.net/en/blog
KTransformers - Flexible LLM Inference Framework
A flexible Python-centric framework for LLM inference optimization. Run DeepSeek-R1-671B on a single RTX 4090 with optimized performance.
llm inference, ktransformers, flexible, framework