Robuta

https://proceedings.mlr.press/v229/jayanthi23a.html
DROID: Learning from Offline Heterogeneous Demonstrations via Reward-Policy Distillation
Dec 2, 2023 - Sravan Jayanthi, Letian Chen, Nadya Balabanska, Van Du...
Tags: learning from, reward policy, droid, offline, heterogeneous