https://proceedings.mlr.press/v229/jayanthi23a.html
DROID: Learning from Offline Heterogeneous Demonstrations via Reward-Policy Distillation
Dec 2, 2023 - Sravan Jayanthi, Letian Chen, Nadya Balabanska, Van Du...
learning from, reward policy, droid, offline, heterogeneous