Robuta

https://programbench.com/task/o2sh__onefetch.e5958ce/
o2sh/onefetch - ProgramBench
ProgramBench evaluates whether language models can rebuild programs from scratch.
onefetch, programbench