Robuta

https://programbench.com/task/robertdavidgraham__masscan.b99d433/
robertdavidgraham/masscan - ProgramBench
ProgramBench evaluates whether language models can rebuild programs from scratch.
masscan, programbench