Robuta

https://programbench.com/task/mgechev__revive.201451e/ mgechev/revive — ProgramBench. ProgramBench evaluates whether language models can rebuild programs from scratch.
https://programbench.com/task/ogham__dog.721440b/ ogham/dog — ProgramBench
https://programbench.com/task/samtools__samtools.aa823b5/ samtools/samtools — ProgramBench
https://programbench.com/task/abishekvashok__cmatrix.5c082c6/ abishekvashok/cmatrix — ProgramBench
https://programbench.com/task/ammarabouzor__tui-journal.2b4540d/ AmmarAbouZor/tui-journal — ProgramBench
https://programbench.com/task/stathissideris__ditaa.f2286c4/ stathissideris/ditaa — ProgramBench
https://www.simplenews.ai/news/facebook-researchs-programbench-shows-0percent-of-ai-models-can-rebuild-complete-programs-from-scratch-o1a0 Facebook Research's ProgramBench Shows 0% of AI Models Can Rebuild Complete Programs From Scratch |... May 7, 2026 - Facebook Research's new ProgramBench benchmark reveals that 0% of evaluated AI models can fully reconstruct complete working programs from binaries and...
https://programbench.com/task/google__brotli.b3dc9cc/ google/brotli — ProgramBench
https://programbench.com/task/wintermute-cell__ngrrram.8ea13c3/ wintermute-cell/ngrrram — ProgramBench
https://programbench.com/task/drew-alleman__datasurgeon.d257cee/ Drew-Alleman/DataSurgeon — ProgramBench
https://programbench.com/task/crowdagger__crowbook.ea214d7/ crowdagger/crowbook — ProgramBench
https://programbench.com/task/unhappychoice__gittype.34b72d0/ unhappychoice/gittype — ProgramBench
https://programbench.com/task/antonmedv__walk.bf802ef/ antonmedv/walk — ProgramBench
https://programbench.com/task/madler__pigz.fe4894f/ madler/pigz — ProgramBench
https://programbench.com/task/hairyhenderson__gomplate.05eb3aa/ hairyhenderson/gomplate — ProgramBench
https://programbench.com/task/junegunn__fzf.b56d614/ junegunn/fzf — ProgramBench
https://programbench.com/task/svenstaro__genact.16f96e3/ svenstaro/genact — ProgramBench
https://programbench.com/task/shashwatah__jot.a92aad8/ shashwatah/jot — ProgramBench
https://programbench.com/task/danmar__cppcheck.0a5b103/ danmar/cppcheck — ProgramBench
https://programbench.com/task/ducaale__xh.4a6e44f/ ducaale/xh — ProgramBench
https://programbench.com/team/ Team — ProgramBench
https://programbench.com/task/stranger6667__jsonschema.d52e881/ Stranger6667/jsonschema — ProgramBench
https://programbench.com/task/jonas__tig.8334123/ jonas/tig — ProgramBench
https://programbench.com/task/chmln__handlr.90e78ba/ chmln/handlr — ProgramBench
https://programbench.com/task/chmln__sd.87d1ba5/ chmln/sd — ProgramBench
https://programbench.com/task/byron__dua-cli.8570c15/ Byron/dua-cli — ProgramBench
https://arxiv.org/abs/2605.03546v1 [2605.03546v1] ProgramBench: Can Language Models Rebuild Programs From Scratch? Abstract page for arXiv paper 2605.03546v1: ProgramBench: Can Language Models Rebuild Programs From Scratch?
https://programbench.com/task/lfos__calcurse.49180d5/ lfos/calcurse — ProgramBench
https://programbench.com/task/mibk__dupl.1bf052b/ mibk/dupl — ProgramBench
https://programbench.com/task/zk-org__zk.10d93d5/ zk-org/zk — ProgramBench
https://programbench.com/task/yoav-lavi__melody.f4af9b4/ yoav-lavi/melody — ProgramBench
https://programbench.com/task/o2sh__onefetch.e5958ce/ o2sh/onefetch — ProgramBench
https://programbench.com/task/stacked-git__stgit.430027d/ stacked-git/stgit — ProgramBench
https://programbench.com/task/segmentio__chamber.5f93f5f/ segmentio/chamber — ProgramBench
https://programbench.com/task/mkj__dropbear.75f699b/ mkj/dropbear — ProgramBench
https://programbench.com/task/robertdavidgraham__masscan.b99d433/ robertdavidgraham/masscan — ProgramBench
https://programbench.com/task/elkowar__pipr.fae0b17/ elkowar/pipr — ProgramBench
https://programbench.com/task/dalance__amber.69a0f52/ dalance/amber — ProgramBench
https://programbench.com/task/sirwart__ripsecrets.34c9e03/ sirwart/ripsecrets — ProgramBench
https://programbench.com/task/zevv__duc.a58fa4e/ zevv/duc — ProgramBench
https://programbench.com/ ProgramBench
https://programbench.com/task/rbakbashev__elfcat.52f8cc7/ rbakbashev/elfcat — ProgramBench
https://programbench.com/task/sigoden__argc.04a08f1/ sigoden/argc — ProgramBench
https://programbench.com/task/sharkdp__fd.40d8eb3/ sharkdp/fd — ProgramBench
https://programbench.com/task/peco__peco.4e58dad/ peco/peco — ProgramBench
https://programbench.com/task/noborus__trdsql.d8c5ff6/ noborus/trdsql — ProgramBench
https://programbench.com/task/brocode__fblog.3b54330/ brocode/fblog — ProgramBench
https://programbench.com/task/johnkerl__miller.8d85b46/ johnkerl/miller — ProgramBench
https://programbench.com/task/git-bahn__git-graph.87b4473/ git-bahn/git-graph — ProgramBench
https://programbench.com/task/jrnxf__thokr.09375ef/ jrnxf/thokr — ProgramBench
https://programbench.com/task/alexpovel__srgn.89f943b/ alexpovel/srgn — ProgramBench
https://programbench.com/task/lymphatus__caesium-clt.a529b2e/ Lymphatus/caesium-clt — ProgramBench
https://programbench.com/task/nachoparker__dutree.44e877d/ nachoparker/dutree — ProgramBench
https://programbench.com/extended/ Extended Results — ProgramBench
https://arxiv.org/abs/2605.03546 [2605.03546] ProgramBench: Can Language Models Rebuild Programs From Scratch? Abstract page for arXiv paper 2605.03546: ProgramBench: Can Language Models Rebuild Programs From Scratch?
https://programbench.com/task/yaa110__nomino.f892499/ yaa110/nomino — ProgramBench
https://programbench.com/task/sharkdp__bat.f822bd0/ sharkdp/bat — ProgramBench