https://baxbench.com/
BaxBench: Can LLMs Generate Secure and Correct Backends?
We introduce a novel benchmark to evaluate LLMs on secure and correct code generation, showing that even flagship LLMs are not ready for coding automation,...