Robuta

https://supereval.com/blog/
May 21, 2024 - The SuperEval blog contains best practices and resources relevant to conducting superintendent evaluations.
evaluation resources, superintendent, blog
https://amecorg.com/resources/
Jun 11, 2025
resources, international, association, measurement, evaluation
https://hr.mcmaster.ca/managers/job-design-job-evaluation/
Jun 10, 2023
human resources, job, design, evaluation
https://evaluations.metr.org/gpt-5-report/
Aug 6, 2025 - We evaluate whether GPT-5 poses significant catastrophic risks via AI self-improvement, rogue replication, or sabotage of AI labs. We conclude that this seems...
details, evaluation, openai, gpt, autonomy
https://evaluations.metr.org/gpt-4o-report/
Aug 7, 2024 - We measured the performance of GPT-4o given a simple agent scaffolding on 77 tasks across 30 task families testing autonomous capabilities.
details, preliminary, evaluation, gpt, autonomy