Robuta

https://www.aisi.gov.uk/blog/announcing-our-san-francisco-office
Announcing our San Francisco office | AISI Work
We are opening an office in San Francisco! This will enable us to hire more top talent, collaborate closely with the US AI Safety Institute and engage even...

https://www.aisi.gov.uk/blog/early-insights-from-developing-question-answer-evaluations-for-frontier-ai
Early Insights from Developing Question-Answer Evaluations for Frontier AI | AISI Work
A common technique for quickly assessing AI capabilities is prompting models to answer hundreds of questions, then automatically scoring the answers. We share...

https://www.aisi.gov.uk/blog/how-to-evaluate-control-measures-for-ai-agents
How to evaluate control measures for AI agents? | AISI Work
Our new paper outlines how AI control methods can mitigate misalignment risks as capabilities of AI systems increase.

https://www.aisi.gov.uk/blog/a-pipeline-for-transcript-analysis-using-inspect-scout
A pipeline for transcript analysis using Inspect Scout | AISI Work
We outline a step-by-step pipeline for using our open-source transcript analysis tool, Inspect Scout.

https://www.aisi.gov.uk/blog/deepening-our-partnership-with-google-deepmind
Deepening our partnership with Google DeepMind | AISI Work
Expanding our collaboration with a new research MOU.

https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities
Our evaluation of Claude Mythos Preview’s cyber capabilities | AISI Work
We conducted cyber evaluations of Anthropic’s Claude Mythos Preview and found continued improvement in capture-the-flag (CTF) challenges and significant...

https://www.aisi.gov.uk/blog/from-bugs-to-bypasses-adapting-vulnerability-disclosure-for-ai-safeguards
From bugs to bypasses: adapting vulnerability disclosure for AI safeguards | AISI Work
Exploring how far cyber security approaches can help mitigate risks in generative AI systems, in collaboration with the National Cyber Security Centre (NCSC).

https://www.aisi.gov.uk/blog/transcript-analysis-for-ai-agent-evaluations
Transcript analysis for AI agent evaluations | AISI Work
Why we use transcript analysis for our agent evaluations, and results from an early case study.

https://www.aisi.gov.uk/blog/evals-bounty
Bounty programme for novel evaluations and agent scaffolding | AISI Work
We are launching a bounty for novel evaluations and agent scaffolds to help assess dangerous capabilities in frontier AI systems.