aisi work - Robuta Search

https://www.aisi.gov.uk/blog/announcing-our-san-francisco-office
Announcing our San Francisco office | AISI Work
We are opening an office in San Francisco! This will enable us to hire more top talent, collaborate closely with the US AI Safety Institute and engage even...
san francisco office aisi work announcing

https://www.aisi.gov.uk/blog/early-insights-from-developing-question-answer-evaluations-for-frontier-ai
Early Insights from Developing Question-Answer Evaluations for Frontier AI | AISI Work
A common technique for quickly assessing AI capabilities is prompting models to answer hundreds of questions, then automatically scoring the answers. We share...
early insights question answer frontier ai aisi work developing

https://www.aisi.gov.uk/blog/how-to-evaluate-control-measures-for-ai-agents
How to evaluate control measures for AI agents? | AISI Work
Our new paper outlines how AI control methods can mitigate misalignment risks as capabilities of AI systems increase
control measures ai agents aisi work evaluate

https://www.aisi.gov.uk/blog/a-pipeline-for-transcript-analysis-using-inspect-scout
A pipeline for transcript analysis using Inspect Scout | AISI Work
We outline a step-by-step pipeline for using our open-source transcript analysis tool, Inspect Scout.
analysis using aisi work pipeline transcript inspect

https://www.aisi.gov.uk/blog/deepening-our-partnership-with-google-deepmind
Deepening our partnership with Google DeepMind | AISI Work
Expanding our collaboration with a new research MOU
google deepmind aisi work deepening partnership

https://www.aisi.gov.uk/blog/our-evaluation-of-claude-mythos-previews-cyber-capabilities
Our evaluation of Claude Mythos Preview’s cyber capabilities | AISI Work
We conducted cyber evaluations of Anthropic’s Claude Mythos Preview and found continued improvement in capture-the-flag (CTF) challenges and significant...
claude mythos cyber capabilities aisi work evaluation

https://www.aisi.gov.uk/blog/from-bugs-to-bypasses-adapting-vulnerability-disclosure-for-ai-safeguards
From bugs to bypasses: adapting vulnerability disclosure for AI safeguards | AISI Work
Exploring how far cyber security approaches can help mitigate risks in generative AI systems, in collaboration with the National Cyber Security Centre (NCSC).
vulnerability disclosure ai safeguards aisi work bugs bypasses

https://www.aisi.gov.uk/blog/transcript-analysis-for-ai-agent-evaluations
Transcript analysis for AI agent evaluations | AISI Work
Why we use transcript analysis for our agent evaluations, and results from an early case study.
ai agent aisi work transcript analysis evaluations

https://www.aisi.gov.uk/blog/evals-bounty
Bounty programme for novel evaluations and agent scaffolding | AISI Work
We are launching a bounty for novel evaluations and agent scaffolds to help assess dangerous capabilities in frontier AI systems.
aisi work bounty programme novel evaluations