https://galileo.ai/mastering-llm-as-a-judge
Mastering LLM as a Judge eBook: Improve AI Evaluations at Scale
Learn how to use LLM-as-a-Judge to accelerate AI evaluations, cut costs, and improve accuracy across complex AI workflows.
https://www.nist.gov/news-events/news/2026/03/announcement-caisi-signs-crada-openmined-enable-secure-ai-evaluations
Announcement: CAISI signs CRADA with OpenMined to Enable Secure AI Evaluations | NIST
Mar 27, 2026 - The Center for AI Standards and Innovation (CAISI) has signed a collaborative research and development agreement (CRADA) with
https://www.irregular.com/publications/irregular-cybersecurity-talk-at-ai-evaluations-session-of-euc-ai-office
Irregular cybersecurity talk at AI evaluations session of EU AI office - Irregular
Irregular's CEO Dan Lahav was an invited expert at the European Commission's AI Office, addressing the vital connection between AI and cybersecurity. We are...
https://galileo.ai/luna-2
Luna-2 | Galileo’s Small Language Models for AI Evaluations
Luna-2 is Galileo’s fast, low-cost family of small language models (SLMs) that power always-on agentic evaluations and guardrailing at enterprise scale.
https://blog.neurips.cc/2025/12/05/neurips-datasets-benchmarks-track-from-art-to-science-in-ai-evaluations/
NeurIPS Datasets & Benchmarks Track: From Art to Science in AI Evaluations – NeurIPS Blog