Robuta

https://mlcommons.org/2025/01/mlc-medical-collab-eval-medical-imaging/
  MLCommons Medical Working Group Co-authors Book Chapter on Collaborative Evaluation for Medical...
  Jan 28, 2025 - MLCommons Medical working group co-authored a chapter for a new book on AI in medical imaging, “Collaborative evaluation for performance assessment of medical...
https://mlcommons.org/research/
  Machine Learning Research | MLCommons Research
  Mar 18, 2025 - Democratizing machine learning through an open community approach. MLCommons is a community driven research effort.
https://partnershiponai.org/partnership-on-ai-welcomes-dla-piper-ellis-alicante-mlcommons-open-library-foundation-and-windfall-trust-as-partners/
  Partnership on AI Welcomes DLA Piper, ELLIS Alicante, MLCommons, Open Library Foundation, and...
https://mlcommons.org/working-groups/benchmarks/mobile/
  Mobile - MLCommons
  May 21, 2025 - The Mobile working group creates a set of fair and representative inference benchmarks for mobile consumer devices such as smartphones, tablets, and notebooks...
https://mlcommons.org/2026/04/mlperf-client-v1-6/
  MLCommons Releases MLPerf Client v1.6 with Performance Optimizations and Enhanced User Experience -...
  MLCommons releases MLPerf Client v1.6 with updated Windows ML and llama.cpp support, Apple MLX improvements for Mac and iPad, and usability enhancements for...
https://mlcommons.org/working-groups/data/datasets/
  Datasets - MLCommons
  Aug 15, 2025 - The Datasets working group creates new datasets to fuel innovation in machine learning.
https://mlcommons.org/benchmarks/algorithms/
  MLCommons AlgoPerf: Training Algorithms Benchmark Results
  Nov 20, 2024 - The AlgoPerf: Training Algorithms benchmark measures how much faster we can train neural network models to a given target performance by changing the...
https://mlcommons.org/2025/08/storage-2-checkpointing/
  Announcing the MLPerf Storage v2.0 Checkpointing Workload - MLCommons
  Sep 2, 2025 - MLCommons, MLPerf Storage v2.0 Addressing Backup and Recovery Speed for Training Large Language Models on Scale-Out Clusters
https://mlcommons.org/working-groups/benchmarks/tiny/
  MLPerf Tiny - MLCommons
  Aug 15, 2025 - The MLPerf Tiny working group develops Tiny ML benchmarks to evaluate inference performance on ultra-low-power systems.
https://mlcommons.org/working-groups/benchmarks/storage/
  MLPerf Storage - MLCommons
  Nov 18, 2024 - The MLPerf Storage working group defines and develops the MLPerf Storage benchmarks to characterize performance of storage systems that support machine...
https://mlcommons.org/2025/10/ailuminate-jailbreak-v05/
  MLCommons Unveils New Jailbreak Benchmark, Quantifying AI’s “Resilience Gap” to Adversarial Attacks...
  Oct 15, 2025 - MLCommons AI jailbreak benchmark introduces the Resilience Gap metric, the first industry standard to measure AI safety under attack. Learn how it protects...
https://mlcommons.org/about-us/programs/
  Rising Stars Program - MLCommons
  Mar 20, 2026 - MLCommons Systems and ML Rising Stars celebrate up-and coming researchers. Our goal is to improve AI systems though our collective engineering efforts with...
https://mlcommons.org/ailuminate/jailbreak/
  AILuminate Jailbreak - MLCommons
https://mlcommons.org/benchmarks/inference-tiny/
  Benchmark MLPerf Inference: Tiny | MLCommons V1.1 Results
  Sep 17, 2025 - MLPerf Inference: Tiny benchmark suite measures how fast systems can process inputs and produce results using a trained model with v1.1 results.
https://mlcommons.org/2024/03/mlperf-llama2-70b/
  Llama 2 70B: An MLPerf Inference Benchmark for Large Language Models - MLCommons
  Mar 13, 2025 - The MLPerf Inference v4.0 round adds Llama 2 70B model as the flagship “larger” LLM for its latest benchmark round.
https://mlcommons.org/working-groups/benchmarks/automotive/
  MLPerf Automotive - MLCommons
  Aug 27, 2025 - The MLPerf Automotive working group defines and develops an industry standard ML benchmark suite for automotive to be used in request for information (RFIs)...
https://mlcommons.org/working-groups/research/science/
  Science - MLCommons
  May 21, 2025 - The Science working group evaluates, organizes, curates, and integrates artifacts around applications, models/algorithms, infrastructure, benchmarks, and...
https://mlcommons.org/datasets/cognata/
  Cognata Dataset | MLCommons Machine Learning Datasets
  Nov 20, 2024 - The MLCommons Cognata Dataset is a set of photorealistic synthetic automotive data frames of urban and highway scenarios to train machine learning (ML) for...
https://mlcommons.org/benchmarks/storage/
  Benchmark MLPerf Storage | MLCommons V1.1 Results
  Aug 5, 2025 - The MLPerf Storage benchmark suite measures how fast storage systems can supply training data when a model is being trained. Below is a short summary of the...
https://mlcommons.org/2026/04/mlperf-inference-v6-0-results/
  MLCommons Releases New MLPerf Inference v6.0 Benchmark Results - MLCommons
  Apr 14, 2026 - MLCommons releases MLPerf Inference v6.0 results, the most significant benchmark update to date, with new tests for text-to-video, GPT-OSS 120B, DLRMv3,...
https://mlcommons.org/ailuminate/agentic/
  AILuminate Agentic - MLCommons
https://mlcommons.org/ailuminate/safety-technical-users/
  Safety Technical Users - MLCommons
  Oct 15, 2025 - Technical Users This page provides more information about the technical components of the AILuminate benchmark suite, and how they work together to benchmark...
https://discord.com/invite/rRbEjveNy5
  MLCommons
  Check out the MLCommons community on Discord - hang out with 2976 other members and enjoy free voice and text chat.
https://mlcommons.org/ailuminate/safety/
  AILuminate Safety - MLCommons
https://mlcommons.org/datasets/unsupervised-peoples-speech/
  People's Speech Dataset | MLCommons Datasets
  Mar 4, 2025 - The MLCommons People’s Speech Dataset contains 30,000 hours of conversational English speech recognition licensed for academic and commercial machine learning...
https://mlcommons.org/benchmarks/inference-edge/
  Benchmark MLPerf Inference: Edge | MLCommons V3.1 Results
  Apr 1, 2026 - The Benchmark MLPerf Inference: Edge benchmark suite measures how fast systems can train models to a target quality metric.
https://mlcommons.org/datasets/peoples-speech/
  People's Speech Dataset | MLCommons Datasets
  Nov 20, 2024 - The People’s Speech Dataset contains 30,000 hours of conversational English speech recognition licensed for academic and commercial machine learning usage.
https://mlcommons.org/2025/08/mlperf-auto-v0-5-results/
  AVCC and MLCommons Release New MLPerf Automotive v0.5 Benchmark Results - MLCommons
  Sep 2, 2025 - AVCC® and MLCommons® announced new results for their new MLPerf® Automotive v0.5 benchmark
https://mlcommons.org/2023/03/unlocking-ml-requires-an-ecosystem-approach/
  Perspective: Unlocking ML requires an ecosystem approach - MLCommons
  Mar 13, 2025 - Considerable innovation is taking place across the ML research-to-production life cycle, it often occurs organically and in silos, resulting in uneven impact...
https://mlcommons.org/2025/06/aaai2025/
  Emerging themes from AAAI 2025: standardisation, evaluation & collaboration in AI safety - MLCommons
  Jul 1, 2025 - MLCommons AAAI 2025 standardization collaboration evaluation in ai safety
https://mlcommons.org/benchmarks/inference-mobile/
  Benchmark MLPerf Inference: Mobile | MLCommons V3.1 Results
  Dec 16, 2025 - The MLPerf Inference: Mobile benchmark suite measures how fast systems can process inputs and produce results using a trained model with v3.1 results.
https://mlcommons.org/ailuminate/ailuminate-multimodal/
  AILuminate Multimodal - MLCommons
https://mlcommons.org/
  MLCommons - Better AI for Everyone
  Apr 14, 2025 - MLCommons aims to accelerate AI innovation to benefit everyone. It's philosophy of open collaboration and collaborative engineering seeks to improve AI systems...
https://mlcommons.org/working-groups/benchmarks/inference/
  MLPerf Inference - MLCommons
  Aug 15, 2025 - The MLCommons MLPerf Inference working group creates a set of fair and representative inference benchmarks. The myriad combinations of ML hardware and software...
https://mlcommons.org/benchmarks/training/
  MLCommons MLPerf Training Benchmark
  Apr 2, 2026 - The MLPerf Benchmark Suites measures how fast machine learning systems can train models to a target quality metric using v2.0 results.
https://mlcommons.org/our-members/
  Our Members - MLCommons
  Apr 29, 2026 - MLCommons is supported by over 125 members and affiliates, including startups, leading companies, academics, and non-profits from around the globe.
https://mlcommons.org/2026/02/croissant-1-1-standard/
  What’s New in Croissant 1.1: Extensible, Agent-Ready ML Dataset Standard - MLCommons
  Feb 12, 2026 - Croissant 1.1 adds machine-actionable provenance, vocabulary interoperability, and automated governance to the ML dataset standard, making 700K+ datasets...
https://mlcommons.org/2026/05/deepseek-v3-training-v6-0/
  DeepSeek-V3: A Large-Scale MoE Pretraining Benchmark for MLPerf Training v6.0 - MLCommons
  MLPerf Training v6.0 introduces a large-scale pretraining benchmark built on DeepSeek-V3, bringing Mixture-of-Experts (MoE) evaluation to the suite.
https://mlcommons.org/jobs/
  Jobs - MLCommons
  Jan 22, 2026 - MLCommons is an AI engineering consortium, built on a philosophy of open collaboration to improve AI systems. Our collective engineering efforts span industry...
https://mlcommons.org/about-us/leadership/
  Leadership - MLCommons
  Dec 11, 2025 - Meet the MLCommons Leadership team. MLCommons is an open AI engineering consortium, Our goal is to improve AI systems though our collective engineering efforts...
https://mlcommons.org/ailuminate/jailbreak-faq/
  Jailbreak FAQ - MLCommons
  Feb 16, 2026 - FAQ Interested in learning more?
https://mlcommons.org/working-groups/ai-risk-reliability/ai-risk-reliability/
  AI Risk & Reliability - MLCommons
  Feb 19, 2026 - Support community development of AI risk and reliability tests and organize definition of research- and industry-standard AI safety benchmarks based on those...
https://mlcommons.org/2025/10/croissant-mcp/
  Metadata, Meet Datasets: Croissant and MCP in Action - MLCommons
  Oct 1, 2025 - Metadata, Meet Datasets: Croissant and MCP in Action
https://mlcommons.org/datasets/
  View Datasets provided by MLCommons
  Dec 4, 2024 - Evaluating AI systems depends on rigorous, standardized test datasets. MLCommons builds open, large-scale, and diverse datasets. View more.
https://mlcommons.org/ailuminate/safety-faq/
  Safety FAQ - MLCommons
https://mlcommons.org/working-groups/data/mlcube/
  MLCube - MLCommons
  Nov 20, 2024 - MLCommons MLCube is the shipping container that enables researchers and developers to share the software that powers ML. It is a set of common conventions for...
https://mlcommons.org/category/mlperf-training/
  MLPerf Training Archives - MLCommons
https://www.chatbench.org/mlcommons-ai-safety-v1-0-benchmarks/
  MLCommons AI Safety v1.0 Benchmarks: The Ultimate 12-Hazard Test for 2026 🚦 - ChatBench
  Feb 2, 2026 - Imagine a world where every AI chatbot you interact with has passed a rigorous, industry-standard safety test, no more unexpected toxic rants…
https://mlcommons.org/benchmarks/ailuminate/
  AILuminate - MLCommons
  Oct 15, 2025 - The AILuminate benchmark assesses the safety of general chatbot gen AI systems to help guide development, inform purchasers and consumers, and support...
https://mlcommons.org/2025/04/rgat-inference-v5/
  Introducing a Graph Neural Network Benchmark in MLPerf Inference v5.0 - MLCommons
  Sep 4, 2025 - MLCommons announces new RGAT benchmark to MLPerf Inference v5.0 - addresses performance tests for graph-structured data and applications.
https://mlcommons.org/benchmarks/mlperf-automotive/
  Benchmark MLPerf Autotmotive MLCommons V0.5
  Sep 4, 2025 - The MLPerf Automotive benchmark suite measures the performance of computers intended for automotive, both for Advanced Driving Assistance System/Autonomous...
https://mlcommons.org/about-us/
  About - MLCommons
  Feb 7, 2025 - MLCommons is an AI engineering consortium, built on a philosophy of open collaboration to improve AI systems. Our collective engineering efforts span industry...
https://mlcommons.org/benchmarks/inference-datacenter/
  Benchmark MLPerf Inference: Datacenter | MLCommons V3.1
  Apr 1, 2026 - The MLPerf Inference: Datacenter benchmark suite measures how fast systems can process inputs and produce results using a trained model.