https://mlcommons.org/2025/01/mlc-medical-collab-eval-medical-imaging/
MLCommons Medical Working Group Co-authors Book Chapter on Collaborative Evaluation for Medical...
Jan 28, 2025 - MLCommons Medical working group co-authored a chapter for a new book on AI in medical imaging, “Collaborative evaluation for performance assessment of medical...
https://mlcommons.org/research/
Machine Learning Research | MLCommons Research
Mar 18, 2025 - Democratizing machine learning through an open community approach. MLCommons is a community driven research effort.
https://partnershiponai.org/partnership-on-ai-welcomes-dla-piper-ellis-alicante-mlcommons-open-library-foundation-and-windfall-trust-as-partners/
Partnership on AI Welcomes DLA Piper, ELLIS Alicante, MLCommons, Open Library Foundation, and...
https://mlcommons.org/working-groups/benchmarks/mobile/
Mobile - MLCommons
May 21, 2025 - The Mobile working group creates a set of fair and representative inference benchmarks for mobile consumer devices such as smartphones, tablets, and notebooks...
https://mlcommons.org/2026/04/mlperf-client-v1-6/
MLCommons Releases MLPerf Client v1.6 with Performance Optimizations and Enhanced User Experience -...
MLCommons releases MLPerf Client v1.6 with updated Windows ML and llama.cpp support, Apple MLX improvements for Mac and iPad, and usability enhancements for...
https://mlcommons.org/working-groups/data/datasets/
Datasets - MLCommons
Aug 15, 2025 - The Datasets working group creates new datasets to fuel innovation in machine learning.
https://mlcommons.org/benchmarks/algorithms/
MLCommons AlgoPerf: Training Algorithms Benchmark Results
Nov 20, 2024 - The AlgoPerf: Training Algorithms benchmark measures how much faster we can train neural network models to a given target performance by changing the...
https://mlcommons.org/2025/08/storage-2-checkpointing/
Announcing the MLPerf Storage v2.0 Checkpointing Workload - MLCommons
Sep 2, 2025 - MLCommons, MLPerf Storage v2.0 Addressing Backup and Recovery Speed for Training Large Language Models on Scale-Out Clusters
https://mlcommons.org/working-groups/benchmarks/tiny/
MLPerf Tiny - MLCommons
Aug 15, 2025 - The MLPerf Tiny working group develops Tiny ML benchmarks to evaluate inference performance on ultra-low-power systems.
https://mlcommons.org/working-groups/benchmarks/storage/
MLPerf Storage - MLCommons
Nov 18, 2024 - The MLPerf Storage working group defines and develops the MLPerf Storage benchmarks to characterize performance of storage systems that support machine...
https://mlcommons.org/2025/10/ailuminate-jailbreak-v05/
MLCommons Unveils New Jailbreak Benchmark, Quantifying AI’s “Resilience Gap” to Adversarial Attacks...
Oct 15, 2025 - MLCommons AI jailbreak benchmark introduces the Resilience Gap metric, the first industry standard to measure AI safety under attack. Learn how it protects...
https://mlcommons.org/about-us/programs/
Rising Stars Program - MLCommons
Mar 20, 2026 - MLCommons Systems and ML Rising Stars celebrate up-and-coming researchers. Our goal is to improve AI systems through our collective engineering efforts with...
https://mlcommons.org/ailuminate/jailbreak/
AILuminate Jailbreak - MLCommons
https://mlcommons.org/benchmarks/inference-tiny/
Benchmark MLPerf Inference: Tiny | MLCommons V1.1 Results
Sep 17, 2025 - MLPerf Inference: Tiny benchmark suite measures how fast systems can process inputs and produce results using a trained model with v1.1 results.
https://mlcommons.org/2024/03/mlperf-llama2-70b/
Llama 2 70B: An MLPerf Inference Benchmark for Large Language Models - MLCommons
Mar 13, 2025 - The MLPerf Inference v4.0 round adds Llama 2 70B model as the flagship “larger” LLM for its latest benchmark round.
https://mlcommons.org/working-groups/benchmarks/automotive/
MLPerf Automotive - MLCommons
Aug 27, 2025 - The MLPerf Automotive working group defines and develops an industry standard ML benchmark suite for automotive to be used in request for information (RFIs)...
https://mlcommons.org/working-groups/research/science/
Science - MLCommons
May 21, 2025 - The Science working group evaluates, organizes, curates, and integrates artifacts around applications, models/algorithms, infrastructure, benchmarks, and...
https://mlcommons.org/datasets/cognata/
Cognata Dataset | MLCommons Machine Learning Datasets
Nov 20, 2024 - The MLCommons Cognata Dataset is a set of photorealistic synthetic automotive data frames of urban and highway scenarios to train machine learning (ML) for...
https://mlcommons.org/benchmarks/storage/
Benchmark MLPerf Storage | MLCommons V1.1 Results
Aug 5, 2025 - The MLPerf Storage benchmark suite measures how fast storage systems can supply training data when a model is being trained. Below is a short summary of the...
https://mlcommons.org/2026/04/mlperf-inference-v6-0-results/
MLCommons Releases New MLPerf Inference v6.0 Benchmark Results - MLCommons
Apr 14, 2026 - MLCommons releases MLPerf Inference v6.0 results — the most significant benchmark update to date, with new tests for text-to-video, GPT-OSS 120B, DLRMv3,...
https://mlcommons.org/ailuminate/agentic/
AILuminate Agentic - MLCommons
https://mlcommons.org/ailuminate/safety-technical-users/
Safety Technical Users - MLCommons
Oct 15, 2025 - This page provides more information about the technical components of the AILuminate benchmark suite, and how they work together to benchmark...
https://discord.com/invite/rRbEjveNy5
MLCommons
Check out the MLCommons community on Discord - hang out with 2976 other members and enjoy free voice and text chat.
https://mlcommons.org/ailuminate/safety/
AILuminate Safety - MLCommons
https://mlcommons.org/datasets/unsupervised-peoples-speech/
People's Speech Dataset | MLCommons Datasets
Mar 4, 2025 - The MLCommons People’s Speech Dataset contains 30,000 hours of conversational English speech recognition data licensed for academic and commercial machine learning...
https://mlcommons.org/benchmarks/inference-edge/
Benchmark MLPerf Inference: Edge | MLCommons V3.1 Results
Apr 1, 2026 - The MLPerf Inference: Edge benchmark suite measures how fast systems can process inputs and produce results using a trained model.
https://mlcommons.org/datasets/peoples-speech/
People's Speech Dataset | MLCommons Datasets
Nov 20, 2024 - The People’s Speech Dataset contains 30,000 hours of conversational English speech recognition data licensed for academic and commercial machine learning usage.
https://mlcommons.org/2025/08/mlperf-auto-v0-5-results/
AVCC and MLCommons Release New MLPerf Automotive v0.5 Benchmark Results - MLCommons
Sep 2, 2025 - AVCC® and MLCommons® announced new results for their new MLPerf® Automotive v0.5 benchmark
https://mlcommons.org/2023/03/unlocking-ml-requires-an-ecosystem-approach/
Perspective: Unlocking ML requires an ecosystem approach - MLCommons
Mar 13, 2025 - Considerable innovation is taking place across the ML research-to-production life cycle, but it often occurs organically and in silos, resulting in uneven impact...
https://mlcommons.org/2025/06/aaai2025/
Emerging themes from AAAI 2025: standardisation, evaluation & collaboration in AI safety - MLCommons
Jul 1, 2025 - MLCommons AAAI 2025 standardization collaboration evaluation in ai safety
https://mlcommons.org/benchmarks/inference-mobile/
Benchmark MLPerf Inference: Mobile | MLCommons V3.1 Results
Dec 16, 2025 - The MLPerf Inference: Mobile benchmark suite measures how fast systems can process inputs and produce results using a trained model with v3.1 results.
https://mlcommons.org/ailuminate/ailuminate-multimodal/
AILuminate Multimodal - MLCommons
https://mlcommons.org/
MLCommons - Better AI for Everyone
Apr 14, 2025 - MLCommons aims to accelerate AI innovation to benefit everyone. Its philosophy of open collaboration and collaborative engineering seeks to improve AI systems...
https://mlcommons.org/working-groups/benchmarks/inference/
MLPerf Inference - MLCommons
Aug 15, 2025 - The MLCommons MLPerf Inference working group creates a set of fair and representative inference benchmarks. The myriad combinations of ML hardware and software...
https://mlcommons.org/benchmarks/training/
MLCommons MLPerf Training Benchmark
Apr 2, 2026 - The MLPerf Training benchmark suite measures how fast machine learning systems can train models to a target quality metric, with v2.0 results.
https://mlcommons.org/our-members/
Our Members - MLCommons
Apr 29, 2026 - MLCommons is supported by over 125 members and affiliates, including startups, leading companies, academics, and non-profits from around the globe.
https://mlcommons.org/2026/02/croissant-1-1-standard/
What’s New in Croissant 1.1: Extensible, Agent-Ready ML Dataset Standard - MLCommons
Feb 12, 2026 - Croissant 1.1 adds machine-actionable provenance, vocabulary interoperability, and automated governance to the ML dataset standard—making 700K+ datasets...
https://mlcommons.org/2026/05/deepseek-v3-training-v6-0/
DeepSeek-V3: A Large-Scale MoE Pretraining Benchmark for MLPerf Training v6.0 - MLCommons
MLPerf Training v6.0 introduces a large-scale pretraining benchmark built on DeepSeek-V3, bringing Mixture-of-Experts (MoE) evaluation to the suite.
https://mlcommons.org/jobs/
Jobs - MLCommons
Jan 22, 2026 - MLCommons is an AI engineering consortium, built on a philosophy of open collaboration to improve AI systems. Our collective engineering efforts span industry...
https://mlcommons.org/about-us/leadership/
Leadership - MLCommons
Dec 11, 2025 - Meet the MLCommons Leadership team. MLCommons is an open AI engineering consortium. Our goal is to improve AI systems through our collective engineering efforts...
https://mlcommons.org/ailuminate/jailbreak-faq/
Jailbreak FAQ - MLCommons
Feb 16, 2026 - Interested in learning more?
https://mlcommons.org/working-groups/ai-risk-reliability/ai-risk-reliability/
AI Risk & Reliability - MLCommons
Feb 19, 2026 - Support community development of AI risk and reliability tests and organize definition of research- and industry-standard AI safety benchmarks based on those...
https://mlcommons.org/2025/10/croissant-mcp/
Metadata, Meet Datasets: Croissant and MCP in Action - MLCommons
Oct 1, 2025 - Metadata, Meet Datasets: Croissant and MCP in Action
https://mlcommons.org/datasets/
View Datasets provided by MLCommons
Dec 4, 2024 - Evaluating AI systems depends on rigorous, standardized test datasets. MLCommons builds open, large-scale, and diverse datasets. View more.
https://mlcommons.org/ailuminate/safety-faq/
Safety FAQ - MLCommons
https://mlcommons.org/working-groups/data/mlcube/
MLCube - MLCommons
Nov 20, 2024 - MLCommons MLCube is the shipping container that enables researchers and developers to share the software that powers ML. It is a set of common conventions for...
https://mlcommons.org/category/mlperf-training/
MLPerf Training Archives - MLCommons
https://www.chatbench.org/mlcommons-ai-safety-v1-0-benchmarks/
MLCommons AI Safety v1.0 Benchmarks: The Ultimate 12-Hazard Test for 2026 🚦 - ChatBench
Feb 2, 2026 - Imagine a world where every AI chatbot you interact with has passed a rigorous, industry-standard safety test—no more unexpected toxic rants…
https://mlcommons.org/benchmarks/ailuminate/
AILuminate - MLCommons
Oct 15, 2025 - The AILuminate benchmark assesses the safety of general chatbot gen AI systems to help guide development, inform purchasers and consumers, and support...
https://mlcommons.org/2025/04/rgat-inference-v5/
Introducing a Graph Neural Network Benchmark in MLPerf Inference v5.0 - MLCommons
Sep 4, 2025 - MLCommons announces a new RGAT benchmark for MLPerf Inference v5.0 that addresses performance testing for graph-structured data and applications.
https://mlcommons.org/benchmarks/mlperf-automotive/
Benchmark MLPerf Automotive MLCommons V0.5
Sep 4, 2025 - The MLPerf Automotive benchmark suite measures the performance of computers intended for automotive, both for Advanced Driving Assistance System/Autonomous...
https://mlcommons.org/about-us/
About - MLCommons
Feb 7, 2025 - MLCommons is an AI engineering consortium, built on a philosophy of open collaboration to improve AI systems. Our collective engineering efforts span industry...
https://mlcommons.org/benchmarks/inference-datacenter/
Benchmark MLPerf Inference: Datacenter | MLCommons V3.1
Apr 1, 2026 - The MLPerf Inference: Datacenter benchmark suite measures how fast systems can process inputs and produce results using a trained model.