https://www.ipcc.ch/report/ar3/wg1/chapter-8-model-evaluation/
Model Evaluation — IPCC
model evaluation, ipcc
https://www.mitre.org/news-insights/news-release/mitre-and-faa-introduce-novel-aerospace-large-language-model-evaluation
MITRE and FAA Introduce Novel Aerospace Large Language Model Evaluation Benchmark | MITRE
Aerospace Language Understanding Evaluation Benchmark Enables Thorough Evaluation of LLMs for Aerospace Tasks
large language model, mitre, faa, introduce, novel
https://towardsdatascience.com/beyond-roc-auc-and-ks-gini-coefficient-explained-simply/
The Gini Coefficient: From Lorenz Curves to Model Evaluation | Towards Data Science
Oct 12, 2025 - Understanding how the Gini and Lorenz curves help measure how well a model separates defaulters from non-defaulters.
model evaluation, data science, gini, coefficient, lorenz
https://www.prolific.com/model-evaluation
Model evaluation & safety | Prolific
Get quality evaluation data for your AI models. Ensure optimal performance, safety, and fair AI outcomes with Prolific's diverse pool of 200,000+ taskers.
model evaluation, safety, prolific
https://www.metoffice.gov.uk/research/foundation/regional-model-evaluation-and-development/index
Regional model evaluation and development - Met Office
Jun 17, 2019 - We develop next-generation regional modelling systems with kilometre-scale resolution to improve weather and climate prediction accuracy worldwide.
model evaluation, met office, regional, development
https://layerlens.ai/
LayerLens: Independent AI Model Evaluation | Compare 200+ Models
LayerLens is an independent AI model evaluation platform. Compare 200+ AI models side-by-side with transparent benchmarks, model comparison tools, and...
ai model, independent, evaluation, compare, models
https://www.manning.com/books/ai-model-evaluation
AI Model Evaluation - Leemay Nassery
De-risk AI models, validate real-world performance, and align output with product goals. Before you trust critical business systems to an AI model, you need to...
ai model, evaluation
https://www.metoffice.gov.uk/research/approach/our-research-staff/regional-model-evaluation-and-development-scientists
Regional model evaluation and development scientists - Met Office
Dec 4, 2018 - Our regional model evaluation and development scientists
model evaluation, met office, regional, development, scientists
https://towardsdatascience.com/beware-of-unreliable-data-in-model-evaluation-a-llm-prompt-selection-case-study-with-flan-t5-88cfd469d058/
Beware of Unreliable Data in Model Evaluation: A LLM Prompt Selection case study with Flan-T5 |...
Jan 8, 2025 - You may choose suboptimal prompts for your LLM (or make other suboptimal choices via model evaluation) unless you clean your test data
model evaluation, case study, beware, unreliable, data
https://wsdot.wa.gov/construction-planning/funding/performance-based-project-evaluation-model
Performance-based project evaluation model | WSDOT
Learn how performance-based project evaluation helps WSDOT compare future projects and make recommendations to the State Legislature for funding projects that...
performance, based, project, evaluation, model