https://dev.to/agenthustler/building-a-hate-speech-dataset-with-responsible-web-scraping-1554
Building a Hate Speech Dataset with Responsible Web Scraping - DEV Community
Mar 27, 2026 - Why Build Hate Speech Datasets? AI moderation models are only as good as their training... Tagged with python, webdev, tutorial, programming.
hate speech web scraping dev
https://winbuzzer.com/2025/02/01/mlcommons-and-hugging-face-launch-huge-speech-dataset-with-more-than-a-million-hours-of-audio-xcxwbn/
MLCommons And Hugging Face Launch Huge Speech Dataset With More Than A Million Hours Of Audio -...
Feb 1, 2025 - An extensive multilingual speech dataset from MLCommons and Hugging Face offers over one million hours of audio, setting a new standard for AI-driven speech...
hugging face speech dataset
https://mlcommons.org/datasets/unsupervised-peoples-speech/
People's Speech Dataset | MLCommons Datasets
Mar 4, 2025 - The MLCommons People’s Speech Dataset contains 30,000 hours of conversational English speech recognition licensed for academic and commercial machine...
speech dataset mlcommons
https://mlcommons.org/datasets/peoples-speech/
People's Speech Dataset | MLCommons Datasets
Nov 20, 2024 - The People’s Speech Dataset contains 30,000 hours of conversational English speech recognition licensed for academic and commercial machine learning usage.
speech dataset mlcommons