https://reporterslab.org/2024/06/20/what-exactly-is-the-fact-check-insights-dataset/
Jan 21, 2025 - Since its launch in December, the Fact-Check Insights dataset has been downloaded hundreds of times by researchers who are studying misinformation and...
fact checkexactlyinsightsdatasetreporters
https://www.qs.com/solutions/world-university-rankings-dataset
Explore the QS World University Rankings Dataset - uncover what drives your ranking, benchmark performance, and turn insights into strategic advantage.
world university rankingsqsdataset
https://realpython.com/pandas-python-explore-dataset/
In this step-by-step tutorial, you'll learn how to start exploring a dataset with pandas and Python. You'll learn how to access specific rows and...
usingpandaspythonexploredataset
https://terminet-h2020.eu/virtual-reality-gesture-recognition-dataset/
Dec 7, 2023 - Virtual Reality Gesture Recognition Dataset: This dataset features information from a smart home located in Greece, which has a Mediterranean climate. The...
virtual realitygesture recognitiondataset
https://www.naverlabs.com/datasets
NAVER LABS' rapid growth owes a great deal to the network of researchers. By sharing the latest data we have...
naverlabsopendataset
https://www.datibenecomune.it/2023/11/09/liberiamoli-tutti-newsletter/
Dec 1, 2023 - Liberiamoli tutti! is the #datiBeneComune newsletter in which, every month, we free up and present data of public utility
tuttiunazionecuiogni
https://cryptics.georgeho.org/
A dataset of cryptic crossword clues, collected from various blogs and digital archives.
cryptic crossworddatasetclues
https://www.openaire.eu/community/blogs/beyond-the-dataset-four-ideas-that-improve-how-we-streamline-digital-science
This blog was jointly written with Elli Papadopoulou. In Greece and Sweden, emergency departments face a common challenge: patients arriving with ambiguous...
beyonddatasetfourideasimprove
https://link.springer.com/chapter/10.1007/978-3-031-47969-4_44?error=cookies_not_supported&code=57b5f467-d451-4567-bd1c-ca6e1572f95c
In this work we introduce the CitrusFarm dataset, a comprehensive multimodal sensory dataset collected by a wheeled mobile robot operating in agricultural...
multimodaldatasetlocalizationmappingcrop
https://labs.watchtowr.com/all-around-the-world-the-common-crawl-dataset/
At watchTowr, we're big believers that data is power, and ultimately data drives security initiatives - like Attack Surface Management, which we then use...
common crawlaroundworlddataset
https://exposing.ai/wildtrack/
WILDTRACK is a surveillance video dataset of students recorded outside the ETH university main building in Zurich
exposingsevencamerahddataset
https://nestwatch.org/explore-data/nestwatch-open-dataset-downloads/
Jul 10, 2025 - Those seeking to conduct formal analyses using the NestWatch Open Dataset are encouraged to download the raw data available on this page. Note that raw data...
nestwatchopendatasetdownloads
https://docs.scicrunch.io/elasticsearch-metadata-services/sparc-datasets/sparc-dataset-elasticsearch-json-data-model
The data model for results from Elasticsearch is provided below. We provide documentation as "comments" within a trimmed JSON document for a SPARC...
json datasparcdatasetelasticsearchmodel
https://exposing.ai/uccs/
UnConstrained College Students is a dataset of long-range surveillance photos of students on University of Colorado in Colorado Springs campus
college studentsexposingdataset
https://www.graphcore.ai/posts/july-papers-subliminal-learning-mixture-of-recursions-and-dataset-curation
July's Papers review covers attempts to remove undesirable traits from training data, a new approach to recursion, and the use of raters for data curation.
julypaperssubliminallearningmixture
https://www.tgs.com/well-data-products
As one of the leading oil and gas analytics companies, TGS offers over 100 years of public and proprietary subsurface data and management technology. Learn...
energy industryexplorelargestsubsurfacedataset
https://www.trillianthealth.com/analytics/visits-data
Explore patient journeys across the full care continuum, both inpatient and outpatient, with proprietary logic linking disparate claims more accurately.
patientjourneysvisitsdatasethealth
https://www.ilb.eus/es/startup/dataset-2/
Jul 22, 2021 - Artificial intelligence system that enables real-time data analytics on human resources in public spaces,...
datasetbilbao
https://www.qs.com/solutions/regional-rankings-dataset
Unlock regional insights with the QS Regional Rankings Dataset to benchmark performance, brief stakeholders, and strengthen local and global impact.
qsregionalrankingsdataset
https://endpoints.news/protara-trumpets-small-rare-disease-dataset-in-children-for-cell-therapy/
Nov 19, 2025 - Protara Therapeutics on Wednesday revealed new data showing its experimental bladder cancer cell therapy could also have potential in a rare pediatric disease.
rare diseasetrumpetssmalldatasetchildren
https://www.tigerdata.com/blog/how-to-explore-timescaledb-using-simulated-iot-sensor-data
Oct 17, 2024 - In this tutorial, you'll learn how to simulate a basic IoT dataset that you can deploy with PostgreSQL or TimescaleDB.
simulatebasiciotsensordataset
https://www.kaggle.com/datasets/allen-institute-for-ai/CORD-19-research-challenge
Jun 6, 2022 - An AI challenge with AI2, CZI, MSR, Georgetown, NIH & The White House
open researchcoviddatasetchallengecord
https://theirstack.com/en/jobs-dataset
Download a list of 4.8 million companies across 7 thousand technologies and find out what they are using. Enriching any database with accurate data is easy...
jobsdatasetcom
https://distilabel.argilla.io/latest/sections/pipeline_samples/tutorials/clean_existing_dataset/
Distilabel is an AI Feedback (AIF) framework for building datasets with and for LLMs.
cleanexistingpreferencedatasetdocs
https://scale.com/leaderboard/swe_bench_pro_public
Jan 11, 2026 - Explore the SEAL leaderboard with expert-driven LLM benchmarks and updated AI model leaderboards, ranking top models across coding, reasoning and more.
swebenchpropublicdataset
https://huggingface.co/datasets/facebook/sam-3d-body-dataset
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
hugging facefacebooksambodydataset
https://dev.to/shruti_nakum/suppose-there-is-a-dataset-having-variables-with-missing-values-of-more-than-30-how-will-you-deal-476j
Dec 2, 2025 - If a variable has more than 30% missing values, I treat it carefully because that much missing... Tagged with discuss, data, datascience, beginners.
supposedatasetvariablesmissingvalues
https://datahub.thetrace.org/dataset/city-gun-violence-trends/?_gl=1*1jhg1su*_ga*OTI4NDY0MDYzLjE3NjE4NTU2MDk.*_ga_LG1CD8NFZC*czE3NjE4NTU2MDkkbzEkZzAkdDE3NjE4NTU2MDkkajYwJGwwJGgw
Nov 13, 2025 - Aggregated weekly killed and injured victim totals for many U.S. cities, plus fitted trend lines. Based on Gun Violence Archive data.
gun violencetrendscities
https://brightdata.co.kr/pricing/datasets
Aug 31, 2025 - Explore Bright Data's Dataset Marketplace with flexible pricing and refresh options. Access validated data in JSON, CSV, or Parquet, starting at $2.5/1K...
datasetmarketplacepricing
https://www.agendadigitale.eu/sicurezza/privacy/data-act-e-gdpr-come-gestire-i-mixed-dataset-nelliot/
Dec 30, 2025 - Data Act and GDPR for mixed IoT datasets: governance, technical solutions, and contracts for managing personal and non-personal data in compliance.
data actegdpr
https://discover.data.vic.gov.au/dataset/?sort=score+desc%2C+metadata_modified+desc&q=water+OR+%28groups%3Aenvironment%29&organization=&groups=&res_format=
government datadatasetvictorian
https://openlineage.io/docs/spec/facets/dataset-facets/column_lineage_facet/
Column level lineage provides fine grained information on datasets' dependencies.
columnlevellineagedatasetfacet
https://www.linuxfoundation.org/press/agstack-first-dataset-field-boundaries
New Code Base Hosted By AgStack Will Utilize Machine Learning and Artificial Intelligence to Create, Curate, and Manage Global Field Boundaries Data For Public...
projectbuildworldfirstglobal
https://terminet-h2020.eu/cherry-tree-disease-detection-dataset/
Dec 7, 2023 - This cherry tree disease detection dataset is a multimodal, multi-angle dataset which was constructed for monitoring the growth of cherry trees, including...
cherry treediseasedetectiondataset
https://ourworldindata.org/our-world-in-data-covid-19-testing-dataset-published-by-nature
Oct 9, 2020 - The Our World in Data dataset which has tracked COVID-19 testing across the world since the start of the pandemic has been peer-reviewed and published in the...
worlddatacovidtesting
https://runonless.com/nacfe-announces-messy-middle-dataset-and-workshop/
NACFE has released the dataset from Run on Less – Messy Middle.
runlessannouncesmessymiddle
https://canpath.ca/student-dataset/
Jun 9, 2025 - CanPath’s Student Dataset provides students the unique opportunity to gain hands-on experience working with CanPath data.
studentdatasetcanadianpartnershiphealth
https://docs.ultralytics.com/datasets/pose/dog-pose/
Discover the Dog-Pose dataset for pose detection. Featuring 6,773 training and 1,703 test images, it's a robust dataset for training YOLO11 models.
dogposedatasetultralyticsyolo
https://app.formulabot.com/datasets/madden-2026-player-ratings
Madden 2026 player ratings with personal details, overall rating, archetype, and performance metrics across speed, agility, awareness, blocking, passing,...
player ratingsmaddendatasetanalyzeai
https://www.getty.edu/news/pst-art-releases-largest-ever-dataset-on-climate-impact-of-exhibition-making/
Nov 18, 2025 - Arts organizations across Southern California measured exhibition emissions and waste for the nation’s largest art event.
climate impactpstartreleaseslargest
https://www.gallup.com/analytics/468179/global-wellbeing-initiative-dataset.aspx
Download the recent dataset from the Global Wellbeing Initiative.
globalwellbeinginitiativedataset
https://exposing.ai/brainwash/
Brainwash is a dataset of webcam images taken from the Brainwash Cafe in San Francisco
exposingbrainwashdataset
https://huggingface.co/datasets/huggingface/CADS-dataset
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
hugging facehuggingfacedataset
https://build.nvidia.com/nvidia/cosmos-dataset-search
Accelerate post-training of end-to-end autonomous vehicle stacks with vector search and retrieval for large video datasets.
nvidia nimcosmosdatasetsearchblueprint
https://www.rockrobotic.com/articles/dji-zenmuse-l3-rolling-hills-sample-dataset/
Explore the DJI Zenmuse L3 Rolling Hills LiDAR dataset on ROCK Cloud. View, analyze, and download high-resolution point cloud data with professional...
rolling hillsexploredjidatasetpremium
https://www.qs.com/solutions/sustainability-rankings-dataset
Turn ESG data into action with the QS Sustainability Rankings Dataset - benchmark impact, track progress, and guide sustainable university strategy.
qs sustainability rankingsdataset
https://catalog.data.gov/dataset/
The Home of the U.S. Government's Open Data
datasetcatalog
https://communalytic.org/docs/sampling-a-dataset-by-time-period/
Nov 11, 2024 - This guide will help you sample your dataset by a specific time period (e.g., filtering for records (aka posts) from 2024). This process is useful when working...
data managementsamplingdatasettimeperiod
https://linkdigital.com.au/news/2025/11/ckan-2-12-faster-search-large-dataset-downloads/
Nov 27, 2025 - 15x faster, advanced query filters for CKAN 2.12: Get ready for faster data access, smarter search, and a more reliable CKAN. Read more from Link Digital.
fifteentimesfasterckanlarge
https://www.heap.io/platform/capture
Combining Autocapture, custom tracking, and server-side events, Heap provides the easiest and most comprehensive way to capture the data you need.
heapcapturecompletedataset
https://mlcommons.org/datasets/unsupervised-peoples-speech/
Mar 4, 2025 - The MLCommons People’s Speech Dataset contains 30,000 hours of conversational English speech recognition licensed for academic and commercial machine...
peoplespeechdataset
https://www.gu.se/en/gendip/the-gendip-dataset-on-gender-and-diplomatic-representation
The GenDip dataset maps the extent to which states appoint men, women and other diplomats to different kinds of bilateral ambassador postings between the years...
datasetgenderrepresentationuniversity
https://core.ac.uk/services/dataset
Download millions of research outputs for text and data analysis
coredataset
https://distilabel.argilla.io/latest/sections/how_to_guides/advanced/distiset/
Distilabel is an AI Feedback (AIF) framework for building datasets with and for LLMs.
datasetobjectdocs
https://www.prio.org/data/5
Jul 11, 2023 - An extension to the UCDP/PRIO Armed Conflicts Dataset that provides coordinates and spatial extents yearly for each conflict.
research instituteconflictsitedatasetpeace
https://www.qs.com/solutions/academic-reputation-dataset
Created for research leadership, faculty heads, partnership, planning and strategy teams, the QS Academic Reputation Dataset gives you instant access to...
qsacademicreputationdataset
https://terminet-h2020.eu/peach-tree-disease-detection-dataset/
Dec 7, 2023 - This peach tree disease detection dataset is a multimodal, multi-angle dataset which was constructed for monitoring the growth of peach trees, including stress...
peachtreediseasedetectiondataset
https://sos.noaa.gov/catalog/datasets/
The Science On a Sphere® Dataset Catalog is comprised of datasets from NOAA, NASA, universities, science centers and other organizations. Each dataset entry...
datasetcatalogsciencesphere
https://www.qs.com/solutions/gme-rankings-dataset
Unlock MBA and business master’s insights with the QS GME Rankings Dataset to benchmark performance, showcase impact, and drive strategic growth.
qsgmerankingsdataset
https://www.esa.int/Applications/Observing_the_Earth/Space_for_our_climate/ESA_unveils_longest-ever_dataset_on_forest_biomass
esaunveilslongesteverdataset
https://www.bioworld.com/articles/726560-uk-biobank-delivers-largest-metabolomics-dataset-of-500k-profiles?v=preview
It’s the biological resource that keeps on giving, and now UK Biobank has released the final tranche of data on the levels of 249 metabolites in the blood of...
ukbiobankdeliverslargestmetabolomics
https://huggingface.co/datasets/nvidia/embed-nemotron-dataset-v1
We’re on a journey to advance and democratize artificial intelligence through open source and open science.
hugging facenvidiaembednemotrondataset
https://rankmath.com/kb/dataset-schema/
Apr 4, 2025 - Datasets are easier for search engines to discover when information such as names, a description, creator (and more) is provided in the form of
rank mathdatasetschematype
https://discover.data.vic.gov.au/dataset/?sort=score+desc%2C+metadata_modified+desc&q=&organization=&groups=community&res_format=
government datadatasetvictorian
https://cognanous.com/blog/neurips2023-report
I am Hirofumi Tsuruta, in charge of research and development on machine learning at COGNANO. We have presented our work on a large-scale dataset of...
large scaleposterpresentationneuripsdataset