Open Datasets
Updated • 141
• 86
• Updated • 1.54k • 26k
• 9.62k
• Updated • 69.9k • 185k
• 387
• Updated • 2.2M • 9.14k
• 392
Matthijs/cmu-arctic-xvectors
• Updated • 7.93k • 22.9k
• 63
parler-tts/libritts-r-filtered-speaker-descriptions
• Updated • 359k • 125
• 7
• Updated • 860k • 21.1k
• 545
alpindale/two-million-bluesky-posts
• Updated • 2.11M • 955
• 201
arimalabs/2.3-million-bluesky-posts
• Updated • 2.37M • 34
• 5
• Updated • 70k • 94.9k
• 230
• Updated • 1.34M • 14.1k
• 30
• Updated • 1.12M • 475
• 4
parler-tts/libritts_r_filtered
• Updated • 359k • 865
• 21
opendiffusionai/cc12m-cleaned
• Updated • 8.53M • 80
• 10
• Updated • 31.4k • 671
• 23
• Updated • 371
• 7
• Updated • 61.6M • 94.3k
• 1.16k
parler-tts/mls-eng-speaker-descriptions
• Updated • 10.8M • 215
• 11
• Updated • 115M • 2.32k
• 102
Updated • 28
• 2
• Updated • 602k • 10.8k
• 150
• Updated • 4.48B • 64.5k
• 769
• Updated • 1.55k • 54
• 4
• Updated • 1.26M • 25.1k
• 147
• Updated • 59.1k • 226
• 12
keremberke/license-plate-object-detection
• Updated • 8.83k • 866
• 36
Updated • 40
• 8
• Updated • 98.6k • 1.97k
• 100
nebius/SWE-agent-trajectories
• Updated • 80k • 1.65k
• 72
• Updated • 3.4k • 8.15k
• 58
cfahlgren1/react-code-instructions
• Updated • 74.4k • 206
• 157
DAMO-NLP-SG/multimodal_textbook
Updated • 1.1k
• 154
NovaSky-AI/Sky-T1_data_17k
• Updated • 16.4k • 332
• 186
• Updated • 5.45B • 8.4k
• 519
• Updated • 546M • 14.4k
• 968
hoskinson-center/proof-pile
• Updated • 363k • 1.69k
• 63
HuggingFaceFW/fineweb-edu
• Updated • 3.5B • 272k
• 997
EleutherAI/the_pile_deduplicated
• Updated • 134M • 19.8k
• 110
MohamedRashad/multilingual-tts
• Updated • 25.5k • 78
• 47
• Updated • 16.4k • 8
• 4
facebook/multilingual_librispeech
• Updated • 1.49M • 14.7k
• 174
• Updated • 1.25M • 12.2k
• 87
• Updated • 2.77M • 4.84k
• 115
Fumika/Wikinews-multilingual
• Updated • 15.2k • 23
• 7
ayymen/Weblate-Translations
• Updated • 11.7M • 578
• 17
Updated • 381k
• 157
Helsinki-NLP/opus_wikipedia
• Updated • 1.75M • 224
• 10
• Updated • 3.59M • 19
• 1
MLCommons/unsupervised_peoples_speech
Updated • 20.9k
• 74
HKUSTAudio/Llasa_opensource_speech_data_160k_hours_tokenized
Updated • 308
• 30
• Updated • 10k • 8.12k
• 538
• Updated • 68.1k • 47.8k
• 22
allenai/RLVR-GSM-MATH-IF-Mixed-Constraints
• Updated • 29.9k • 1.39k
• 30
allenai/olmo-2-0325-32b-preference-mix
Updated • 127
• 15
allenai/tulu-3-sft-olmo-2-mixture-0225
• Updated • 866k • 999
• 22
• Updated • 170M • 24.6k
• 90
• Updated • 621M • 12.3k
• 87
• Updated • 932 • 35k
• 626
Congliu/Chinese-DeepSeek-R1-Distill-data-110k
• Updated • 110k • 778
• 731
• Updated • 102k • 316
• 47
• Updated • 450k • 13.3k
• 715
• Updated • 167M • 3.62k
• 68