hexgrad posted an update Jan 6
📣 Looking for labeled, high-quality synthetic audio/TTS data 📣 Have you been or are you currently calling API endpoints from OpenAI, ElevenLabs, etc? Do you have labeled audio data sitting around gathering dust? Let's talk! Join https://discord.gg/QuGxSWBfQy or comment down below.

If your data exceeds the quantity & quality thresholds and is approved into the next hexgrad/Kokoro-82M training mix, and you permissively DM me the data under an effective Apache license, then I will DM back the corresponding voicepacks for YOUR data if/when the next Apache-licensed Kokoro base model drops.

What does this mean? If you've been calling closed-source TTS or audio API endpoints to:
- Build voice agents
- Make long-form audio, like audiobooks or podcasts
- Handle customer support, etc.
Then YOU can contribute to the training mix and get useful artifacts in return. ❤️
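
To make "labeled" concrete: each clip should be paired with the exact text that produced it. Here's a rough sketch of what a collection script could look like using the OpenAI Python SDK; the model and voice names and the metadata.jsonl layout are illustrative assumptions, not requirements of this project.

```python
# Sketch only: generate synthetic speech and keep the text label next to
# each clip. Assumes the official `openai` SDK and OPENAI_API_KEY in the
# environment; model/voice names and the metadata layout are illustrative.
import json
from pathlib import Path
from openai import OpenAI

client = OpenAI()
out_dir = Path("synthetic_audio")
out_dir.mkdir(exist_ok=True)

texts = [
    "The quick brown fox jumps over the lazy dog.",
    "Kokoro is an 82 million parameter text-to-speech model.",
]

with (out_dir / "metadata.jsonl").open("w") as meta:
    for i, text in enumerate(texts):
        wav_path = out_dir / f"{i:06d}.wav"
        response = client.audio.speech.create(
            model="tts-1", voice="alloy", input=text, response_format="wav"
        )
        wav_path.write_bytes(response.content)
        # The text IS the label, so store it verbatim alongside the audio.
        meta.write(json.dumps({"file": wav_path.name, "text": text}) + "\n")
```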

More details at hexgrad/Kokoro-82M#21

TLDR: 🚨 Trade Offer 🚨
I receive: Synthetic Audio w/ Text Labels
You receive: Trained Voicepacks for an 82M Apache TTS model
Join https://discord.gg/QuGxSWBfQy to discuss
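
For anyone wondering what a voicepack buys you: it's the voice you select at inference time. A rough sketch of voicepack usage with the `kokoro` pip package, following the model card's example; the lang code and voice name here are just examples:

```python
# Sketch: synthesize speech with Kokoro-82M and a named voicepack.
# Assumes `pip install kokoro soundfile`; voice/lang codes are examples.
import soundfile as sf
from kokoro import KPipeline

pipeline = KPipeline(lang_code="a")  # 'a' = American English
text = "Voicepacks trained on contributed data plug in right here."

# The pipeline yields (graphemes, phonemes, audio) chunks at 24 kHz.
for i, (graphemes, phonemes, audio) in enumerate(pipeline(text, voice="af_heart")):
    sf.write(f"out_{i}.wav", audio, 24000)
```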

In what kind of format do you want this?

Hi, I tested it today. Nice work. Will there be German too in the future?

It's simple: what you put in is what you get out. 😄 German support in the future depends mostly on how much German data (synthetic audio + text labels) is contributed.

Tell me about quantum mechanics.

If you are looking for Arabic data, there are Common Voice, SADA, MASC, MGB-2, MGB-3, and MGB-5.

Hi, nice work! Do you think it's possible to replace the TTS part of the current end-to-end model (https://huggingface.co/openbmb/MiniCPM-o-2_6) with Kokoro, which I've heard is the perfect speed and size for edge devices?

Would it be possible to use this dataset to train German: https://huggingface.co/datasets/amphion/Emilia-Dataset/tree/main/DE ?
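
A hedged sketch of how the DE split could be inspected with the `datasets` library before committing to a full download; the repo layout and split name are assumptions, and the dataset's license terms would need to permit this use:

```python
# Sketch only: stream a few German samples from Emilia without downloading
# the whole split. The data_dir/split names are assumptions; check the
# dataset card (and its license) before training on it.
from datasets import load_dataset

ds = load_dataset(
    "amphion/Emilia-Dataset",
    data_dir="DE",      # German shards, per the link above
    split="train",
    streaming=True,     # avoid pulling every shard up front
)
for sample in ds.take(3):
    print(sample.keys())
```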

Why don't we try to crowdsource actual human voices? What would the conditions be (i.e., high quality, WAV encoding, clear pronunciation, noise-free environment, etc.)? I mean, 100 hours isn't that much, especially for a small and free model (basically a gift to mankind).
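
If contributions were crowdsourced, those conditions could be screened automatically before a clip enters the mix. A small sketch with `soundfile`; every threshold here is a placeholder assumption, not a stated project requirement:

```python
# Sketch: screen a contributed clip against placeholder intake conditions.
# Assumes `pip install soundfile numpy`; all thresholds are illustrative.
import numpy as np
import soundfile as sf

def passes_intake(path, min_sr=22050, max_seconds=30.0, min_peak_db=-30.0):
    info = sf.info(path)
    if info.format != "WAV":          # WAV encoding required
        return False
    if info.samplerate < min_sr:      # "high-quality" sample rate floor
        return False
    if info.duration > max_seconds:   # keep clips short and labelable
        return False
    audio, _ = sf.read(path)
    peak = float(np.max(np.abs(audio)))
    if peak == 0.0:                   # reject silent files
        return False
    # Crude loudness floor; a real pipeline would also estimate SNR
    # to enforce the noise-free environment condition.
    return 20 * np.log10(peak) >= min_peak_db

print(passes_intake("contribution.wav"))
```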

@hexgrad How many hours of data are needed for a new language, and do you train a separate model for each language or a single model for all?

How do you train this model?

Support for the German language would be great, because it would open up new possibilities for school, university, and continuing education. The solutions I'm aware of are either closed source behind paywalls or simply not good enough. What is the current planning status of this project? As far as I can see, having a German voice could make a big difference...