We just released native support for @SGLang and @vllm-project in Inference Endpoints 🔥
Inference Endpoints is becoming the central place where you deploy high-performance Inference Engines, and it provides the managed infra for them. Instead of spending weeks configuring infrastructure, managing servers, and debugging deployment issues, you can focus on what matters most: your AI model and your users 🙌
It's been a while since I took a step back and looked at the xet-team's progress migrating Hugging Face from Git LFS to Xet, but every time I do, it boggles the mind.
A month ago there were 5,500 users/orgs on Xet with 150K repos and 4 PB. Today?
🤗 700,000 users/orgs
📈 350,000 repos
🚀 15 PB
Meanwhile, our migrations have pushed throughput to numbers that are bonkers. In June, we hit upload speeds of 577 Gb/s (crossing 500 Gb/s for the first time).
These are hard numbers to put into context, but let's try:
The latest crawl from Common Crawl was 471 TB.
We now have ~32 crawls stored in Xet. At peak upload speed we could move the latest crawl into Xet in about two hours.
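A quick back-of-the-envelope check of that two-hour figure, using only the numbers above (471 TB and a 577 Gb/s peak, gigabits per second):

```python
# Sanity check: time to move 471 TB at a peak of 577 Gb/s.
crawl_tb = 471                    # latest Common Crawl snapshot, in terabytes
peak_gbps = 577                   # peak upload throughput, in gigabits/second

crawl_gb = crawl_tb * 1000 * 8    # terabytes -> gigabits
hours = crawl_gb / peak_gbps / 3600
print(f"{hours:.1f} hours")       # -> 1.8 hours, i.e. "about two hours"
```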
We're moving to a new phase in the process, so stay tuned.
This shift in gears means it's also time to roll up our sleeves and look at all the bytes we have and the value we're adding to the community.
I already have some homework from @RichardErkhov to look at the dedupe across their uploads, and I'll be doing the same for other early adopters, big models/datasets, and frequent uploaders (looking at you @bartowski 👀)
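For a sense of what "looking at the dedupe" means, here's a toy sketch of measuring a chunk-level dedupe ratio across uploads. This is a deliberate simplification and not Xet's actual pipeline (Xet uses content-defined chunking; this uses fixed-size chunks, and `dedupe_ratio` is a made-up helper name):

```python
import hashlib

def dedupe_ratio(files: list[bytes], chunk_size: int = 64) -> float:
    """Fraction of uploaded bytes whose chunk was already seen before.

    Toy illustration only: fixed-size chunks + SHA-256 fingerprints,
    not the content-defined chunking a real system like Xet uses.
    """
    seen, total, dup = set(), 0, 0
    for data in files:
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).digest()
            total += len(chunk)
            if digest in seen:
                dup += len(chunk)   # chunk already stored once -> free
            else:
                seen.add(digest)
    return dup / total if total else 0.0

# Two uploads sharing a long common prefix dedupe very well:
a = b"x" * 1024
b = b"x" * 1024 + b"y" * 256
print(f"{dedupe_ratio([a, b]):.0%}")
```

With real repos the interesting number is exactly this ratio per uploader: how many of a model's shards or a dataset's revisions are bytes the storage layer has already seen.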
Let me know if there's anything you're interested in; happy to dig in!