1321 1249

Starstrek

Stars321123

Stars321

AI & ML interests

AI

Recent Activity

liked a dataset about 21 hours ago

bigai/TongSIM-Asset

Updated 2 days ago • 7.73k • 227

liked a dataset 1 day ago

Fsoft-AIC/RepoExec

Viewer • Updated Jun 23, 2024 • 1.07k • 530 • 5

liked 2 models 1 day ago

kyutai/moshiko-pytorch-bf16

Updated Sep 18, 2024 • 166k • 194

MiniMaxAI/MiniMax-M2.1

Text Generation • 229B • Updated 1 day ago • 60k • 521

upvoted 8 collections 1 day ago

CASA

CASA: Cross-Attention as Self-Attention for Efficient Vision-Language Fusion on long context streaming inputs • 6 items • Updated 6 days ago • 5

ARC-Encoders

Pretrained ARC-Encoders and a fine-tuning dataset: context compression for unmodified LLMs. • 7 items • Updated 5 days ago • 4

Text-To-Speech

https://kyutai.org/next/tts • 4 items • Updated Sep 11 • 19

Speech-To-Text

https://kyutai.org/next/stt • 6 items • Updated 19 days ago • 19

Helium 1

Helium 1: a modular and multilingual LLM • 9 items • Updated 19 days ago • 11

MoshiVis v0.1

MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 9 items • Updated 6 days ago • 23

Hibiki fr-en

Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 7 items • Updated 5 days ago • 55

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 16 items • Updated 5 days ago • 242

upvoted a paper 1 day ago

Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Paper • 2512.20605 • Published 6 days ago • 52

upvoted 4 collections 2 days ago

Dhara Foundational Models

Diffusion Language Models combining deep narrow networks, Canon layers (depthwise causal convolutions), and WSD (Warmup-Stable-Decay) training. • 1 item • Updated 3 days ago • 2

Pivotal Token Search

Pivotal Token Search (PTS) identifies tokens in a language model's generation that significantly impact the probability of success • 12 items • Updated 10 days ago • 5

Internal Coherence Maximization

Internal Coherence Maximization (ICM): A Label-Free, Unsupervised Training Framework for LLMs • 7 items • Updated Oct 10 • 4

Securade.ai

All models, datasets and tools related to https://securade.ai • 3 items • Updated Oct 10 • 3

upvoted an article 2 days ago

The Optimal Architecture for Small Language Models

Article • 3 days ago • 54

upvoted a collection 2 days ago

Math

7 items • Updated 17 days ago • 2

upvoted a paper 3 days ago

Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models

Paper • 2512.21337 • Published 5 days ago • 24