CASA Collection CASA: Cross-Attention as Self-Attention for Efficient Vision-Language Fusion on long-context streaming inputs • 6 items • Updated 6 days ago • 5
ARC-Encoders Collection Pretrained ARC-Encoders and a fine-tuning dataset: context compression for unmodified LLMs. • 7 items • Updated 5 days ago • 4
MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 9 items • Updated 6 days ago • 23
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 7 items • Updated 5 days ago • 55
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 16 items • Updated 5 days ago • 242
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published 6 days ago • 52
Dhara Foundational Models Collection Diffusion Language Models combining deep narrow networks, Canon layers (depthwise causal convolutions), and WSD (Warmup-Stable-Decay) training. • 1 item • Updated 3 days ago • 2
Pivotal Token Search Collection Pivotal Token Search (PTS) identifies tokens in a language model's generation that significantly impact the probability of success • 12 items • Updated 10 days ago • 5
Internal Coherence Maximization Collection Internal Coherence Maximization (ICM): A Label-Free, Unsupervised Training Framework for LLMs • 7 items • Updated Oct 10 • 4
Securade.ai Collection All models, datasets and tools related to https://securade.ai • 3 items • Updated Oct 10 • 3
Beyond Memorization: A Multi-Modal Ordinal Regression Benchmark to Expose Popularity Bias in Vision-Language Models Paper • 2512.21337 • Published 5 days ago • 24