This works way better than I expected! Wow, great! So the next obvious step is avatar animation... somehow. For the browser, it seems we need glTF avatars with three.js and some kind of phoneme animation.
Deliberation in Latent Space via Differentiable Cache Augmentation Paper • 2412.17747 • Published Dec 23, 2024 • 32
Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints Article • Published May 1, 2024 • 80
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models Paper • 2312.06109 • Published Dec 11, 2023 • 21