Recent Activity
Papers
NVIDIA Nemotron 3: Efficient and Open Intelligence
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
Articles
Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models
-
nvidia/Nemotron-Cascade-8B
Text Generation • 8B • Updated • 1.69k • 40 -
nvidia/Nemotron-Cascade-8B-Thinking
Text Generation • 8B • Updated • 1.13k • 25 -
nvidia/Nemotron-Cascade-14B-Thinking
Text Generation • 15B • Updated • 2.11k • 43 -
nvidia/Nemotron-Cascade-8B-Intermediate-ckpts
Text Generation • Updated • 6
Collection of datasets used in the post-training phase of Nemotron Nano v3.
Collection of RL verifiable data for NeMo Gym
-
nvidia/Nemotron-RL-knowledge-web_search-mcqa
Viewer • Updated • 2.93k • 556 • 4 -
nvidia/Nemotron-RL-agent-workplace_assistant
Viewer • Updated • 1.8k • 468 • 8 -
nvidia/Nemotron-RL-instruction_following
Preview • Updated • 141 • 7 -
nvidia/Nemotron-RL-instruction_following-structured_outputs
Viewer • Updated • 9.95k • 452 • 25
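The datasets above pair prompts with programmatically checkable answers, so an RL reward can be computed by a simple verifier rather than a learned judge. A minimal sketch of what a verifiable reward for the MCQA data might look like — the record layout and function names here are illustrative assumptions, not the NeMo Gym API:

```python
# Illustrative verifiable reward for multiple-choice QA data.
# Function names and answer format are hypothetical, not the NeMo Gym API.
import re

def extract_choice(response):
    """Pull the last standalone A-D letter from a model response."""
    matches = re.findall(r"\b([A-D])\b", response)
    return matches[-1] if matches else None

def mcqa_reward(response, gold):
    """Binary verifiable reward: 1.0 iff the extracted choice matches gold."""
    choice = extract_choice(response)
    return 1.0 if choice == gold.strip().upper() else 0.0

print(mcqa_reward("The answer is B", "B"))  # 1.0
print(mcqa_reward("I think it is C", "B"))  # 0.0
```

Because the check is deterministic, the same function can score rollouts during RL training and filter data offline.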
A collection of generative models quantized and optimized for inference with Model Optimizer.
-
nvidia/DeepSeek-R1-0528-NVFP4
Text Generation • 397B • Updated • 8.8k • 39 -
nvidia/DeepSeek-R1-0528-NVFP4-v2
Text Generation • 394B • Updated • 138k • 11 -
nvidia/Kimi-K2-Thinking-NVFP4
Text Generation • Updated • 5.67k • 14 -
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
Text Generation • 32B • Updated • 309k • 201
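The NVFP4 and FP8 checkpoints above store weights in low precision with fine-grained scale factors. As a rough intuition for how block-scaled quantization works — this is a symmetric integer sketch under assumed block size, not the actual NVFP4 bit layout or the Model Optimizer API:

```python
# Illustrative block-scaled 4-bit quantization, in the spirit of formats
# like NVFP4. Symmetric integer sketch only; NOT the real NVFP4 layout.
import numpy as np

def quantize_blockwise(w, block=16, bits=4):
    """Quantize a 1-D weight vector with one scale per block."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit symmetric
    pad = (-len(w)) % block
    blocks = np.pad(w, (0, pad)).reshape(-1, block)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / qmax
    scales[scales == 0] = 1.0                  # avoid divide-by-zero on pad
    q = np.clip(np.round(blocks / scales), -qmax, qmax).astype(np.int8)
    return q, scales

def dequantize_blockwise(q, scales, n):
    return (q * scales).reshape(-1)[:n]

w = np.random.default_rng(0).normal(size=100).astype(np.float32)
q, s = quantize_blockwise(w)
w_hat = dequantize_blockwise(q, s, len(w))
print("max abs error:", np.abs(w - w_hat).max())
```

Per-block scales keep the rounding error proportional to each block's local magnitude, which is why block-scaled formats tolerate outlier weights far better than a single per-tensor scale.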
A collection related to the Alpamayo-R1 Reasoning VLA.
A framework of composable PyTorch modules for developing physics-guided machine-learning training pipelines. https://github.com/NVIDIA/physicsnemo
-
Earth2 Inference Demo
🟢 • 4 • Access JupyterLab for interactive coding
-
DoMINO with Ahmed Body Dataset - Multi-Scale Neural Operator for CFD
🟢 • 3 • Access JupyterLab for interactive coding
-
Modeling Magnetohydrodynamics with PhysicsNeMo
🟢 • 2 • Access JupyterLab for interactive coding
-
nvidia/fourcastnet3
Updated • 246 • 8
A collection of great reward models for research and production
-
nvidia/Llama-3.3-Nemotron-70B-Reward-Principle
Text Generation • 71B • Updated • 106 • 5 -
nvidia/Qwen3-Nemotron-32B-GenRM-Principle
Text Generation • 33B • Updated • 831 • 11 -
nvidia/Qwen3-Nemotron-32B-RLBFF
Text Generation • 33B • Updated • 141 • 27 -
nvidia/Qwen3-Nemotron-8B-BRRM
Text Generation • Updated • 735 • 8
Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation
-
nvidia/llama-embed-nemotron-8b
Feature Extraction • 8B • Updated • 77.8k • 98 -
nvidia/omni-embed-nemotron-3b
Feature Extraction • 5B • Updated • 1.21k • 78 -
nvidia/llama-nemoretriever-colembed-3b-v1
Visual Document Retrieval • 4B • Updated • 834 • 67 -
nvidia/llama-nemoretriever-colembed-1b-v1
Visual Document Retrieval • 2B • Updated • 3.5k • 18
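The embedding and retriever models above map queries and documents into a shared vector space, where retrieval reduces to ranking by cosine similarity. A minimal sketch of that ranking step — random vectors stand in for real model outputs here, so the numbers are illustrative only:

```python
# How retrieval with embedding models such as llama-embed-nemotron-8b
# typically works: encode query and documents to vectors, then rank
# documents by cosine similarity. Random vectors stand in for a model.
import numpy as np

def cosine_rank(query_vec, doc_vecs):
    """Return document indices sorted by descending cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return np.argsort(-(d @ q))

rng = np.random.default_rng(0)
docs = rng.normal(size=(5, 8))
query = docs[2] + 0.01 * rng.normal(size=8)   # near-duplicate of doc 2
print(cosine_rank(query, docs))                # doc 2 should rank first
```

In practice the document vectors are precomputed and indexed, so only the query is embedded at search time.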
NVIDIA Clara Models for Molecular Science
Cosmos Reason 2 is an open, customizable, reasoning vision language model (VLM) for physical AI and robotics
3D-Informed World-Consistent Video Generation with Precise Camera Control
Accelerated models for digital biology by the NVIDIA BioNeMo team. https://www.nvidia.com/en-us/clara/biopharma/
Collection of models for OpenReasoning-Nemotron, trained on 5M reasoning traces for math, code, and science.
-
nvidia/OpenReasoning-Nemotron-1.5B
Text Generation • 2B • Updated • 358 • 51 -
nvidia/OpenReasoning-Nemotron-7B
Text Generation • 8B • Updated • 403 • 47 -
nvidia/OpenReasoning-Nemotron-14B
Text Generation • 15B • Updated • 477 • 42 -
nvidia/OpenReasoning-Nemotron-32B
Text Generation • 33B • Updated • 1.04k • 120
Open-weight Audio2Face-3D and Audio2Emotion networks and a sample dataset for training and evaluation
Mamba-Transformer hybrid models
-
nvidia/Nemotron-H-47B-Reasoning-128K
Text Generation • 47B • Updated • 348 • 19 -
nvidia/Nemotron-H-8B-Reasoning-128K
Text Generation • 8B • Updated • 773 • 25 -
nvidia/Nemotron-H-8B-Reasoning-128K-FP8
Text Generation • 8B • Updated • 62 • 12 -
nvidia/Nemotron-H-47B-Reasoning-128K-FP8
Text Generation • 47B • Updated • 78 • 5
Multimodal Large Language Models for Detailed Localized Image and Video Captioning
Models and datasets from "AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset"
Reasoning data for supervised finetuning of LLMs to advance code generation and critique
Benchmarks for evaluating synthetic verifiers, such as test-case generators and code reward models (as described in https://www.arxiv.org/abs/2502.13820).
Multimodal world understanding through reasoning
A suite of image and video tokenizers
A suite of image and video tokenizers
Collection of open, commercial-grade datasets for physical AI developers
⚠️ This collection is archived.
👉 https://huggingface.co/collections/nvidia/cosmos-predict25
We are releasing math instruction models, math reward models, general instruction models, all training datasets, and a math reward benchmark.
Eagle is a family of frontier vision-language models with data-centric strategies. The models support both HD image and long-context video input.
A series of Hybrid Small Language Models.
A family of frontier-class multimodal large language models (LLMs) that achieve state-of-the-art results on vision-language tasks and text-only tasks.
Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models.
NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants.
-
nvidia/parakeet-rnnt-1.1b
Automatic Speech Recognition • Updated • 684 • 163 -
nvidia/parakeet-ctc-1.1b
Automatic Speech Recognition • 1B • Updated • 163k • 39 -
nvidia/parakeet-rnnt-0.6b
Automatic Speech Recognition • Updated • 2.34k • 12 -
nvidia/parakeet-ctc-0.6b
Automatic Speech Recognition • 0.6B • Updated • 5.31k • 23
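The CTC variants above emit a per-frame token distribution that is decoded by collapsing consecutive repeats and removing blank tokens. A minimal sketch of greedy CTC decoding over already-argmaxed frame IDs — token IDs and the blank index are illustrative assumptions:

```python
# Greedy CTC decoding, as used by the CTC Parakeet variants: take the
# argmax token per frame, collapse consecutive repeats, drop blanks.
def ctc_greedy_decode(frame_ids, blank=0):
    out, prev = [], None
    for t in frame_ids:
        if t != prev and t != blank:
            out.append(t)
        prev = t
    return out

# Frames: blank, 'h', 'h', blank, 'i', 'i'  ->  tokens for "hi"
print(ctc_greedy_decode([0, 8, 8, 0, 9, 9]))  # [8, 9]
```

A blank between two identical tokens is what lets CTC represent genuine doubled characters, e.g. frames `[1, 0, 1]` decode to `[1, 1]` rather than `[1]`.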
InstructRetro is an autoregressive decoder-only language model (LM) with retrieval-augmented pretraining and instruction tuning.
A collection of models trained with Reinforcement Learning from Human Feedback (RLHF).
Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG).
The Nemotron 3 8B Family of models is optimized for building production-ready generative AI applications for the enterprise.
MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes both 1K and 21K pretrained models.
-
nvidia/MambaVision-L3-512-21K
Image Classification • 0.7B • Updated • 363 • 54 -
nvidia/MambaVision-L3-256-21K
Image Classification • 0.7B • Updated • 120 • 7 -
nvidia/MambaVision-L2-512-21K
Image Classification • 0.2B • Updated • 167 • 3 -
nvidia/MambaVision-L-21K
Image Classification • 0.2B • Updated • 83 • 4
A family of compressed models obtained via pruning and knowledge distillation
-
nvidia/Mistral-NeMo-Minitron-8B-Base
Text Generation • 8B • Updated • 2.78k • 176 -
nvidia/Mistral-NeMo-Minitron-8B-Instruct
Text Generation • 8B • Updated • 1.93k • 81 -
nvidia/Llama-3_1-Nemotron-51B-Instruct
Text Generation • 52B • Updated • 1.16k • 209 -
nvidia/Llama-3.1-Minitron-4B-Width-Base
Text Generation • 5B • Updated • 1.53k • 193
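The Minitron-style models above are produced by pruning a larger model and then distilling it against the original teacher. A toy sketch of the two ingredients — magnitude pruning and a KL distillation loss — using plain NumPy; this is an illustration of the general techniques, not NVIDIA's actual pruning recipe:

```python
# Toy sketch of pruning + distillation: zero the smallest-magnitude
# weights, then train the student to match teacher logits via KL.
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero the smallest-magnitude fraction of weights."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distill_kl(student_logits, teacher_logits):
    """Mean KL(teacher || student) over the vocabulary axis."""
    p, q = softmax(teacher_logits), softmax(student_logits)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean())

w = np.array([[0.1, -2.0, 0.05], [1.5, -0.02, 0.7]])
print(magnitude_prune(w, 0.5))                       # 3 weights zeroed
print(distill_kl(np.zeros((1, 3)), np.zeros((1, 3))))  # 0.0 when identical
```

In the real pipeline, pruning targets structured units (width, depth, attention heads) and the distillation loss is minimized over a large corpus, but the objective has this same shape.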
This collection presents ChatQA-2, a suite of 128K long-context models that also have exceptional RAG capabilities.
Large-scale pre-training datasets used in the Nemotron family of models.
Open, Production-ready Enterprise Models
-
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-Base-BF16
Text Generation • 32B • Updated • 3.36k • 79 -
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16
Text Generation • 32B • Updated • 161k • 467 -
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8
Text Generation • 32B • Updated • 309k • 201 -
nvidia/Qwen3-Nemotron-235B-A22B-GenRM
Text Generation • 235B • Updated • 186 • 14
Large-scale pre-training datasets used in the Nemotron family of models.
Open, Production-ready Enterprise Models, released under the NVIDIA Open Model License.
-
nvidia/NVIDIA-Nemotron-Nano-12B-v2
Text Generation • 12B • Updated • 43.5k • 148 -
nvidia/NVIDIA-Nemotron-Nano-9B-v2
Text Generation • 9B • Updated • 78.2k • 456 -
nvidia/NVIDIA-Nemotron-Nano-9B-v2-Base
Text Generation • 9B • Updated • 90.1k • 42 -
nvidia/NVIDIA-Nemotron-Nano-12B-v2-Base
Text Generation • 12B • Updated • 1.91k • 87
A collection of speculative decoding modules created using Model Optimizer.
-
nvidia/gpt-oss-120b-Eagle3-short-context
Text Generation • Updated • 4.87k • 10 -
nvidia/gpt-oss-120b-Eagle3-long-context
Text Generation • 0.2B • Updated • 12.5k • 53 -
nvidia/gpt-oss-120b-Eagle3-throughput
Text Generation • Updated • 524 • 28 -
nvidia/Qwen3-235B-A22B-Eagle3
Text Generation • 0.3B • Updated • 143 • 8
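The Eagle3 modules above are draft models for speculative decoding: a cheap drafter proposes several tokens, the target model verifies them in one forward pass, and generation keeps the longest agreeing prefix. A toy sketch of the greedy-verification loop, with both "models" reduced to fixed token lists for illustration:

```python
# Toy sketch of the speculative-decoding step that Eagle3-style draft
# modules plug into. Both "models" are stand-in token sequences here;
# real implementations compare per-position distributions, not tokens.
def speculative_step(draft_tokens, target_tokens):
    """Accept draft tokens up to the first disagreement, then take the
    target model's token at that position (greedy-verification variant)."""
    accepted = []
    for d, t in zip(draft_tokens, target_tokens):
        if d == t:
            accepted.append(d)
        else:
            accepted.append(t)           # correction from the target model
            break
    else:
        return accepted, len(draft_tokens)   # all draft tokens accepted
    return accepted, len(accepted) - 1

tokens, n_accepted = speculative_step([5, 7, 9, 2], [5, 7, 1, 2])
print(tokens, n_accepted)  # [5, 7, 1] 2
```

The speed-up comes from the fact that every step emits at least one token (the target's correction), while accepted draft tokens cost only a single batched verification pass.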
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
A collection of tokenizers, diffusion models, and datasets relevant to the cosmos-drive-dreams platform.
Open, Production-ready Enterprise Models
-
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5
Text Generation • 50B • Updated • 178k • 218 -
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5-FP8
Text Generation • 50B • Updated • 1.83k • 23 -
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1
Text Generation • 253B • Updated • 218k • 342 -
nvidia/Llama-3_3-Nemotron-Super-49B-v1
Text Generation • 50B • Updated • 12.8k • 320
NVIDIA Clara Open Models for medical imaging AI: segment, generate, and reason across CT, MRI, and X-ray. Built on MONAI by NVIDIA.
NVIDIA Clara Models for Biology
Predict2.5
A collection of multilingual, region-specific synthetic persona datasets that support sovereign AI development across many countries and regions.
Ultra-efficient reasoning models with SOTA accuracy/CoT-length trade-offs.
World Foundation Model for Future Prediction
Nemotron reward models, for use in RLHF pipelines and as LLM-as-a-judge.
-
nvidia/Llama-3_3-Nemotron-Super-49B-GenRM
Text Generation • 50B • Updated • 171 • 18 -
nvidia/Llama-3_3-Nemotron-Super-49B-GenRM-Multilingual
Text Generation • 50B • Updated • 122 • 6 -
nvidia/Llama-3.3-Nemotron-70B-Reward
Text Generation • 71B • Updated • 573 • 2 -
nvidia/Llama-3.3-Nemotron-70B-Reward-Multilingual
Text Generation • 71B • Updated • 121 • 10
Math and Code reasoning model trained through reinforcement learning (RL)
Joint video-text embedding for physical AI
Math reasoning models trained through reinforcement learning (RL)
Reasoning data for supervised finetuning of LLMs to advance data distillation for competitive coding
-
nvidia/OpenCodeReasoning
Viewer • Updated • 753k • 4.08k • 519 -
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding
Paper • 2504.01943 • Published • 15 -
nvidia/OpenCodeReasoning-Nemotron-7B
Text Generation • 8B • Updated • 177 • 37 -
nvidia/OpenCodeReasoning-Nemotron-14B
Text Generation • 15B • Updated • 142 • 18
A novel inference-time scaling (ITS) approach for open-ended tasks; No. 1 on Arena Hard as of 18 Mar 2025.
Multimodal Conditional World Generation for World2World Transfer
World Foundation Model for Future Prediction
SOTA models on Arena Hard and RewardBench as of 1 Oct 2024.
-
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
Text Generation • 71B • Updated • 5.26k • 2.06k -
nvidia/Llama-3.1-Nemotron-70B-Reward-HF
71B • Updated • 1.42k • 89 -
nvidia/HelpSteer2
Viewer • Updated • 21.4k • 5.19k • 434 -
HelpSteer2-Preference: Complementing Ratings with Preferences
Paper • 2410.01257 • Published • 24
QLIP is a family of image tokenizers with SOTA reconstruction quality and zero-shot image understanding.
LLMs equipped with Dynamic Memory Compression to accelerate generation.
Essential datasets and models for content safety, topic-following, and security guardrails
-
nvidia/Aegis-AI-Content-Safety-Dataset-2.0
Viewer • Updated • 33.4k • 3.56k • 71 -
nvidia/llama-3.1-nemoguard-8b-topic-control
Text Classification • Updated • 603 • 16 -
nvidia/llama-3.1-nemoguard-8b-content-safety
Text Classification • Updated • 526 • 30 -
nvidia/CantTalkAboutThis-Topic-Control-Dataset
Viewer • Updated • 1.09k • 114 • 9
A series of Neural Audio Codecs
Collection of optimized ONNX model checkpoints for NVIDIA RTX GPUs
A collection of models and datasets introduced in "OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data"
A collection of models and datasets relating to SteerLM and HelpSteer.
A collection of multilingual, multitask speech-to-text models from NVIDIA NeMo 🐤
-
nvidia/canary-1b
Automatic Speech Recognition • Updated • 4.75k • 452 -
nvidia/canary-1b-flash
Automatic Speech Recognition • 0.8B • Updated • 249k • 261 -
nvidia/canary-180m-flash
Automatic Speech Recognition • Updated • 980 • 84 -
Training and Inference Efficiency of Encoder-Decoder Speech Models
Paper • 2503.05931 • Published • 4
A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset"
NV-Embed is a generalist embedding model encompassing retrieval, reranking, classification, clustering, and semantic textual similarity (STS) tasks.
A collection of Mamba-2-based research models with 8B parameters trained on 3.5T tokens for comparison with Transformers.
BigVGAN is a universal neural vocoder that generates audio waveforms from mel-spectrogram input.
Enabling 4k resolution for VLMs, CVPR 2025, https://nvlabs.github.io/PS3/
A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.).
Classifier models that can be used in NeMo Curator for labelling/filtering datasets.