nvidia/OpenCodeReasoning-Nemotron-7B


wasiuddina committed on Apr 15

Commit e1f0cca · verified · 1 Parent(s): f9f5111

Update README.md

Files changed (1)

README.md +2 -5

@@ -14,11 +14,8 @@ tags:
 ## Model Overview
-Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) which is a derivative of [Meta Llama-3.1-405B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct) (AKA the *reference model*). It is a reasoning model that is post trained for reasoning, human chat preferences, and tasks, such as RAG and tool calling. The model supports a context length of 128K tokens. This model fits on a single 8xH100 node for inference.
-Llama-3.1-Nemotron-Ultra-253B-v1 is a model which offers a great tradeoff between model accuracy and efficiency. Efficiency (throughput) directly translates to savings. Using a novel Neural Architecture Search (NAS) approach, we greatly reduce the model’s memory footprint, enabling larger workloads, as well as reducing the number of GPUs required to run the model in a data center environment. This NAS approach enables the selection of a desired point in the accuracy-efficiency tradeoff. Furthermore, by using a novel method to vertically compress the model (see details [here](https://arxiv.org/abs/2503.18908)), it also offers a significant improvement in latency.
-The model underwent a multi-phase post-training process to enhance both its reasoning and non-reasoning capabilities. This includes a supervised fine-tuning stage for Math, Code, Reasoning, Chat, and Tool Calling as well as multiple reinforcement learning (RL) stages using Group Relative Policy Optimization (GRPO) algorithms for reasoning, chat, and instruction-following.
 This model is ready for commercial use.

 ## Model Overview
+OpenCodeReasoning-Distill-Qwen-7B-Instruct is a large language model (LLM) which is a derivative of [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) (AKA the *reference model*).
+It is a reasoning model that is post trained for reasoning while code generation. The model supports a context length of 32K tokens.
 This model is ready for commercial use.