wasiuddina committed on
Commit 3d4634a · verified · 1 Parent(s): 7eab0cb

Update README.md

Files changed (1)
  1. README.md +11 -11
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
 base_model:
- - Qwen/Qwen2.5-7B-Instruct
+ - Qwen/Qwen2.5-32B-Instruct
 datasets:
 - nvidia/OpenCodeReasoning
 language:
@@ -13,10 +13,10 @@ tags:
 pipeline_tag: text-generation
 ---
 
- # OpenCodeReasoning-Nemotron-7B-v1.1 Overview
+ # OpenCodeReasoning-Nemotron-32B-v1.1 Overview
 
 ## Description: <br>
- OpenCodeReasoning-Nemotron-7B-v1.1 is a large language model (LLM) which is a derivative of Qwen2.5-7B-Instruct (AKA the reference model). It is a reasoning model that is post-trained for reasoning for code generation. The model supports a context length of 64k tokens. <br>
+ OpenCodeReasoning-Nemotron-32B-v1.1 is a large language model (LLM) which is a derivative of Qwen2.5-32B-Instruct (AKA the reference model). It is a reasoning model that is post-trained for reasoning for code generation. The model supports a context length of 64k tokens. <br>
 
 This model is ready for commercial/non-commercial use. <br>
 
@@ -73,7 +73,7 @@ To run inference on coding problems:
 import transformers
 import torch
 
- model_id = "nvidia/OpenCodeReasoning-Nemotron-7B-v1.1"
+ model_id = "nvidia/OpenCodeReasoning-Nemotron-32B-v1.1"
 
 pipeline = transformers.pipeline(
     "text-generation",
@@ -128,10 +128,10 @@ If you find the data useful, please cite:
 
 ## Model Architecture: <br>
 Architecture Type: Dense decoder-only Transformer model
- Network Architecture: Qwen-7B-Instruct
+ Network Architecture: Qwen-32B-Instruct
 <br>
- **This model was developed based on Qwen2.5-7B-Instruct and has 7B model parameters. <br>**
- **OpenCodeReasoning-Nemotron-7B was developed based on Qwen2.5-7B-Instruct and has 7B model parameters. <br>**
+ **This model was developed based on Qwen2.5-32B-Instruct and has 32B model parameters. <br>**
+ **OpenCodeReasoning-Nemotron-32B was developed based on Qwen2.5-32B-Instruct and has 32B model parameters. <br>**
 
 ## Input: <br>
 **Input Type(s):** Text <br>
@@ -165,21 +165,21 @@ OpenCodeReasoning-Nemotron-32B-v1.1<br>
 
 ## Training Dataset:
 
- The training corpus for OpenCodeReasoning-Nemotron-7B-v1.1 is [OpenCodeReasoning](https://huggingface.co/datasets/nvidia/OpenCodeReasoning) dataset, which is composed of competitive programming questions and DeepSeek-R1 generated responses.
+ The training corpus for OpenCodeReasoning-Nemotron-32B-v1.1 is [OpenCodeReasoning](https://huggingface.co/datasets/nvidia/OpenCodeReasoning) dataset, which is composed of competitive programming questions and DeepSeek-R1 generated responses.
 
 Data Collection Method: Hybrid: Automated, Human, Synthetic <br>
 Labeling Method: Hybrid: Automated, Human, Synthetic <br>
 Properties: 1.165M samples from OpenCodeReasoning (https://huggingface.co/datasets/nvidia/OpenCodeReasoning)
 
 ## Evaluation Dataset:
- We used the datasets listed in the next section to evaluate OpenCodeReasoning-Nemotron-7B-v1.1. <br>
+ We used the datasets listed in the next section to evaluate OpenCodeReasoning-Nemotron-32B-v1.1. <br>
 **Data Collection Method: Hybrid: Automated, Human, Synthetic <br>**
 **Labeling Method: Hybrid: Automated, Human, Synthetic <br>**
 
 
 
 ### License/Terms of Use: <br>
- GOVERNING TERMS: Use of this model is governed by [Apache 2.0](https://huggingface.co/nvidia/OpenCodeReasoning-Nemotron-7B-v1.1/blob/main/LICENSE).
+ GOVERNING TERMS: Use of this model is governed by [Apache 2.0](https://huggingface.co/nvidia/OpenCodeReasoning-Nemotron-32B-v1.1/blob/main/LICENSE).
 
 ### Deployment Geography:
 Global<br>
@@ -188,7 +188,7 @@ Global<br>
 This model is intended for developers and researchers building LLMs. <br>
 
 ### Release Date: <br>
- Huggingface [06/20/2025] via https://huggingface.co/nvidia/OpenCodeReasoning-Nemotron-7B-v1.1/ <br>
+ Huggingface [06/20/2025] via https://huggingface.co/nvidia/OpenCodeReasoning-Nemotron-32B-v1.1/ <br>
 
 ## Reference(s):
 [2504.01943] OpenCodeReasoning: Advancing Data Distillation for Competitive Coding
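The hunk at line 73 shows only the start of the README's inference snippet, with the `model_id` switched to the 32B checkpoint. A minimal sketch of a complete call using that updated identifier is shown below; the prompt text, `max_new_tokens` value, and dtype/device settings are illustrative assumptions, not the README's exact values.

```python
import transformers
import torch

# Updated checkpoint referenced by this commit.
model_id = "nvidia/OpenCodeReasoning-Nemotron-32B-v1.1"

# Standard transformers text-generation pipeline; bfloat16 and device_map="auto"
# are illustrative choices and can be adjusted to the available hardware.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Hypothetical coding problem; the README defines its own prompt template.
messages = [
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]

# With chat-style input, the pipeline returns the full conversation; the last
# message holds the model's reply.
outputs = pipeline(messages, max_new_tokens=1024)
print(outputs[0]["generated_text"][-1]["content"])
```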