Update README.md
README.md
CHANGED
@@ -10,6 +10,8 @@ widget:
 
 This model is for debugging. It is randomly initialized with the config from [deepseek-ai/DeepSeek-V3](https://huggingface.co/deepseek-ai/DeepSeek-V3) but is of smaller size.
 
+**Note: This repo does not contain the Multi-Token Prediction (MTP) module as explained [here](https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/README_WEIGHTS.md).**
+
 Usage:
 ```python
 import torch