Update README.md
README.md
@@ -30,12 +30,12 @@ using mlx-lm version **0.25.2**.
mlx_lm.lora --model "mlx-community/Qwen2.5-3B-8bit" --data data --iters 2500 --max-seq-length 200 --num-layers 16 --batch-size 8 --save-every 25 --wandb diegogpt --train
```
Peak memory: 6.2GB
- Trained tokens:
+ Trained tokens: around 1.1M
The dataset is 11326 lines, so this trained for ~1.8 epochs.
Calculation:
Total samples seen = iters × batch size = 2500 × 8 = 20000
Epochs = total samples ÷ dataset size = 20000 ÷ 11326 ≈ 1.8
- I've tested the 200th iteration (
+ I've tested the checkpoint from the 200th iteration (about 0.14 epoch by the same arithmetic) and it worked similarly well, but here I'm providing the 2500th-iteration checkpoint.

## Charts

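
For reference, the epoch and token figures above follow from the flags in the training command. Below is a minimal sketch of the arithmetic (plain Python, illustrative only, not part of this commit); it assumes one training example per line of the dataset and takes the dataset size and trained-token count from the README text.

```python
# Minimal sketch of the epoch/token arithmetic above (illustrative, not part of the commit).
# Values come from the mlx_lm.lora command and the README text; one example per line is assumed.
iters = 2500              # --iters
batch_size = 8            # --batch-size
dataset_examples = 11326  # dataset line count (assumed one example per line)
trained_tokens = 1.1e6    # "around 1.1M" reported after training

samples_seen = iters * batch_size                      # 2500 * 8 = 20000
epochs = samples_seen / dataset_examples               # ~1.8
avg_tokens_per_sample = trained_tokens / samples_seen  # ~55, well under --max-seq-length 200

print(f"samples seen:          {samples_seen}")
print(f"epochs:                {epochs:.2f}")
print(f"avg tokens per sample: {avg_tokens_per_sample:.0f}")
```

The one-example-per-line assumption is what equates dataset size with line count; if the data is packed differently, the epoch figure changes proportionally.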