Update README.md
Training command, using mlx-lm version **0.25.2**:
```
mlx_lm.lora --model "mlx-community/Qwen2.5-3B-8bit" --data data --iters 2500 --max-seq-length 200 --num-layers 16 --batch-size 8 --save-every 25 --wandb diegogpt --train
```
Peak memory: 6.2GB
Trained tokens: around 1.1M
The dataset is 11326 lines, so this run covers roughly 1.8 epochs.

Calculation:
Total samples seen = iters × batch size = 2500 × 8 = 20000
Epochs = total samples ÷ dataset size = 20000 ÷ 11326 ≈ 1.8
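
The same arithmetic as a short Python sketch, for anyone who wants to re-check it. The hyperparameters are taken from the command above; the tokens-per-sample figure is only an estimate backed out of the reported ~1.1M trained tokens:

```python
iters = 2500                 # --iters from the training command
batch_size = 8               # --batch-size
dataset_lines = 11326        # examples in the training data
reported_tokens = 1_100_000  # "Trained tokens: around 1.1M"

samples_seen = iters * batch_size                    # 20000 examples drawn during training
epochs = samples_seen / dataset_lines                # ~1.77 passes over the dataset
tokens_per_sample = reported_tokens / samples_seen   # ~55 tokens per example (estimate)

print(f"samples seen: {samples_seen}, epochs: {epochs:.2f}, ~{tokens_per_sample:.0f} tokens/sample")
```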
I've tested the checkpoint from the 200th iteration (about 1,600 samples seen, well under one epoch) and it worked similarly well, but the adapter provided here is the one from the 2500th iteration.
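
A minimal sketch for trying a saved checkpoint with the mlx-lm Python API. It assumes the adapters ended up in mlx_lm.lora's default `adapters` output directory, and the prompt is just an illustration:

```python
from mlx_lm import load, generate

# Load the 8-bit base model together with the fine-tuned LoRA adapters.
# "adapters" is mlx_lm.lora's default output directory for the trained weights.
model, tokenizer = load("mlx-community/Qwen2.5-3B-8bit", adapter_path="adapters")

# Illustrative prompt, formatted with the model's chat template.
messages = [{"role": "user", "content": "Hey, how's it going?"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

print(generate(model, tokenizer, prompt=prompt, max_tokens=100))
```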
## Charts
![loss]()