DuckyBlender committed (verified)
Commit b7706ec · 1 Parent(s): 44319b2

Update README.md

Files changed (1): README.md (+2 -2)

README.md CHANGED
@@ -30,12 +30,12 @@ using mlx-lm version **0.25.2**.
  mlx_lm.lora --model "mlx-community/Qwen2.5-3B-8bit" --data data --iters 2500 --max-seq-length 200 --num-layers 16 --batch-size 8 --save-every 25 --wandb diegogpt --train
  ```
  Peak memory: 6.2GB
- Trained tokens: ~1.1M
+ Trained tokens: around 1.1M
  Dataset is 11326 lines, so trained for ~16 epochs.
  Calculation:
  Total samples seen = iters × batch size = 11326 × 16 = 181216
  Epochs = total samples ÷ dataset size = 181216 ÷ 11326 ≈ 16
- I've tested the 200th iteration (~1.5 epochs) and it worked similarly well, but here I uploaded the 2500th iteration.
+ I've tested the 200th iteration (around 1.5 epochs) and it worked similarly well, but here I uploaded the 2500th iteration.

  ## Charts
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6317acd6212fce5a3cd793f6/pSn5-gcMzDHXPyD1-PY-d.png)
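To try the adapter this run produces, generation can be pointed at the LoRA weights. This is a minimal sketch, not part of the commit: it assumes the adapter directory is mlx-lm's default `adapters` (the training command above does not override it) and reuses the same quantized base model; the prompt is only a placeholder.

```
# Generate with the base model plus the trained LoRA adapter (adapter dir assumed to be the default "adapters")
mlx_lm.generate --model "mlx-community/Qwen2.5-3B-8bit" --adapter-path adapters --prompt "Hello" --max-tokens 100
```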