Update README.md
README.md
@@ -30,12 +30,12 @@ using mlx-lm version **0.25.2**.
mlx_lm.lora --model "mlx-community/Qwen2.5-3B-8bit" --data data --iters 2500 --max-seq-length 200 --num-layers 16 --batch-size 8 --save-every 25 --wandb diegogpt --train
```
Peak memory: 6.2GB
- Trained tokens:
+ Trained tokens: around 1.1M
The dataset is 11326 lines, so this trained for ~1.8 epochs.
Calculation:
Total samples seen = iters × batch size = 2500 × 8 = 20000
Epochs = total samples ÷ dataset size = 20000 ÷ 11326 ≈ 1.8
- I've tested the 200th iteration (
+ I've tested the checkpoint from the 200th iteration (about 0.14 epoch by the same arithmetic) and it worked similarly well, but here I'm providing the 2500th-iteration checkpoint.

## Charts

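
For reference, the epoch and token figures above follow from the flags in the training command. Below is a minimal sketch of the arithmetic (plain Python, illustrative only, not part of this commit); it assumes one training example per line of the dataset and takes the dataset size and trained-token count from the README text.

```python
# Minimal sketch of the epoch/token arithmetic above (illustrative, not part of the commit).
# Values come from the mlx_lm.lora command and the README text; one example per line is assumed.
iters = 2500              # --iters
batch_size = 8            # --batch-size
dataset_examples = 11326  # dataset line count (assumed one example per line)
trained_tokens = 1.1e6    # "around 1.1M" reported after training

samples_seen = iters * batch_size                      # 2500 * 8 = 20000
epochs = samples_seen / dataset_examples               # ~1.8
avg_tokens_per_sample = trained_tokens / samples_seen  # ~55, well under --max-seq-length 200

print(f"samples seen:          {samples_seen}")
print(f"epochs:                {epochs:.2f}")
print(f"avg tokens per sample: {avg_tokens_per_sample:.0f}")
```

The one-example-per-line assumption is what equates dataset size with line count; if the data is packed differently, the epoch figure changes proportionally.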