---
base_model: mlx-community/Qwen2.5-1.5B-Instruct-8bit
language:
- en
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- chat
- mlx
- brainrot
library_name: mlx
datasets:
- shvn22k/brainrot-dataset
---

# DuckyBlender/brainrot-translator-mlx-8bit

Given the correct system prompt, this model can effectively translate Brainrot to English and English to Brainrot. It was trained in exactly one hour on a 16 GB M1 Pro MacBook (8-core) using MLX. The model actually works, and is probably my first genuinely useful fine-tune.

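You can try both directions straight from the command line with the `mlx_lm.generate` CLI (recent mlx-lm versions accept a `--system-prompt` flag; the input strings below are made up, but the two system prompts are exactly the ones used in the example script further down):

```bash
# English → Brainrot
mlx_lm.generate --model DuckyBlender/brainrot-translator-mlx-8bit \
  --system-prompt "Translate from English to Brainrot" \
  --prompt "I did really well on my math exam today."

# Brainrot → English
mlx_lm.generate --model DuckyBlender/brainrot-translator-mlx-8bit \
  --system-prompt "Translate from Brainrot to English" \
  --prompt "no cap bro lowkey cooked on that exam fr"
```
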
I fine-tuned Qwen2.5 instead of Qwen3 for simplicity: I had issues with Qwen3, probably because of its thinking modes.

## Information

This model [DuckyBlender/brainrot-translator-mlx-8bit](https://huggingface.co/DuckyBlender/brainrot-translator-mlx-8bit) was converted to MLX format from [mlx-community/Qwen2.5-1.5B-Instruct-8bit](https://huggingface.co/mlx-community/Qwen2.5-1.5B-Instruct-8bit) using mlx-lm version **0.25.2**.

## Training

```bash
mlx_lm.lora --model "mlx-community/Qwen2.5-1.5B-Instruct-8bit" --data data --iters 2500 --max-seq-length 200 --num-layers 16 --batch-size 8 --save-every 25 --wandb diegogpt --train
```
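
`--data data` points at a directory containing `train.jsonl` (and optionally `valid.jsonl`). I haven't reproduced the exact dataset layout here, but in mlx-lm's chat data format each line is a record like the one below (the translation pair is made up for illustration; the system prompt matches the ones the model responds to):

```python
import json

# A hypothetical training example in the chat format mlx_lm.lora accepts
# in data/train.jsonl (one JSON object per line).
example = {
    "messages": [
        {"role": "system", "content": "Translate from English to Brainrot"},
        {"role": "user", "content": "That meal was incredible."},
        {"role": "assistant", "content": "that meal was bussin fr no cap"},
    ]
}
print(json.dumps(example, ensure_ascii=False))
```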

- Peak memory: 6.2 GB
- Trained tokens: around 1.1M
- Dataset: 11,326 lines, so the model saw roughly 1.8 epochs.

Calculation:

Total samples seen = iters × batch size = 2500 × 8 = 20,000

Epochs = total samples ÷ dataset size = 20,000 ÷ 11,326 ≈ 1.8

(This also lines up with the ~1.1M trained tokens: 20,000 samples at roughly 55 tokens each.)

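The same arithmetic as a quick sanity check in Python:

```python
iters = 2500           # --iters from the training command
batch_size = 8         # --batch-size
dataset_lines = 11326  # dataset size

samples_seen = iters * batch_size      # 20,000 samples
epochs = samples_seen / dataset_lines  # ≈ 1.77 epochs
print(f"{samples_seen} samples ≈ {epochs:.2f} epochs")
```
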
I also tested the checkpoint from the 200th iteration (roughly 0.14 epochs by the same math) and it worked similarly well, but the weights published here are from the 2500th iteration.
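
The published repo is standalone weights (the example script below loads it without an adapter path). If you reproduce the run, the LoRA adapters can be merged into standalone weights with `mlx_lm.fuse`; the paths below are assumptions:

```bash
# Fuse the LoRA adapters from the run above into a standalone model
mlx_lm.fuse --model "mlx-community/Qwen2.5-1.5B-Instruct-8bit" \
  --adapter-path adapters \
  --save-path brainrot-translator-mlx-8bit
```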

## Charts

![](charts.png)

## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler


def main():
    print("🧠 Brainrot Translator")
    print("=" * 30)

    # Ask the user for the translation direction
    print("Choose translation direction:")
    print("1. English to Brainrot")
    print("2. Brainrot to English")

    while True:
        choice = input("Enter choice (1 or 2): ").strip()
        if choice in ["1", "2"]:
            break
        print("Invalid choice. Please enter 1 or 2.")

    # Set the system prompt based on the chosen direction
    if choice == "1":
        system_prompt = "Translate from English to Brainrot"
        direction = "English → Brainrot"
    else:
        system_prompt = "Translate from Brainrot to English"
        direction = "Brainrot → English"

    print(f"\nMode: {direction}")
    print("Loading model...")

    # Load the fused model (or load the base model with the LoRA adapters instead)
    model, tokenizer = load("DuckyBlender/brainrot-translator-mlx-8bit")
    # model, tokenizer = load("mlx-community/Qwen2.5-1.5B-Instruct-8bit", adapter_path="adapters")

    print("Model loaded! Enter text to translate (Ctrl+C to exit)")
    print("-" * 50)

    try:
        while True:
            # Get user input
            user_input = input(f"\n[{direction.split(' → ')[0]}]: ").strip()
            if not user_input:
                continue

            # Format the prompt as a chat conversation
            messages = [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_input},
            ]

            # Apply the chat template
            prompt = tokenizer.apply_chat_template(
                messages,
                tokenize=False,
                add_generation_prompt=True,
            )

            # Light sampling keeps translations varied but on-topic
            sampler = make_sampler(temp=0.5, top_p=0.95)

            print(f"[{direction.split(' → ')[1]}]: ", end="", flush=True)

            # Generate the translation
            response = generate(
                model,
                tokenizer,
                prompt=prompt,
                sampler=sampler,
                max_tokens=128,
                verbose=False,
            )
            print(response)

    except KeyboardInterrupt:
        print("\n\nGoodbye! 👋")


if __name__ == "__main__":
    main()
```