Commit 1327e07 by Youssofal · verified · 1 Parent(s): aed68a8

Tighten MTPLX section, drop self-link, evergreen heading

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -18,9 +18,9 @@ pipeline_tag: text-generation
 
 # Qwen3.6-27B MTPLX Optimized
 
-## MTPLX is released
+## Run this with MTPLX
 
-This checkpoint runs on **MTPLX** an MLX-native runtime for native Multi-Token-Prediction speculative decoding on Apple Silicon. Up to **2.24× faster decode** at real coding temperatures (`temp=0.6 / top_p=0.95 / top_k=20`), using the model's own built-in MTP heads. No external drafter, no greedy hack, no distribution drift.
+**MTPLX** is an MLX-native runtime for native Multi-Token-Prediction speculative decoding on Apple Silicon. Up to **2.24× faster decode** at real coding temperatures (`temp=0.6 / top_p=0.95 / top_k=20`) using the model's own built-in MTP heads no external drafter, no greedy hack.
 
 ```bash
 pip install mtplx
@@ -29,10 +29,9 @@ mtplx start
 
 **Project:** [github.com/youssofal/MTPLX](https://github.com/youssofal/MTPLX)
 
-**MTPLX model fleet on Hugging Face:**
+**Other MTPLX checkpoints:**
 
 - [Qwen3.6-27B-MTPLX-Optimized-Speed](https://huggingface.co/Youssofal/Qwen3.6-27B-MTPLX-Optimized-Speed) — 4-bit flagship speed (63 TPS on M5 Max)
-- [Qwen3.6-27B-MTPLX-Optimized](https://huggingface.co/Youssofal/Qwen3.6-27B-MTPLX-Optimized) — verified default (GDN8-Speed4 trunk + CyanKiwi INT4 MTP)
 - [Qwen3.5-4B-MTPLX-Optimized-Speed](https://huggingface.co/Youssofal/Qwen3.5-4B-MTPLX-Optimized-Speed) — small 4-bit speed-test
 - [Qwen3.5-4B-Optimized-MTPLX](https://huggingface.co/Youssofal/Qwen3.5-4B-Optimized-MTPLX) — small 8-bit
 
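The README text in this diff describes speculative decoding at a non-zero sampling temperature rather than greedy decoding. As a rough illustration of how sampled speculative decoding can keep the target distribution, here is a minimal sketch of the textbook accept/reject rule for a single drafted token. It is not taken from MTPLX: the function name `speculative_accept` and its parameters are hypothetical, and how MTPLX actually verifies tokens drafted by its MTP heads is not shown in this commit.

```python
import random


def speculative_accept(p_target, q_draft, drafted_token, rng):
    """Textbook speculative-sampling check for one drafted token.

    p_target and q_draft are probability lists over the same vocabulary
    (target model and draft head, after temperature/top-p/top-k filtering).
    Returns (token, accepted). Hypothetical helper, not an MTPLX API.
    """
    p = p_target[drafted_token]
    q = q_draft[drafted_token]
    if q <= 0.0:
        raise ValueError("drafted token must have non-zero draft probability")

    # Accept the drafted token with probability min(1, p / q).
    if rng.random() < min(1.0, p / q):
        return drafted_token, True

    # On rejection, resample from the normalized residual max(p - q, 0).
    residual = [max(pt - qd, 0.0) for pt, qd in zip(p_target, q_draft)]
    if sum(residual) == 0.0:
        # Only possible when both distributions are identical; keep the draft.
        return drafted_token, True
    token = rng.choices(range(len(residual)), weights=residual, k=1)[0]
    return token, False
```

Under this rule the emitted token follows `p_target` regardless of how good the draft is; a better draft only raises the acceptance rate, which is where the speedup would come from.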