inferencerlabs committed on
Commit 55a922a · verified · 1 Parent(s): f566bd4

Upload complete model

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -7,7 +7,7 @@ tags:
  - mlx
  base_model: openai/gpt-oss-20b
  ---
- **See gpt-oss-20b 6.5bit MLX in action - [demonstration video](https://youtube.com/xcreate)**
+ **See gpt-oss-20b 6.5bit MLX in action - [demonstration video](https://youtu.be/mlpFG8e_fLw)**
 
  *q6.5bit quant typically achieves 1.128 perplexity in our testing which is equivalent to q8.*
  | Quantization | Perplexity |
@@ -21,6 +21,6 @@ base_model: openai/gpt-oss-20b
  ## Usage Notes
 
  * Built with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
- * Peak memory usage: ~17 GB
+ * Memory usage: ~17 GB
  * Expect ~100 tokens/s
- * For more details see [demonstration video](https://youtube.com/xcreate) or visit [OpenAI gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b).
+ * For more details see [demonstration video](https://youtu.be/mlpFG8e_fLw) or visit [OpenAI gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b).