Commit 658ebc9 (verified) · committed by whitphx (HF Staff) · Parent: 51c6d02

Add/update the quantized ONNX model files and README.md for Transformers.js v3


## Applied Quantizations

### ✅ Based on `decoder_model_merged.onnx` *with* slimming

**The base model `decoder_model_merged.onnx` has been renamed to `model.onnx`.**

↳ ✅ `fp16`: `model_fp16.onnx` (added)
↳ ✅ `int8`: `model_int8.onnx` (added)
↳ ✅ `uint8`: `model_uint8.onnx` (added)
↳ ✅ `q4`: `model_q4.onnx` (added)
↳ ✅ `q4f16`: `model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `model_bnb4.onnx` (added)
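
These suffixes correspond to the `dtype` values that Transformers.js v3 accepts when loading a model. As a minimal sketch (assuming the `Xenova/tiny-random-StableLmForCausalLM` repo id used in the README below), one of the variants above can be selected explicitly like this:

```js
import { pipeline } from '@huggingface/transformers';

// dtype selects the quantized file by suffix, e.g. 'q4' → onnx/model_q4.onnx.
const generator = await pipeline(
  'text-generation',
  'Xenova/tiny-random-StableLmForCausalLM',
  { dtype: 'q4' },
);
```

Omitting `dtype` leaves the choice to the library's defaults.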

README.md CHANGED
@@ -5,4 +5,20 @@ library_name: transformers.js
 
 https://huggingface.co/hf-internal-testing/tiny-random-StableLmForCausalLM with ONNX weights to be compatible with Transformers.js.
 
+ ## Usage (Transformers.js)
+
+ If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+ ```bash
+ npm i @huggingface/transformers
+ ```
+
+ **Example:** Text generation.
+
+ ```js
+ import { pipeline } from '@huggingface/transformers';
+
+ const generator = await pipeline('text-generation', 'Xenova/tiny-random-StableLmForCausalLM');
+ const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
+ ```
+
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
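
As a brief follow-up to the README example added above: the `text-generation` pipeline resolves to an array of objects carrying a `generated_text` field, so the result can be read like this (a minimal sketch continuing that snippet):

```js
// `output` from the README snippet is an array of generation results.
console.log(output[0].generated_text);
// e.g. "Once upon a time, there was ..." plus up to 10 new tokens
```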
onnx/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:76ef22e47352a5d71bec6639483af2b39345e80e4f8d89387985f717a4b7705b
+ size 800113
onnx/model_bnb4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:671b22683e17ac373591cb69497d615a0586a20d07c0c3f67dea06c79684489d
+ size 402462
onnx/model_fp16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ee31450e777d1a233a7fa96fce02cb22b99e6baf5751bd3e233ca60ef9df3c19
+ size 424124
onnx/model_int8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:78eeacc68e7bc4794f263504fd544fceaa40df1bfaf5626848902694a12e561c
+ size 269064
onnx/model_q4.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dea4d3105474e9545c16fb2bbca5ad8638a5b9c10345487350a33d2298852561
+ size 411544
onnx/model_q4f16.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b03c764f4e57be8cc7d7f9cb2e836e61e13bde25dcacf613bca40fd54d3e2809
+ size 266509
onnx/model_uint8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2a03de3ad53431bebe01ac02fee2e0929609bf07e95c8ce5b126d8c3e675ea6a
+ size 269072
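
Each `onnx/*.onnx` entry above is a Git LFS pointer file rather than the binary weights themselves: three `key value` lines giving the spec version, the SHA-256 object id, and the byte size of the stored file. A minimal sketch of reading that format, using a hypothetical `parseLfsPointer` helper and the `model_uint8.onnx` pointer above:

```js
// Hypothetical helper: split a Git LFS pointer into its `key value` fields.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.trim().split('\n')) {
    const space = line.indexOf(' ');
    fields[line.slice(0, space)] = line.slice(space + 1);
  }
  return fields;
}

const pointer = parseLfsPointer(
  'version https://git-lfs.github.com/spec/v1\n' +
  'oid sha256:2a03de3ad53431bebe01ac02fee2e0929609bf07e95c8ce5b126d8c3e675ea6a\n' +
  'size 269072',
);
console.log(pointer.oid);          // the content hash Git LFS resolves to
console.log(Number(pointer.size)); // 269072 bytes
```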