Add/update the quantized ONNX model files and README.md for Transformers.js v3
## Applied Quantizations
### ✅ Based on `decoder_model.onnx` *with* slimming
↳ ✅ `fp16`: `decoder_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_model_bnb4.onnx` (added)
### ✅ Based on `encoder_model.onnx` *with* slimming
↳ ✅ `int8`: `encoder_model_int8.onnx` (added)
↳ ✅ `uint8`: `encoder_model_uint8.onnx` (added)
↳ ✅ `q4`: `encoder_model_q4.onnx` (added)
↳ ✅ `q4f16`: `encoder_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `encoder_model_bnb4.onnx` (added)
### ✅ Based on `decoder_with_past_model.onnx` *with* slimming
↳ ✅ `fp16`: `decoder_with_past_model_fp16.onnx` (added)
↳ ✅ `int8`: `decoder_with_past_model_int8.onnx` (added)
↳ ✅ `uint8`: `decoder_with_past_model_uint8.onnx` (added)
↳ ✅ `q4`: `decoder_with_past_model_q4.onnx` (added)
↳ ✅ `q4f16`: `decoder_with_past_model_q4f16.onnx` (added)
↳ ✅ `bnb4`: `decoder_with_past_model_bnb4.onnx` (added)
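With Transformers.js v3, one of the quantizations listed above can be selected at load time via the `dtype` option. A minimal sketch; the set of accepted values depends on which quantized files are actually present in this repo's `onnx/` folder:

```js
import { pipeline } from '@huggingface/transformers';

// Load a specific quantized variant of the model (here, 4-bit).
const generator = await pipeline('text2text-generation', 'Xenova/LaMini-Flan-T5-783M', {
  dtype: 'q4', // e.g. 'fp16', 'int8', 'uint8', 'q4', 'q4f16', or 'bnb4'
});
```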
- README.md +18 -0
- onnx/decoder_model_bnb4.onnx +3 -0
- onnx/decoder_model_fp16.onnx +3 -0
- onnx/decoder_model_int8.onnx +3 -0
- onnx/decoder_model_q4.onnx +3 -0
- onnx/decoder_model_q4f16.onnx +3 -0
- onnx/decoder_model_uint8.onnx +3 -0
- onnx/decoder_with_past_model_bnb4.onnx +3 -0
- onnx/decoder_with_past_model_fp16.onnx +3 -0
- onnx/decoder_with_past_model_int8.onnx +3 -0
- onnx/decoder_with_past_model_q4.onnx +3 -0
- onnx/decoder_with_past_model_q4f16.onnx +3 -0
- onnx/decoder_with_past_model_uint8.onnx +3 -0
- onnx/encoder_model_bnb4.onnx +3 -0
- onnx/encoder_model_int8.onnx +3 -0
- onnx/encoder_model_q4.onnx +3 -0
- onnx/encoder_model_q4f16.onnx +3 -0
- onnx/encoder_model_uint8.onnx +3 -0
README.md
@@ -5,4 +5,22 @@ library_name: transformers.js
 
 https://huggingface.co/MBZUAI/LaMini-Flan-T5-783M with ONNX weights to be compatible with Transformers.js.
 
+## Usage (Transformers.js)
+
+If you haven't already, you can install the [Transformers.js](https://huggingface.co/docs/transformers.js) JavaScript library from [NPM](https://www.npmjs.com/package/@huggingface/transformers) using:
+```bash
+npm i @huggingface/transformers
+```
+
+**Example:** Text-to-text generation.
+
+```js
+import { pipeline } from '@huggingface/transformers';
+
+const generator = await pipeline('text2text-generation', 'Xenova/LaMini-Flan-T5-783M');
+const output = await generator('how can I become more healthy?', {
+  max_new_tokens: 100,
+});
+```
+
 Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using [🤗 Optimum](https://huggingface.co/docs/optimum/index) and structuring your repo like this one (with ONNX weights located in a subfolder named `onnx`).
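For reference, the `text2text-generation` pipeline in Transformers.js resolves to an array with one object per generated sequence; a minimal sketch of consuming the `output` from the example above (the actual text will vary with the model and generation settings):

```js
// Each entry of `output` holds one generated sequence.
for (const { generated_text } of output) {
  console.log(generated_text);
}
```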
onnx/decoder_model_bnb4.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b99168f6cfb8bb904003d6e1c6c9b65b748fb19699c5c4e5b355c3b0bc5bb681
+size 381134914
onnx/decoder_model_fp16.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:25f33f4ff6a63a6720beaffd3f910963fc5696e16c0f21fe81ebddc89f9e3d54
+size 950300404
onnx/decoder_model_int8.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:725681141297b5a811da0b758ec213e735c4c814776467b260215491ead83f49
+size 476021984
onnx/decoder_model_q4.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:df1e0bbc490e06d26137ed343b073880f1c75e591e723f3437bfa8cd0ef7d15f
+size 408748028
onnx/decoder_model_q4f16.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8f1b11a0f4803c1ae5062ad25a2e02bd00e890bcbd8fb5d658f40830e708a395
+size 315189311
onnx/decoder_model_uint8.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fe634136f59df125e6664fcb59a921f5fbf55f5599357c21458ddb4de19063fd
+size 476022112
onnx/decoder_with_past_model_bnb4.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c28bd2e2ed3f64dcbe97986f11e7130060710d49f9a19fdd8d579acd435b7cb1
+size 352683280
onnx/decoder_with_past_model_fp16.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0a9390440e31ae4b03bae1dc2e6771951a258f8ae3189d68130e48560e4162d0
+size 849524517
onnx/decoder_with_past_model_int8.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6af576058fbc4cc7384e7de9ac8b57629a086b4b148bc4f3dcf5b135fc2353d0
+size 425511083
onnx/decoder_with_past_model_q4.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bc0d37c520914a6cf267807672da6bcb6467528e3bd6354d48316c998e8b08fa
+size 377151050
onnx/decoder_with_past_model_q4f16.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:96b300a7460db087c86c6df1b49d087de46be5f5b80454b7843e42c8c52eb918
+size 286758064
onnx/decoder_with_past_model_uint8.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d816734411bbc01a093d816718cf791daca19c0798ef11eb0e36cf492d8dc7f7
+size 425511184
onnx/encoder_model_bnb4.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bc9e221f49a7e86a913dbc746ceda4b1dfaeed32713c333fe4e1973ede2cbf8f
+size 305592840
onnx/encoder_model_int8.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:17be3f0967cb26e3c1673c6ce3ddd6fe9f2faed7a932f44116d872d17d0e08bd
+size 341928527
onnx/encoder_model_q4.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3cb8c0cfabdf8addf0d933f2e95c2952f02620fbe71f9bc874acce84a061526a
+size 324859080
onnx/encoder_model_q4f16.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7baef9b8d81bc5338e49b7ae5df112949e842c9644993ad889bda02851b0cf1c
+size 239692728
onnx/encoder_model_uint8.onnx
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b51ff5504947869d1cf00bc8860efa2d5088409dc629cae7660f3c383fbd6e5a
+size 341928611