---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- mistral
- mixtral
- moe
model_name: Mixtral 8X7B - bnb 4-bit
inference: false
model_type: mixtral
pipeline_tag: text-generation
quantized_by: ybelkada
---
# Mixtral 8x7B Instruct-v0.1 - `bitsandbytes` 4-bit
This repository contains the bitsandbytes 4-bit quantized version of [`mistralai/Mixtral-8x7B-Instruct-v0.1`](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1). To use it, make sure you have the latest version of `bitsandbytes` installed and `transformers` installed from source:
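One way to do this is with the commands below; the upstream card only asks for the latest `bitsandbytes` and a source install of `transformers`, so pin versions differently if your setup requires it:

```bash
# Upgrade bitsandbytes to the latest release
pip install -U bitsandbytes
# Install transformers from source
pip install git+https://github.com/huggingface/transformers.git
```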
Loading the model as shown below will directly load the quantized weights in 4-bit precision:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ybelkada/Mixtral-8x7B-Instruct-v0.1-bnb-4bit"

# The quantization config is stored with the checkpoint, so the model
# loads directly in 4-bit precision without extra arguments.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```
Note that you need a CUDA-compatible GPU to run low-bit precision models with `bitsandbytes`.
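As a quick end-to-end check, here is a minimal generation sketch. It continues from the snippet above (reusing `model` and `tokenizer`); the prompt and `max_new_tokens` value are illustrative and not part of the original card.

```python
# Minimal generation example, continuing from the loading snippet above.
messages = [{"role": "user", "content": "Explain quantization in one sentence."}]

# Mixtral Instruct ships with a chat template; apply it to build the prompt.
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```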