---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- mistral
- mixtral
- moe
model_name: Mixtral 8X7B - bnb 4-bit
inference: false
model_type: mixtral
pipeline_tag: text-generation
quantized_by: ybelkada
---
# Mixtral 8x7B Instruct-v0.1 - `bitsandbytes` 4-bit
This repository contains the bitsandbytes 4-bit quantized version of [`mistralai/Mixtral-8x7B-Instruct-v0.1`](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1). To use it, make sure you have the latest version of `bitsandbytes` installed and `transformers` installed from source:
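One way to do this is with the commands below; the upstream card only asks for the latest `bitsandbytes` and a source install of `transformers`, so pin versions differently if your setup requires it:

```bash
# Upgrade bitsandbytes to the latest release
pip install -U bitsandbytes
# Install transformers from source
pip install git+https://github.com/huggingface/transformers.git
```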
Loading the model as shown below will directly load the quantized weights in 4-bit precision:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ybelkada/Mixtral-8x7B-Instruct-v0.1-bnb-4bit"

# The quantization config is stored with the checkpoint, so the model
# loads directly in 4-bit precision without extra arguments.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```
Note that you need a CUDA-compatible GPU to run low-bit precision models with `bitsandbytes`.
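As a quick end-to-end check, here is a minimal generation sketch. It continues from the snippet above (reusing `model` and `tokenizer`); the prompt and `max_new_tokens` value are illustrative and not part of the original card.

```python
# Minimal generation example, continuing from the loading snippet above.
messages = [{"role": "user", "content": "Explain quantization in one sentence."}]

# Mixtral Instruct ships with a chat template; apply it to build the prompt.
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```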