mlx-community/mistral-7B-v0.1

Create README.md (reach-vb, HF Staff; commit 226afd1, almost 2 years ago)


# Mistral-7B-v0.1

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.
Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).

## Model Architecture

Mistral-7B-v0.1 is a transformer model, with the following architecture choices:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
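
The three choices listed above can be illustrated with a minimal pure-Python sketch. It is not the model's actual implementation: the function names are hypothetical, and the sizes used are toy values chosen for readability (the paper linked above gives the real head counts and window size). The `<0xNN>` token spelling follows the SentencePiece byte-fallback convention and is assumed here to apply to this tokenizer.

```python
def sliding_window_causal_mask(seq_len, window):
    """Return mask[i][j] == True when query position i may attend to key j.

    Causal: j <= i. Sliding window: only the most recent `window` positions
    (including i itself) are visible, i.e. i - j < window.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Grouped-query attention: map a query head to the KV head it shares.

    Consecutive query heads are grouped; each group reads one KV head.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def byte_fallback_tokens(text):
    """Spell out text as byte tokens, as a byte-fallback tokenizer would do
    for a character missing from its vocabulary (assumed <0xNN> format)."""
    return [f"<0x{b:02X}>" for b in text.encode("utf-8")]


if __name__ == "__main__":
    # Toy sizes for illustration only.
    for row in sliding_window_causal_mask(seq_len=5, window=3):
        print("".join("x" if visible else "." for visible in row))
    print([kv_head_for_query_head(h, n_q_heads=8, n_kv_heads=2) for h in range(8)])
    print(byte_fallback_tokens("€"))
```

With a window of 3, the last row of the mask sees only positions 2 to 4, so per-token attention cost is bounded by the window rather than the full sequence length. Grouping query heads over fewer KV heads shrinks the key/value cache by the same factor as the group size.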