Added link to optimum neuron
README.md CHANGED

```diff
@@ -32,9 +32,9 @@ prompt_template: '<|system|>
 ---
 # Neuronx model for Zephyr-7b-beta
 
-This repository contains [AWS Inferentia2](https://aws.amazon.com/ec2/instance-types/inf2/) and [
+This repository contains [AWS Inferentia2](https://aws.amazon.com/ec2/instance-types/inf2/) and [neuronx](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/) compatible checkpoints for [HuggingFaceH4/zephyr-7b-beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta).
 
-However, this file includes an example of how to compile various versions of Zephyr. Support isn’t available yet (as of 1/9/2024) in the optimum-neuron framework, so we use the base transformers library.
+However, this file includes an example of how to compile various versions of Zephyr. Support isn’t available yet (as of 1/9/2024) in the [optimum neuron](https://huggingface.co/docs/optimum-neuron/index) framework, so we use the base transformers library.
 
 These instructions closely follow the [Developer Guide](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/transformers-neuronx/transformers-neuronx-developer-guide.html#grouped-query-attention-gqa-support-beta). Look there for more detailed explanations, especially for the GQA settings.
 
```