v0.32.0
See https://github.com/quic/ai-hub-models/releases/v0.32.0 for the changelog.
LICENSE
ADDED
@@ -0,0 +1,2 @@
+The license of the original trained model can be found at https://github.com/facebookresearch/llama/blob/main/LICENSE.
+The license for the deployable model files (.tflite, .onnx, .dlc, .bin, etc.) can be found at https://github.com/facebookresearch/llama/blob/main/LICENSE.
README.md
CHANGED
@@ -28,14 +28,11 @@ This model is an implementation of Llama-v2-7B-Chat found [here](https://hugging
 - **Model Stats:**
 - Input sequence length for Prompt Processor: 1024
 - Context length: 1024
-- Number of parameters: 7B
 - Precision: w4a16 + w8a16 (few layers)
 - Model-1 (Prompt Processor): Llama-PromptProcessor-Quantized
-- Prompt processor model size: 3.6 GB
 - Prompt processor input: 1024 tokens
 - Prompt processor output: 1024 output tokens + KVCache for token generator
 - Model-2 (Token Generator): Llama-TokenGenerator-KVCache-Quantized
-- Token generator model size: 3.6 GB
 - Token generator input: 1 input token + past KVCache
 - Token generator output: 1 output token + KVCache for next iteration
 - Use: Initiate conversation with prompt-processor and then token generator for subsequent iterations.
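The Model Stats above describe a two-stage flow: the prompt processor consumes the full 1024-token prompt once and emits a KVCache, then the token generator is called repeatedly with one token plus the past KVCache. A minimal sketch of that control flow is below; the model functions here are hypothetical stubs standing in for the real quantized models, not the AI Hub runtime API.

```python
# Sketch of the prompt-processor / token-generator loop from the Model Stats.
# prompt_processor and token_generator are hypothetical stand-ins: the real
# Llama-PromptProcessor-Quantized and Llama-TokenGenerator-KVCache-Quantized
# models return logits and KV tensors, not the dummy values used here.

PROMPT_LEN = 1024  # input sequence length / context length from the README


def prompt_processor(prompt_tokens):
    """Stub for Model-1: 1024 tokens in -> last token + KVCache out."""
    assert len(prompt_tokens) == PROMPT_LEN
    kv_cache = {"positions": PROMPT_LEN}   # placeholder for the real KV tensors
    return prompt_tokens[-1], kv_cache


def token_generator(token, kv_cache):
    """Stub for Model-2: 1 input token + past KVCache -> 1 token + new KVCache."""
    kv_cache = {"positions": kv_cache["positions"] + 1}  # cache grows one step
    return token + 1, kv_cache             # dummy "next token" for illustration


def generate(prompt_tokens, max_new_tokens):
    # Initiate the conversation with the prompt processor (run once)...
    token, kv_cache = prompt_processor(prompt_tokens)
    generated = []
    # ...then use the token generator for subsequent iterations, one token each.
    for _ in range(max_new_tokens):
        token, kv_cache = token_generator(token, kv_cache)
        generated.append(token)
    return generated


out = generate(list(range(PROMPT_LEN)), max_new_tokens=4)
```

The key design point the README encodes: the expensive full-sequence pass happens exactly once, and every later step is a cheap single-token call that reuses the cached keys and values.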