|
--- |
|
license: other |
|
language: |
|
- en |
|
tags: |
|
- causal-lm |
|
- code |
|
metrics: |
|
- code_eval |
|
library_name: transformers |
|
model-index: |
|
- name: stabilityai/stable-code-instruct-3b |
|
results: |
|
- task: |
|
type: text-generation |
|
dataset: |
|
type: nuprl/MultiPL-E |
|
name: MultiPL-HumanEval (Python) |
|
metrics: |
|
- name: pass@1 |
|
type: pass@1 |
|
value: 32.4 |
|
verified: false |
|
- task: |
|
type: text-generation |
|
dataset: |
|
type: nuprl/MultiPL-E |
|
name: MultiPL-HumanEval (C++) |
|
metrics: |
|
- name: pass@1 |
|
type: pass@1 |
|
value: 30.9 |
|
verified: false |
|
- task: |
|
type: text-generation |
|
dataset: |
|
type: nuprl/MultiPL-E |
|
name: MultiPL-HumanEval (Java) |
|
metrics: |
|
- name: pass@1 |
|
type: pass@1 |
|
value: 32.1 |
|
verified: false |
|
- task: |
|
type: text-generation |
|
dataset: |
|
type: nuprl/MultiPL-E |
|
name: MultiPL-HumanEval (JavaScript) |
|
metrics: |
|
- name: pass@1 |
|
type: pass@1 |
|
value: 32.1 |
|
verified: false |
|
- task: |
|
type: text-generation |
|
dataset: |
|
type: nuprl/MultiPL-E |
|
name: MultiPL-HumanEval (PHP) |
|
metrics: |
|
- name: pass@1 |
|
type: pass@1 |
|
value: 24.2 |
|
verified: false |
|
- task: |
|
type: text-generation |
|
dataset: |
|
type: nuprl/MultiPL-E |
|
name: MultiPL-HumanEval (Rust) |
|
metrics: |
|
- name: pass@1 |
|
type: pass@1 |
|
value: 23.0 |
|
verified: false |
|
--- |
|
# `stable-code-instruct-3b` |
|
|
|
## Model Description |
|
|
|
`stable-code-instruct-3b` is a 2.7 billion parameter decoder-only language model tuned from [`stable-code-3b`](https://huggingface.co/stabilityai/stable-code-3b/). The model was trained on a mix of publicly available and synthetic datasets using [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
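
For reference on the cited method, the DPO objective fine-tunes the policy so that preferred completions score higher than rejected ones relative to a frozen reference model. The sketch below follows the formula from the DPO paper and is purely illustrative (function names, the `beta` value, and the dummy log-probabilities are not taken from this model's training recipe):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss (Rafailov et al., 2023) on per-example summed log-probabilities.

    beta (illustrative value) controls how strongly the policy is kept
    close to the reference model.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Maximize the margin between preferred and rejected completions.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Dummy log-probabilities for a batch of two preference pairs.
loss = dpo_loss(torch.tensor([-10.0, -8.0]), torch.tensor([-12.0, -9.5]),
                torch.tensor([-10.5, -8.2]), torch.tensor([-11.0, -9.0]))
```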
|
|
|
This instruct tune demonstrates state-of-the-art performance (compared to models of similar size) on the MultiPL-E metrics across multiple programming languages, evaluated with [BigCode's Evaluation Harness](https://github.com/bigcode-project/bigcode-evaluation-harness/tree/main), and on the code portions of [MT Bench](https://klu.ai/glossary/mt-bench-eval).
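
For context, the pass@1 numbers in the metadata above are HumanEval-style pass@k estimates (Chen et al., 2021): each problem's score is the fraction of generated samples that pass its unit tests, averaged over the benchmark. A minimal sketch of that estimator (the function and example values below are illustrative, not taken from this evaluation):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of samples generated, c: samples that passed the unit tests.
    """
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# pass@1 reduces to c / n per problem; the reported score is the mean
# over all problems in the benchmark.
print(pass_at_k(n=20, c=7, k=1))  # 0.35
```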
|
|
|
|
|
## Usage |
|
Here's how you can use the model:
|
|
|
```python |
|
# pip install -U transformers |
|
# pip install accelerate |
|
|
|
import torch |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
# Load the tokenizer and model (bfloat16 keeps GPU memory use modest)
tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-instruct-3b", trust_remote_code=True)
|
model = AutoModelForCausalLM.from_pretrained("stabilityai/stable-code-instruct-3b", torch_dtype=torch.bfloat16, trust_remote_code=True) |
|
model.eval() |
|
model = model.cuda() |
|
|
|
messages = [ |
|
{ |
|
"role": "system", |
|
"content": "You are a helpful and polite assistant", |
|
}, |
|
{ |
|
"role": "user", |
|
"content": "Write a simple website in HTML. When a user clicks the button, it shows a random joke from a list of 4 jokes." |
|
}, |
|
] |
|
|
|
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False) |
|
|
|
inputs = tokenizer([prompt], return_tensors="pt").to(model.device) |
|
|
|
# Sample up to 1024 new tokens with temperature / top-p / top-k sampling
tokens = model.generate(
|
**inputs, |
|
max_new_tokens=1024, |
|
temperature=0.5, |
|
top_p=0.95, |
|
top_k=100, |
|
do_sample=True, |
|
use_cache=True |
|
) |
|
|
|
# Decode only the newly generated tokens (everything after the prompt)
output = tokenizer.batch_decode(tokens[:, inputs.input_ids.shape[-1]:], skip_special_tokens=False)[0]
|
``` |
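
Because the decode above keeps special tokens (`skip_special_tokens=False`), the string may still contain the chat template's end-of-turn markers. To print just the reply text, you can decode with `skip_special_tokens=True` instead (variable name is illustrative):

```python
# Decode the generated tokens again, this time dropping special tokens
# such as end-of-turn markers added by the chat template.
clean_output = tokenizer.batch_decode(
    tokens[:, inputs.input_ids.shape[-1]:],
    skip_special_tokens=True,
)[0]

print(clean_output)
```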
|
|
|
|
|
## How to Cite |
|
|
|
```bibtex |
|
@misc{stable-code-instruct-3b, |
|
url={https://huggingface.co/stabilityai/stable-code-instruct-3b},
|
title={Stable Code Instruct 3B},
|
author={Phung, Duy and Pinnaparaju, Nikhil and Adithyan, Reshinth and Tow, Jonathan and Cooper, Nathan}
|
} |
|
``` |