|
--- |
|
license: apache-2.0 |
|
library_name: peft |
|
tags: |
|
- trl |
|
- sft |
|
- generated_from_trainer |
|
datasets: |
|
- generator |
|
base_model: nvidia/OpenMath-Mistral-7B-v0.1-hf |
|
model-index: |
|
- name: OpenMath-Mistral-7B-v0.1-hf-dialogsum-test-flash-attention-2 |
|
results: [] |
|
--- |
|
|
|
|
|
|
# OpenMath-Mistral-7B-v0.1-hf-dialogsum-test-flash-attention-2 |
|
|
|
This model is a fine-tuned version of [nvidia/OpenMath-Mistral-7B-v0.1-hf](https://huggingface.co/nvidia/OpenMath-Mistral-7B-v0.1-hf) on [neil-code/dialogsum-test](https://huggingface.co/datasets/neil-code/dialogsum-test) (reported as the `generator` dataset in the Trainer metadata). The dataset was loaded with:

```python
from datasets import load_dataset

# Load dataset from the hub
huggingface_dataset_name = "neil-code/dialogsum-test"
dataset = load_dataset(huggingface_dataset_name)
```
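The loaded object exposes the `dialogue`/`summary` fields used later in this card; a quick inspection (the `test` split name is an assumption based on the usual dialogsum layout):

```python
# Peek at the structure and one record
print(dataset)
sample = dataset["test"][0]   # split name assumed
print(sample["dialogue"][:200])
print(sample["summary"])
```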
|
|
|
The base model and tokenizer were loaded in 4-bit (QLoRA-style) with FlashAttention-2:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from trl import setup_chat_format

# Hugging Face model id
model_id = "nvidia/OpenMath-Mistral-7B-v0.1-hf"

# BitsAndBytesConfig int-4 config; NF4 4-bit weights cut the 7B model's
# weight memory to roughly 7B * 0.5 bytes ≈ 3.5 GB
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    quantization_config=bnb_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
tokenizer.padding_side = 'right'  # to prevent warnings

# Redefine pad_token and pad_token_id with the out-of-vocabulary unk_token
tokenizer.pad_token = tokenizer.unk_token
tokenizer.pad_token_id = tokenizer.unk_token_id

# Set chat template to OAI ChatML; remove if you start from a fine-tuned model
model, tokenizer = setup_chat_format(model, tokenizer)

# Quick sanity-check generation with the base model
text = "What is the capital of India?"
model_inputs = tokenizer(text, return_tensors="pt").to(model.device)

# top_k=1 makes sampling effectively greedy despite do_sample=True
generated_ids = model.generate(
    **model_inputs,
    temperature=0.1,
    top_k=1,
    top_p=1.0,
    repetition_penalty=1.4,
    min_new_tokens=16,
    max_new_tokens=128,
    do_sample=True,
)
decoded = tokenizer.decode(generated_ids[0])
print(decoded)
```
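Note that `setup_chat_format` installs the ChatML template on the tokenizer, while the sanity check above feeds raw text. A minimal sketch of prompting through the chat template instead (same `model` and `tokenizer` as above):

```python
# Build a ChatML-formatted prompt via the tokenizer's chat template
messages = [{"role": "user", "content": "What is the capital of India?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

model_inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```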
|
|
|
## Model generation after tuning

After fine-tuning, a dialogsum test example was summarized using a small user-defined helper, `gen_after_tunning` (sketched below):

```python
index = 10
dataset = dataset_dialogsum_test  # the dialogsum test split loaded earlier
TUNE_model = model
prompt = dataset[index]['dialogue']
summary = dataset[index]['summary']

formatted_prompt = f"Instruct: Summarize the following conversation.\n{prompt}\nOutput:\n"
# gen_after_tunning is a user-defined generation helper (not shown in this card)
res = gen_after_tunning(TUNE_model, formatted_prompt, 1024)
#print(res[0])
output = res[0].split('Output:\n')[1]

dash_line = '-' * 100
print(dash_line)
print(f'INPUT PROMPT:\n{formatted_prompt}')
print(dash_line)
print(f'BASELINE HUMAN SUMMARY:\n{summary}\n')
print(dash_line)
print(f'MODEL GENERATION - AFTER THE TUNING:\n{output}')
```
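`gen_after_tunning` is not defined anywhere in this card; a minimal sketch of what such a helper might look like, assuming it tokenizes the prompt, generates up to `max_new_tokens`, and returns a list with the decoded string:

```python
def gen_after_tunning(model, prompt, max_new_tokens):
    # Hypothetical reconstruction of the missing helper
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return [tokenizer.decode(output_ids[0], skip_special_tokens=True)]
```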
|
|
|
The run above prints:

```
---------------------------------------------------------------------------------------------------
INPUT PROMPT:
Instruct: Summarize the following conversation.
#Person1#: Could you do me a favor?
#Person2#: Sure. What is it?
#Person1#: Could you run over to the store? We need a few things.
#Person2#: All right. What do you want me to get?
#Person1#: Well, could you pick up some sugar?
#Person2#: Okay. How much?
#Person1#: A small bag. I guess we also need a few oranges.
#Person2#: How many?
#Person1#: Oh, let's see. . . About six.
#Person2#: Anything else?
#Person1#: Yes. We're out of milk.
#Person2#: Okay. How much do you want me to get? A gallon?
#Person1#: No. I think a half gallon will be enough.
#Person2#: Is that all?
#Person1#: I think so. Have you got all that?
#Person2#: Yes. That's small bag of sugar, four oranges, and a half gallon of milk.
#Person1#: Do you have enough money?
#Person2#: I think so.
#Person1#: Thanks very much. I appreciate it.
Output:

---------------------------------------------------------------------------------------------------
BASELINE HUMAN SUMMARY:
#Person1# asks #Person2# to do a favor. #Person2# agrees and helps buy a small bag of sugar, six oranges, and a half-gallon of milk.

---------------------------------------------------------------------------------------------------
MODEL GENERATION - AFTER THE TUNING:
#Person1# asks #Person2# to run over to the store to pick up some sugar, oranges, and milk. #Person2# thinks he has got all that.
```
|
|
|
|
|
## Model description |
|
|
|
This repository holds a PEFT adapter for [nvidia/OpenMath-Mistral-7B-v0.1-hf](https://huggingface.co/nvidia/OpenMath-Mistral-7B-v0.1-hf), trained with TRL's SFT setup for dialogue summarization (see the snippets above).
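A hedged loading sketch; the adapter hub id below is a placeholder, substitute the actual repository id:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "nvidia/OpenMath-Mistral-7B-v0.1-hf"
adapter_id = "<user>/OpenMath-Mistral-7B-v0.1-hf-dialogsum-test-flash-attention-2"  # placeholder

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)
# setup_chat_format modified the tokenizer, so load it from the adapter repo if it was saved there
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
```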
|
|
|
## Intended uses & limitations |
|
|
|
More information needed |
|
|
|
## Training and evaluation data |
|
|
|
The adapter was trained on [neil-code/dialogsum-test](https://huggingface.co/datasets/neil-code/dialogsum-test), a dialogue-summarization dataset; the Trainer metadata reports it as `generator`, which typically indicates the examples were materialized via `Dataset.from_generator` during preprocessing. No separate evaluation results are reported.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
|
- learning_rate: 0.0002 |
|
- train_batch_size: 3 |
|
- eval_batch_size: 8 |
|
- seed: 42 |
|
- gradient_accumulation_steps: 2 |
|
- total_train_batch_size: 6 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: constant |
|
- lr_scheduler_warmup_ratio: 0.03 |
|
- num_epochs: 3 |
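
A sketch of `TrainingArguments` matching the values above; this is a reconstruction, not the author's actual training script, and the SFT wiring (PEFT config, packing, `SFTTrainer`) is not shown in this card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="OpenMath-Mistral-7B-v0.1-hf-dialogsum-test-flash-attention-2",
    learning_rate=2e-4,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # effective train batch size: 3 * 2 = 6
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    num_train_epochs=3,
    seed=42,
    bf16=True,  # assumption, consistent with bnb_4bit_compute_dtype=torch.bfloat16 above
)
```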
|
|
|
### Training results |
|
|
|
|
|
|
|
### Framework versions |
|
|
|
- PEFT 0.9.0 |
|
- Transformers 4.38.2 |
|
- Pytorch 2.1.0+cu121 |
|
- Datasets 2.18.0 |
|
- Tokenizers 0.15.2 |