license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct

Llama3-KALE-LM-Chem 8B

Introduction

We are thrilled to present Llama3-KALE-LM-Chem 8B, the newest version of our Llama3-KALE-LM-Chem model, which embodies nearly half a year of innovation.

Training Details

We continued pre-training the model on a large amount of data, then post-trained it with supervised fine-tuning.

Benchmarks

Open Benchmarks

Models	ChemBench	MMLU	MMLU-Chem	SciQ	IE(Acc)	IE(LS)
GPT-3.5	47.15	69.75	53.32	89.60	52.98	68.28
GPT-4	53.72	78.67	63.70	94.10	54.20	69.74
Llama3-8B-Instruct	46.02	68.30	51.10	93.30	45.83	61.22
LlaSMol	28.47	54.47	33.24	72.30	2.16	3.23
ChemDFM	44.44	58.11	45.60	86.70	7.61	11.49
ChemLLM-7B-Chat	34.16	61.79	48.39	94.00	29.66	39.17
ChemLLM-7B-Chat-1.5-SFT	42.75	63.56	49.63	95.10	14.96	19.61
KALE-LM	52.40	68.74	53.83	91.50	67.50	78.37
KALE-LM-INSTRUCT	57.01	68.09	54.83	91.60	57.53	64.16

In-House Benchmarks

Models	NC	PP	M2C	C2M	PP	Retro	YP	TP	SP	Average
GPT-3.5	46.93	56.98	85.28	38.25	43.67	42.33	30.33	42.57	38.00	47.15
GPT-4	54.82	65.02	92.64	52.88	62.67	52.67	42.33	24.75	35.67	53.72
Llama3-8B-Instruct	51.31	27.79	90.30	40.88	34.00	30.00	45.33	60.89	33.67	46.02
LlaSMol	27.78	29.34	31.44	23.38	25.67	24.00	37.33	34.65	22.67	28.47
ChemDFM	36.92	55.57	83.95	42.00	40.00	37.33	39.00	33.17	32.00	44.44
ChemLLM-7B-Chat	41.05	29.76	85.28	26.12	26.00	24.00	20.00	24.26	31.00	34.16
ChemLLM-7B-Chat-1.5-SFT	50.06	49.51	85.28	38.75	38.00	26.67	28.33	31.68	33.67	42.44
KALE-LM	63.58	58.39	92.98	44.50	48.67	38.33	46.33	44.55	34.33	52.41
KALE-LM-INSTRUCT	61.33	43.44	90.30	53.62	72.67	53.67	46.00	47.03	45.00	57.01
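
The Average column appears to be the unweighted mean of the nine task columns, rounded to two decimals. The sketch below checks that assumption against three rows copied from the table above; `mean_score` is a helper introduced here for illustration only, and the check says nothing about how the scores themselves were obtained.

```python
# Task scores copied from the table above (nine task columns per row).
rows = {
    "GPT-3.5": [46.93, 56.98, 85.28, 38.25, 43.67, 42.33, 30.33, 42.57, 38.00],
    "GPT-4": [54.82, 65.02, 92.64, 52.88, 62.67, 52.67, 42.33, 24.75, 35.67],
    "Llama3-8B-Instruct": [51.31, 27.79, 90.30, 40.88, 34.00, 30.00, 45.33, 60.89, 33.67],
}
# Values from the Average column of the same rows.
reported = {"GPT-3.5": 47.15, "GPT-4": 53.72, "Llama3-8B-Instruct": 46.02}


def mean_score(scores):
    """Unweighted mean of the task scores, rounded to two decimals (assumed rule)."""
    return round(sum(scores) / len(scores), 2)


for name, scores in rows.items():
    print(f"{name}: computed {mean_score(scores):.2f}, reported {reported[name]:.2f}")
```

Under this reading, the Average column summarizes the nine tasks with equal weight, so a large gain on a single task moves the average by only one ninth of that gain.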

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-8B"

# device_map="auto" places the weights on the available device(s);
# torch_dtype="auto" uses the dtype stored in the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template and
# append the header that starts the assistant's turn.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Move the inputs to the device the model was loaded onto.
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Pass input_ids and attention_mask together.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048
)
# generate() returns the prompt followed by the completion;
# keep only the newly generated tokens.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
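
For decoder-only models, `generate` returns each prompt's tokens followed by the newly generated ones, which is why the Quick Start code slices every output by the length of its prompt before decoding. A minimal sketch of that step, using plain lists and made-up token ids in place of real tokenizer output (`strip_prompt` is a hypothetical helper, not part of transformers):

```python
def strip_prompt(input_ids, output_ids):
    """Return only the tokens that follow the prompt in a generated sequence."""
    return output_ids[len(input_ids):]


# Toy token ids standing in for real tokenizer output (hypothetical values).
prompt_batch = [[101, 7, 8, 9]]
generated_batch = [[101, 7, 8, 9, 42, 43, 44]]

completions = [strip_prompt(p, g) for p, g in zip(prompt_batch, generated_batch)]
print(completions)  # [[42, 43, 44]]
```

Without this slice, the decoded response would repeat the system and user messages ahead of the model's answer.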

Citation

Coming soon.