---
library_name: transformers
license: mit
base_model:
- meta-llama/Meta-Llama-3-8B
---
# Model Details

A `meta-llama/Meta-Llama-3-8B` model fine-tuned on 100,000 [CLRS-Text](https://github.com/google-deepmind/clrs/tree/master/clrs/_src/clrs_text) examples.
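A minimal usage sketch with `transformers`; the repository id below is a placeholder for this checkpoint, and the prompt is illustrative rather than the exact CLRS-Text format:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/llama-3-8b-clrs-text"  # placeholder: substitute this repo's actual id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

prompt = "..."  # a CLRS-Text-formatted question
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```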
## Training Details
- Learning rate: 1e-4 with 150 warmup steps, then cosine-decayed to 5e-6, using the AdamW optimiser
- Batch size: 128
- Loss computed over the answer tokens only, not the question (see the sketch after this list).
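A minimal sketch of answer-only loss masking, assuming a plain question/answer concatenation; the function and variable names are illustrative, not the actual training code:

```python
import torch

IGNORE_INDEX = -100  # label value ignored by the cross-entropy loss

def build_example(tokenizer, question: str, answer: str):
    # Tokenise question and answer separately so the answer boundary is known.
    q_ids = tokenizer(question, add_special_tokens=False)["input_ids"]
    a_ids = tokenizer(answer + tokenizer.eos_token, add_special_tokens=False)["input_ids"]

    input_ids = torch.tensor(q_ids + a_ids)
    labels = input_ids.clone()
    # Mask the question tokens so the loss is taken over the answer only.
    labels[: len(q_ids)] = IGNORE_INDEX
    return {"input_ids": input_ids, "labels": labels}
```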