MediLlama-3.2 / README.md

InferenceLab

Update README.md

d7233c5 verified 5 months ago

4.65 kB

	---
	library_name: transformers
	tags:
	- unsloth
	- trl
	- sft
	license: apache-2.0
	language:
	- en
	base_model:
	- meta-llama/Llama-3.2-3B-Instruct
	pipeline_tag: text-generation
	metrics:
	- accuracy
	- bleu
	- rouge
	---

	# Model Card for MediLlama-3.2

	A fine-tuned version of Meta's LLaMA 3.2 (3B Instruct) for domain-specific applications in healthcare and medicine. This model is optimized for tasks such as medical Q&A, symptom checking, and patient education.

	## Model Details

	### Model Description

	This model is a domain-adapted version of LLaMA 3.2 3B Instruct. It has been fine-tuned using supervised fine-tuning (SFT) on medical datasets to handle English-language healthcare scenarios including diagnostic queries, treatment suggestions, and general medical advice.

	- Developed by: InferenceLab
	- Model type: Medical Chatbot
	- Language(s) (NLP): English
	- License: Apache 2.0
	- Finetuned from model: meta-llama/Llama-3.2-3B-Instruct

	## Uses

	### Direct Use

	MediLlama-3.2 can be used directly as a chatbot or virtual assistant in medical and health-related applications. Ideal for educational content, initial symptom triage, and research purposes.

	### Downstream Use

	Can be integrated into larger telehealth systems, clinical documentation tools, or diagnostic assistants after further task-specific fine-tuning.

	### Out-of-Scope Use

	- Should not be used for real-time diagnosis or treatment decisions without expert validation.
	- Not suitable for high-risk or life-threatening emergency response.
	- Not trained on pediatric or highly specialized medical domains.

	## Bias, Risks, and Limitations

	While the model is trained on medical data, it may still exhibit:
	- Biases from source data
	- Hallucinations or incorrect suggestions
	- Outdated or non-region-specific medical advice

	### Recommendations

	Users should validate outputs with certified medical professionals. This model is for research and prototyping only, not for clinical deployment without regulatory compliance.

	## How to Get Started with the Model

	```python
	import torch
	from transformers import pipeline

	model_id = "InferenceLab/MediLlama-3.2"
	pipe = pipeline(
	"text-generation",
	model=model_id,
	torch_dtype=torch.bfloat16,
	device_map="auto",
	)

	messages = [
	{"role": "system", "content": "You are a helpful Medical assistant."},
	{"role": "user", "content": "Hi! How are you?"},
	]
	outputs = pipe(
	messages,
	max_new_tokens=256,
	)
	print(outputs[0]["generated_text"][-1])

	````

	## Training Details

	### Training Data

	Model trained using cleaned and preprocessed medical QA datasets, synthetic doctor-patient conversations, and publicly available health forums. Protected health information (PHI) was removed.

	### Training Procedure

	Supervised fine-tuning (SFT) using TRL and Unsloth libraries.

	#### Preprocessing

	Tokenization using LLaMA tokenizer with special medical instruction formatting.

	#### Training Hyperparameters

	* Training regime: bf16 mixed precision
	* Learning rate: 1e-5

	#### Speeds, Sizes, Times

	* Training time: \~12 hours on 4×A100 GPUs

	## Evaluation

	### Testing Data, Factors & Metrics

	#### Testing Data

	Subset of unseen medical QA pairs, synthetic test cases, and MedQA-derived examples.

	#### Factors

	* Input prompt complexity
	* Use of medical terminology
	* Chat length

	#### Metrics

	* Accuracy: 81.3%
	* BLEU: 34.5
	* ROUGE-L: 62.2

	### Results

	#### Summary

	Model shows good generalization to unseen prompts and performs competitively for general medical dialogue. Further tuning needed for specialty areas like oncology or rare diseases.

	## Model Examination

	Explainability tools like LLaMA-MedLens (if available) are suggested to interpret model decisions.

	## Environmental Impact

	* Hardware Type: 4×NVIDIA A100 40GB
	* Hours used: 12
	* Cloud Provider: AWS
	* Compute Region: us-west-2
	* Carbon Emitted: \~35.8 kg CO2eq (estimated)

	## Technical Specifications

	### Model Architecture and Objective

	* Based on Meta LLaMA 3.2 3B Instruct
	* Decoder-only transformer
	* Objective: Causal Language Modeling (CLM) with instruction fine-tuning

	### Compute Infrastructure

	#### Hardware

	* 4×NVIDIA A100 40GB

	#### Software

	* Python 3.10
	* Transformers (v4.40+)
	* TRL
	* Unsloth
	* PyTorch 2.1


	## Glossary

	* SFT: Supervised Fine-Tuning
	* BLEU: Bilingual Evaluation Understudy
	* ROUGE: Recall-Oriented Understudy for Gisting Evaluation

	## More Information

	For collaborations, deployment help, or fine-tuning extensions, please contact the developers.

	## Model Card Authors

	* InferenceLab Team