Safetensors
GGUF
English
chain-of-thought
step-by-step-reasoning
systematic-research-planning
academic-assistant
thesis-planning
dissertation-planning
research-question-formulation
literature-review-planning
methodology-design
experimental-design
hypothesis-generation
research-proposal-helper
cross-disciplinary-research
student-research-assistant
phd-support
research-gap-analysis
literature-analysis
research-summarization
structured-output
systematic-analysis
problem-decomposition
actionable-planning
scientific-research
social-science-research
engineering-research
humanities-research
ai-research-assistant
research-automation
Research-Reasoner-7B-v0.3
Research-Reasoner-7B
Research-Reasoner
conversational
Research-Reasoner-7B-v0.3 Training Documentation
===================================================

Model Training Details
---------------------
Base Model: Mistral 7B Instruct v0.3
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Infrastructure: Single NVIDIA A100 PCIe GPU
Training Duration: Approximately 3.8 hours
Training Dataset: Custom curated dataset for research planning
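The training code itself is not reproduced in this card. As a minimal sketch, and assuming the public Hub checkpoint mistralai/Mistral-7B-Instruct-v0.3 together with the transformers library (neither is stated explicitly above), the base model could be loaded for fine-tuning along these lines:

```python
# Hypothetical sketch: loading the base model for LoRA fine-tuning.
# The Hub ID and library choice are assumptions; the actual training
# script is not published in this documentation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "mistralai/Mistral-7B-Instruct-v0.3"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.float16,   # FP16 training, per the hardware section below
    device_map="auto",           # single A100 PCIe GPU
)
model.gradient_checkpointing_enable()  # trade compute for memory on a 40 GB card
```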

Dataset Specifications
---------------------
Total Token Count: 5,840,200
Total Sample Count: 5,750
Average Tokens/Sample: 1,015.69
Dataset Creation: Generated using DeepSeek-V3 API
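The average above follows directly from the totals (5,840,200 / 5,750 ≈ 1,015.69). A minimal sketch of how such statistics could be gathered with the base model's tokenizer is shown below; the dataset file name and the "text" field are hypothetical placeholders, not taken from this documentation.

```python
# Hypothetical sketch: computing dataset token statistics.
# "research_planning_dataset.jsonl" and the "text" field are assumptions;
# only the resulting totals are reported above.
import json
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

total_tokens = 0
total_samples = 0
with open("research_planning_dataset.jsonl") as f:
    for line in f:
        sample = json.loads(line)
        total_tokens += len(tokenizer(sample["text"])["input_ids"])
        total_samples += 1

print(f"Total tokens: {total_tokens:,}")                              # reported: 5,840,200
print(f"Total samples: {total_samples:,}")                            # reported: 5,750
print(f"Average tokens/sample: {total_tokens / total_samples:.2f}")   # reported: 1,015.69
```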

Training Configuration
---------------------
LoRA Parameters:
- Rank: 32
- Alpha: 64
- Dropout: 0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head

Training Hyperparameters:
- Learning Rate: 5e-5
- Batch Size: 4
- Gradient Accumulation Steps: 5
- Effective Batch Size: 20 (4 × 5)
- Max Sequence Length: 2048
- Epochs: 3
- Warmup Ratio: 0.01
- Weight Decay: 0.01
- Max Grad Norm: 1.0
- LR Scheduler: Cosine
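Expressed with the Hugging Face peft and transformers APIs (an assumption; only the values come from the list above), this configuration corresponds roughly to:

```python
# Hypothetical sketch: the LoRA and trainer settings listed above, expressed
# with the peft / transformers APIs. Library choice is assumed, values are not.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,                      # Rank
    lora_alpha=64,             # Alpha
    lora_dropout=0.1,          # Dropout
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "lm_head",
    ],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="research-reasoner-7b-v0.3",   # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=5,            # effective batch size 4 x 5 = 20
    num_train_epochs=3,
    warmup_ratio=0.01,
    weight_decay=0.01,
    max_grad_norm=1.0,
    lr_scheduler_type="cosine",
    fp16=True,                                # see Hardware & Environment below
    gradient_checkpointing=True,
)

# The 2048-token max sequence length would be enforced at tokenization/packing
# time rather than through TrainingArguments. Adapters would then be attached
# to the loaded base model with peft.get_peft_model(model, lora_config).
```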

Hardware & Environment
---------------------
GPU: NVIDIA A100 PCIe (40 GB)
Operating System: Ubuntu
CUDA Version: 11.8
PyTorch Version: 2.7.0
Compute Capability: 8.0
Optimizations: FP16 mixed precision, gradient checkpointing
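For reference, these environment details can be confirmed at runtime with standard PyTorch calls; the snippet below is illustrative only and not part of the original training scripts.

```python
# Illustrative environment check (not from the original training scripts).
import torch

print(torch.__version__)                    # reported: 2.7.0
print(torch.version.cuda)                   # reported: 11.8
print(torch.cuda.get_device_name(0))        # e.g. an NVIDIA A100 PCIe 40 GB
print(torch.cuda.get_device_capability(0))  # reported: (8, 0)
```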

Training Performance
---------------------
Training Runtime: 3.87 hours (13,936 seconds)
Train Samples/Second: 1.176
Train Steps/Second: 0.059
Training Loss (Final): 0.137
Validation Loss (Final): 0.230
Total Training Steps: 822
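The runtime and step-rate figures are mutually consistent, as a quick arithmetic check shows:

```python
# Quick consistency check of the reported throughput figures.
runtime_seconds = 13_936
total_steps = 822

print(runtime_seconds / 3600)          # ~3.87 hours, as reported
print(total_steps / runtime_seconds)   # ~0.059 steps/second, as reported
```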