MaterialsAnalyst-AI-7B Training Documentation
================================================

Model Training Details
---------------------
Base Model: Qwen 2.5 Instruct 7B
Fine-tuning Method: LoRA (Low-Rank Adaptation)
Training Infrastructure: Single NVIDIA A100 SXM4 GPU
Training Duration: Approximately 5.4 hours
Training Dataset: Custom curated dataset for materials analysis

Dataset Specifications
---------------------
Total Token Count: 6,441,671
Total Sample Count: 6,000
Average Tokens/Sample: 1,073.61
Dataset Creation: Generated using DeepSeekV3 API

Training Configuration
---------------------
LoRA Parameters:
- Rank: 32
- Alpha: 64
- Dropout: 0.1
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, lm_head

Training Hyperparameters:
- Learning Rate: 5e-5
- Batch Size: 4
- Gradient Accumulation: 5
- Effective Batch Size: 20
- Max Sequence Length: 2048
- Epochs: 3
- Warmup Ratio: 0.01
- Weight Decay: 0.01
- Max Grad Norm: 1.0
- LR Scheduler: Cosine

Hardware & Environment
---------------------
GPU: NVIDIA A100 SXM4 (40GB)
Operating System: Ubuntu
CUDA Version: 11.8
PyTorch Version: 2.7.0
Compute Capability: 8.0
Optimization: FP16, Gradient Checkpointing

Training Performance
---------------------
Training Runtime: 5.37 hours (19,348 seconds)
Train Samples/Second: 0.884
Train Steps/Second: 0.044
Training Loss (Final): 0.170
Validation Loss (Final): 0.136
Total Training Steps: 855
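
Example: LoRA Configuration (Illustrative)
---------------------
The LoRA parameters above map directly onto a Hugging Face PEFT
configuration. The sketch below is illustrative, not the original training
script; the checkpoint ID "Qwen/Qwen2.5-7B-Instruct" is an assumption based
on the stated base model.

    # Minimal sketch of the documented LoRA setup using the PEFT library.
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    base = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen2.5-7B-Instruct"  # assumed ID for "Qwen 2.5 Instruct 7B"
    )

    lora_config = LoraConfig(
        r=32,               # Rank: 32
        lora_alpha=64,      # Alpha: 64
        lora_dropout=0.1,   # Dropout: 0.1
        target_modules=[    # documented target modules
            "q_proj", "k_proj", "v_proj", "o_proj",
            "gate_proj", "up_proj", "down_proj", "lm_head",
        ],
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(base, lora_config)
    model.print_trainable_parameters()  # adapters train; base weights stay frozen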
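
Example: Training Hyperparameters (Illustrative)
---------------------
A minimal sketch of the documented hyperparameters expressed as Hugging Face
TrainingArguments. The output directory is an assumption; the maximum
sequence length of 2048 is applied at tokenization time (e.g. a trainer's
max_seq_length) rather than here.

    from transformers import TrainingArguments

    args = TrainingArguments(
        output_dir="materialsanalyst-ai-7b-lora",  # assumed path
        learning_rate=5e-5,
        per_device_train_batch_size=4,
        gradient_accumulation_steps=5,   # 4 x 5 = effective batch size 20
        num_train_epochs=3,
        warmup_ratio=0.01,
        weight_decay=0.01,
        max_grad_norm=1.0,
        lr_scheduler_type="cosine",
        fp16=True,                       # FP16 optimization
        gradient_checkpointing=True,     # fits the 7B model on one 40GB A100
    )

For reference, 855 total steps at an effective batch size of 20 corresponds
to about 17,100 samples over 3 epochs (~5,700 per epoch), consistent with a
small validation split held out from the 6,000-sample dataset.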