metadata
library_name: peft
license: other
base_model: deepseek-ai/deepseek-coder-1.3b-base
tags:
  - generated_from_trainer
model-index:
  - name: lemexp-task1-min_symbols_template_small-deepseek-coder-1.3b-base-ddp
    results: []

lemexp-task1-min_symbols_template_small-deepseek-coder-1.3b-base-ddp

This model is a fine-tuned version of deepseek-ai/deepseek-coder-1.3b-base, trained with PEFT. The training dataset is not recorded in this card. At the final evaluation (step 18241) it achieves the following result on the evaluation set:

  • Loss: 0.1887
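Since the card lists `library_name: peft`, the checkpoint is presumably an adapter that is loaded on top of the base model. The following is a minimal sketch of how that could look, not the author's documented usage. The base model id comes from the card metadata; `ADAPTER_ID` is a placeholder (the card does not state the full repository id), and `load_model` is a hypothetical helper name.

```python
# Base model id as given in the card metadata.
BASE_MODEL_ID = "deepseek-ai/deepseek-coder-1.3b-base"

# Assumption: placeholder repo id; the card does not state the namespace.
ADAPTER_ID = "your-namespace/lemexp-task1-min_symbols_template_small-deepseek-coder-1.3b-base-ddp"


def load_model(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base model and attach the PEFT adapter (hypothetical helper)."""
    # Imported lazily so the module can be inspected without the libraries installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


if __name__ == "__main__":
    # Downloads weights from the Hub; requires network access and a real adapter id.
    tokenizer, model = load_model()
```

The prompt template used during fine-tuning is not documented in this card, so no generation example is given here.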

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 16
  • total_eval_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 6
  • mixed_precision_training: Native AMP
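The hyperparameters above map fairly directly onto `transformers.TrainingArguments`. The sketch below is a reconstruction under assumptions, not the original training script: `output_dir` is a hypothetical path, `fp16=True` is a guess (the card says only "Native AMP" and does not say whether fp16 or bf16 was used), and `effective_batch_size` and `build_training_arguments` are illustrative helper names. It also shows how the total batch size of 16 follows from 2 samples per device on 8 devices, assuming no gradient accumulation (none is listed).

```python
# Keyword arguments mirroring the hyperparameter list in this card.
training_kwargs = {
    "output_dir": "outputs/lemexp-task1",  # hypothetical path, not from the card
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 2,
    "per_device_eval_batch_size": 2,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 6,
    "fp16": True,  # assumption: the card does not say fp16 vs. bf16
}


def effective_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Samples consumed per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def build_training_arguments(**overrides):
    """Create TrainingArguments from the settings above (hypothetical helper)."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(**{**training_kwargs, **overrides})


# 2 samples per device on 8 devices gives the reported total of 16.
total_train_batch_size = effective_batch_size(
    training_kwargs["per_device_train_batch_size"], num_devices=8
)
print(total_train_batch_size)
```

The multi-GPU setup itself is configured by the launcher (for example `torchrun` or `accelerate launch`) rather than by these arguments.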

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|---------------|--------|-------|-----------------|
| 0.4557        | 0.2001 | 629   | 0.3633          |
| 0.3618        | 0.4001 | 1258  | 0.3296          |
| 0.3388        | 0.6002 | 1887  | 0.3132          |
| 0.3131        | 0.8003 | 2516  | 0.2975          |
| 0.3022        | 1.0003 | 3145  | 0.2878          |
| 0.2884        | 1.2004 | 3774  | 0.2849          |
| 0.2806        | 1.4004 | 4403  | 0.2791          |
| 0.2695        | 1.6005 | 5032  | 0.2651          |
| 0.2684        | 1.8006 | 5661  | 0.2560          |
| 0.261         | 2.0006 | 6290  | 0.2564          |
| 0.2544        | 2.2007 | 6919  | 0.2513          |
| 0.2437        | 2.4008 | 7548  | 0.2441          |
| 0.2393        | 2.6008 | 8177  | 0.2406          |
| 0.2375        | 2.8009 | 8806  | 0.2338          |
| 0.2326        | 3.0010 | 9435  | 0.2257          |
| 0.2124        | 3.2010 | 10064 | 0.2227          |
| 0.2137        | 3.4011 | 10693 | 0.2215          |
| 0.2102        | 3.6011 | 11322 | 0.2127          |
| 0.2079        | 3.8012 | 11951 | 0.2103          |
| 0.2034        | 4.0013 | 12580 | 0.2070          |
| 0.1862        | 4.2013 | 13209 | 0.2049          |
| 0.1831        | 4.4014 | 13838 | 0.2029          |
| 0.185         | 4.6015 | 14467 | 0.1987          |
| 0.1754        | 4.8015 | 15096 | 0.1975          |
| 0.1753        | 5.0016 | 15725 | 0.1937          |
| 0.1622        | 5.2017 | 16354 | 0.1959          |
| 0.155         | 5.4017 | 16983 | 0.1912          |
| 0.1501        | 5.6018 | 17612 | 0.1897          |
| 0.1481        | 5.8018 | 18241 | 0.1887          |
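The step and epoch columns allow a rough consistency check of the run. The sketch below derives the number of optimizer steps per epoch from a few rows of the table, then estimates the total step count for 6 epochs and the last evaluation step. The evaluation interval of 629 steps is inferred from the table, not stated in the card, and the training-set size is only an approximation (it assumes a total batch size of 16 with no gradient accumulation and ignores a possibly partial final batch).

```python
# (step, epoch) pairs copied from the results table.
rows = [(629, 0.2001), (3145, 1.0003), (18241, 5.8018)]

EVAL_INTERVAL = 629          # inferred: evaluations are 629 steps apart in the table
NUM_EPOCHS = 6               # from the hyperparameter list
TOTAL_TRAIN_BATCH_SIZE = 16  # from the hyperparameter list

# Each row gives an estimate of steps per epoch; the last row is the most
# precise because rounding of the epoch value matters least there.
estimates = [step / epoch for step, epoch in rows]
steps_per_epoch = round(estimates[-1])

total_steps = steps_per_epoch * NUM_EPOCHS
# Last multiple of the evaluation interval that fits into the run.
last_eval_step = (total_steps // EVAL_INTERVAL) * EVAL_INTERVAL
# Rough upper estimate of the number of training examples.
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

print(steps_per_epoch, total_steps, last_eval_step, approx_train_examples)
```

Under these assumptions the last scheduled evaluation falls at step 18241, which matches the final table row and explains why the reported evaluation loss of 0.1887 corresponds to epoch 5.80 rather than exactly 6.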

Framework versions

  • PEFT 0.14.0
  • Transformers 4.47.0
  • Pytorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0