---

This is a fine-tuned deepseek-coder-1.3b-base model for automatic completion of Solidity code. The model was fine-tuned using Quantized Low-Rank Adaptation (QLoRA), a Parameter-Efficient Fine-Tuning (PEFT) method, on a Fill-in-the-Middle (FIM) transformed dataset consisting of Solidity constructs (functions, modifiers, mappings, etc.). The model has a maximum sequence length of 256 tokens.
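To illustrate what a FIM transform does, here is a minimal sketch: a Solidity snippet is split into prefix, middle, and suffix, and the model is trained to generate the middle given the other two. The sentinel token strings and the `to_fim_example` helper below are illustrative assumptions, not the exact tokens or code used to build this model's dataset.

```python
# Illustrative FIM sentinel tokens -- placeholder names, not necessarily
# the exact special tokens used in this model's training data.
FIM_BEGIN = "<|fim_begin|>"
FIM_HOLE = "<|fim_hole|>"
FIM_END = "<|fim_end|>"

def to_fim_example(code: str, hole_start: int, hole_end: int) -> str:
    """Turn a code snippet into a prefix-suffix-middle (PSM) training string."""
    prefix = code[:hole_start]
    middle = code[hole_start:hole_end]
    suffix = code[hole_end:]
    # The model sees prefix + suffix and learns to produce the middle.
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}{middle}"

solidity_fn = (
    "function transfer(address to, uint256 amount) public returns (bool) {\n"
    "    _transfer(msg.sender, to, amount);\n"
    "    return true;\n"
    "}\n"
)
# Mask out the function body line as the "middle" to be filled in.
body = "    _transfer(msg.sender, to, amount);\n"
start = solidity_fn.index(body)
fim_example = to_fim_example(solidity_fn, start, start + len(body))
```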

General fine-tuning information:

- Epochs: 2
- Optimizer: paged AdamW 8-bit
- Batch size: 8
- LoRA target modules: ["q_proj", "o_proj", "k_proj", "v_proj"]
- Quantization type: normal float 4-bit (NF4)
- QLoRA compute type: brain float 16-bit (bfloat16)
- Total time: 1 hour 23 minutes

Some of the hyperparameters were determined using hyperparameter optimization with Ray Tune. The corresponding results for the best trial were:

- Learning rate: 0.00016
- Weight decay: 0.0534
- Warmup steps: 100
- Gradient accumulation steps: 2
- LoRA rank: 64
- LoRA alpha: 64
- LoRA dropout: 0.0934665
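The settings above can be sketched as a QLoRA configuration using the `peft` and `bitsandbytes` integrations in `transformers`. The numeric values are taken from the lists above; everything else (e.g. `output_dir`) is an illustrative default rather than the exact training script used for this model.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# NF4 quantization with bfloat16 compute, as listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # normal float 4-bit
    bnb_4bit_compute_dtype=torch.bfloat16,  # brain float 16-bit compute
)

# LoRA settings from the best Ray Tune trial.
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.0934665,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="solidity-qlora",  # illustrative path
    num_train_epochs=2,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=0.00016,
    weight_decay=0.0534,
    warmup_steps=100,
    optim="paged_adamw_8bit",     # paged AdamW 8-bit
    bf16=True,
)
```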

The fine-tuning results are:

- Training loss: ~0.7
- Validation loss: ~0.75

The model was evaluated on the test split and compared to the base model using the metrics Perplexity, BLEU, and METEOR. The Perplexity results are:

- Perplexity base model: 12.08
- Perplexity fine-tuned model: 2.19
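Perplexity is the exponentiated mean per-token cross-entropy, so the drop from 12.08 to 2.19 means the fine-tuned model assigns much higher probability to the held-out Solidity tokens. A minimal sketch of the computation, with hypothetical per-token losses rather than the actual evaluation data:

```python
import math

def perplexity(token_nlls: list) -> float:
    """Perplexity from per-token negative log-likelihoods (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Hypothetical per-token losses for illustration only.
base_losses = [2.5, 2.4, 2.6]
ft_losses = [0.8, 0.7, 0.9]
```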

The following code shows an example of how to use the model:

```python
# Load the fine-tuned model
from transformers import AutoTokenizer, AutoModelForCausalLM