---

This is a fine-tuned deepseek-coder-1.3b-base model for automatic completion of Solidity code. The model was fine-tuned using Quantized Low-Rank Adaptation (QLoRA), a Parameter-Efficient Fine-Tuning (PEFT) method, on a Fill-in-the-Middle (FIM) transformed dataset consisting of Solidity constructs (functions, modifiers, mappings, etc.). The model has a maximum sequence length of 256 tokens.
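To illustrate what a FIM transform does, here is a minimal sketch: a Solidity snippet is split into prefix, middle, and suffix, and the model is trained to generate the middle given the other two. The sentinel token strings and the `to_fim_example` helper below are illustrative assumptions, not the exact tokens or code used to build this model's dataset.

```python
# Illustrative FIM sentinel tokens -- placeholder names, not necessarily
# the exact special tokens used in this model's training data.
FIM_BEGIN = "<|fim_begin|>"
FIM_HOLE = "<|fim_hole|>"
FIM_END = "<|fim_end|>"

def to_fim_example(code: str, hole_start: int, hole_end: int) -> str:
    """Turn a code snippet into a prefix-suffix-middle (PSM) training string."""
    prefix = code[:hole_start]
    middle = code[hole_start:hole_end]
    suffix = code[hole_end:]
    # The model sees prefix + suffix and learns to produce the middle.
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}{middle}"

solidity_fn = (
    "function transfer(address to, uint256 amount) public returns (bool) {\n"
    "    _transfer(msg.sender, to, amount);\n"
    "    return true;\n"
    "}\n"
)
# Mask out the function body line as the "middle" to be filled in.
body = "    _transfer(msg.sender, to, amount);\n"
start = solidity_fn.index(body)
fim_example = to_fim_example(solidity_fn, start, start + len(body))
```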

General fine-tuning information:

- Epochs: 2
- Optimizer: paged AdamW 8-bit
- Batch size: 8
- LoRA target modules: ["q_proj", "o_proj", "k_proj", "v_proj"]
- Quantization type: normal float 4-bit (NF4)
- QLoRA compute type: brain float 16-bit (bfloat16)
- Total time: 1 hour 23 minutes

Some of the hyperparameters were determined using hyperparameter optimization with Ray Tune. The corresponding results for the best trial were:

- Learning rate: 0.00016
- Weight decay: 0.0534
- Warmup steps: 100
- Gradient accumulation steps: 2
- LoRA rank: 64
- LoRA alpha: 64
- LoRA dropout: 0.0934665
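The settings above can be sketched as a QLoRA configuration using the `peft` and `bitsandbytes` integrations in `transformers`. The numeric values are taken from the lists above; everything else (e.g. `output_dir`) is an illustrative default rather than the exact training script used for this model.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# NF4 quantization with bfloat16 compute, as listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # normal float 4-bit
    bnb_4bit_compute_dtype=torch.bfloat16,  # brain float 16-bit compute
)

# LoRA settings from the best Ray Tune trial.
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.0934665,
    target_modules=["q_proj", "o_proj", "k_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="solidity-qlora",  # illustrative path
    num_train_epochs=2,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,
    learning_rate=0.00016,
    weight_decay=0.0534,
    warmup_steps=100,
    optim="paged_adamw_8bit",     # paged AdamW 8-bit
    bf16=True,
)
```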

The fine-tuning results are:

- Training loss: ~0.7
- Validation loss: ~0.75

The model was evaluated on the test split and compared to the base model using the metrics Perplexity, BLEU, and METEOR. The Perplexity results are:

- Perplexity base model: 12.08
- Perplexity fine-tuned model: 2.19
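Perplexity is the exponentiated mean per-token cross-entropy, so the drop from 12.08 to 2.19 means the fine-tuned model assigns much higher probability to the held-out Solidity tokens. A minimal sketch of the computation, with hypothetical per-token losses rather than the actual evaluation data:

```python
import math

def perplexity(token_nlls: list) -> float:
    """Perplexity from per-token negative log-likelihoods (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Hypothetical per-token losses for illustration only.
base_losses = [2.5, 2.4, 2.6]
ft_losses = [0.8, 0.7, 0.9]
```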

The following code shows an example of how to use the model:

```python
# Load the fine-tuned model
from transformers import AutoTokenizer, AutoModelForCausalLM