Upload folder using huggingface_hub
- README.md +16 -3
- config.json +1 -1
README.md
CHANGED
@@ -15,6 +15,19 @@ datasets:
 metrics:
 - pass@1
 base_model: Qwen/Qwen2.5-Coder-7B-Instruct
+model-index:
+- name: wisent-ai/qwen2.5-coder-7b-wisent-caa
+  results:
+  - task:
+      type: code-generation
+      name: Code Generation
+    dataset:
+      type: mbppplus
+      name: MBPP Plus
+    metrics:
+    - type: pass@1
+      value: 0.521
+      name: Pass@1
 ---
 
 # Wisent-Qwen2.5-Coder-7B-Instruct with CAA Steering
@@ -26,7 +39,7 @@ This is an enhanced version of Qwen2.5-Coder-7B-Instruct that integrates **Contr
 ### Key Features
 
 - 🚀 **Automatic CAA Steering**: No manual hook management required
-- 🎯 **Optimized Parameters**: Layer 24, α=
+- 🎯 **Optimized Parameters**: Layer 24, α=1.4
 - 🗂️ **Trait-Based Organization**: Steering vectors organized by traits
 - 🔧 **Runtime Configurable**: Adjust or disable steering on the fly
 - 🤗 **HuggingFace Compatible**: Works with standard transformers API
@@ -131,7 +144,7 @@ To switch traits, simply update the configuration:
 
 - **Steering Method**: Contrastive Activation Addition (CAA)
 - **Optimal Layer**: 24 (out of 28 transformer layers)
-- **Steering Strength (α)**:
+- **Steering Strength (α)**: 1.4
 - **Vector Format**: Safetensors format for efficient loading and HuggingFace compatibility
 - **Vector Dimension**: 3584 (pre-normalized during training)
 - **Storage Path**: `./vectors/mbpp_plus/steering_vector.safetensors`
@@ -151,7 +164,7 @@ The CAA parameters were optimized using:
 - **Framework**: Optuna with TPE sampler
 - **Search Space**: Layers 15-28, α ∈ [0.1, 5.0]
 - **Objective**: Maximize accuracy on MBPP Plus validation set
-- **
+- **Best Performance**: 52.1% accuracy on MBPP Plus (378 problems)
 
 ## Model Architecture
 
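The README lines in this diff describe CAA as adding a steering vector at layer 24 with strength α = 1.4. As a rough illustration of that arithmetic, the sketch below assumes the usual CAA formulation, h' = h + α·v, applied to one hidden-state vector. `apply_caa` is a hypothetical helper, not the repository's API, and plain Python lists stand in for tensors.

```python
def apply_caa(hidden_state, steering_vector, alpha=1.4):
    """Return hidden_state + alpha * steering_vector, element-wise.

    Hypothetical helper: assumes the standard CAA update h' = h + alpha * v.
    """
    if len(hidden_state) != len(steering_vector):
        raise ValueError("hidden state and steering vector must have the same dimension")
    return [h + alpha * v for h, v in zip(hidden_state, steering_vector)]


# Toy 4-dimensional example; the README documents the real vector as 3584-dimensional.
hidden = [0.5, -1.0, 2.0, 0.0]
vector = [1.0, 0.0, -1.0, 0.5]
steered = apply_caa(hidden, vector, alpha=1.4)
```

Setting `alpha=0.0` in this sketch leaves the hidden state unchanged, which is one plausible reading of "disable steering on the fly".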
config.json
CHANGED
@@ -116,7 +116,7 @@
   },
   "caa_enabled": true,
   "caa_layer_id": 24,
-  "caa_alpha":
+  "caa_alpha": 1.4,
   "steering_method": "caa",
   "wisent_optimization": {
     "best_value": 0.64,
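For reviewers who want to check how the changed key might be consumed, here is a minimal sketch that reads the CAA fields shown in the config.json hunk. The JSON fragment copies only the keys visible in the diff (the real file has more), and `steering_settings` is a hypothetical helper, not code from this repository.

```python
import json

# Fragment mirroring the keys visible in the diff; the full config.json has more fields.
CONFIG_TEXT = """
{
  "caa_enabled": true,
  "caa_layer_id": 24,
  "caa_alpha": 1.4,
  "steering_method": "caa"
}
"""


def steering_settings(config):
    """Return (layer_id, alpha), or None when CAA steering is not active.

    Hypothetical helper: treats steering as active only if caa_enabled is true
    and steering_method is "caa".
    """
    if not config.get("caa_enabled", False) or config.get("steering_method") != "caa":
        return None
    return config["caa_layer_id"], float(config["caa_alpha"])


settings = steering_settings(json.loads(CONFIG_TEXT))
disabled = steering_settings({"caa_enabled": False})
```

Here `settings` holds the layer index and strength from the fragment, and `disabled` is `None` because steering is switched off.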