jfpio committed on
Commit f913935 · verified · 1 Parent(s): 3bb8d99

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +16 -3
  2. config.json +1 -1
README.md CHANGED
@@ -15,6 +15,19 @@ datasets:
15
  metrics:
16
  - pass@1
17
  base_model: Qwen/Qwen2.5-Coder-7B-Instruct
18
  ---
19
 
20
  # Wisent-Qwen2.5-Coder-7B-Instruct with CAA Steering
@@ -26,7 +39,7 @@ This is an enhanced version of Qwen2.5-Coder-7B-Instruct that integrates **Contr
26
  ### Key Features
27
 
28
  - 🚀 **Automatic CAA Steering**: No manual hook management required
29
- - 🎯 **Optimized Parameters**: Layer 24, α=0.9
30
  - 🗂️ **Trait-Based Organization**: Steering vectors organized by traits
31
  - 🔧 **Runtime Configurable**: Adjust or disable steering on the fly
32
  - 🤗 **HuggingFace Compatible**: Works with standard transformers API
@@ -131,7 +144,7 @@ To switch traits, simply update the configuration:
131
 
132
  - **Steering Method**: Contrastive Activation Addition (CAA)
133
  - **Optimal Layer**: 24 (out of 28 transformer layers)
134
- - **Steering Strength (α)**: 0.9
135
  - **Vector Format**: Safetensors format for efficient loading and HuggingFace compatibility
136
  - **Vector Dimension**: 3584 (pre-normalized during training)
137
  - **Storage Path**: `./vectors/mbpp_plus/steering_vector.safetensors`
@@ -151,7 +164,7 @@ The CAA parameters were optimized using:
151
  - **Framework**: Optuna with TPE sampler
152
  - **Search Space**: Layers 15-28, α ∈ [0.1, 5.0]
153
  - **Objective**: Maximize accuracy on MBPP Plus validation set
154
- - **Validation Results**: Optimized for improved performance on MBPP Plus tasks
155
 
156
  ## Model Architecture
157
 
 
15
  metrics:
16
  - pass@1
17
  base_model: Qwen/Qwen2.5-Coder-7B-Instruct
18
+ model-index:
19
+ - name: wisent-ai/qwen2.5-coder-7b-wisent-caa
20
+   results:
21
+   - task:
22
+       type: code-generation
23
+       name: Code Generation
24
+     dataset:
25
+       type: mbppplus
26
+       name: MBPP Plus
27
+     metrics:
28
+     - type: pass@1
29
+       value: 0.521
30
+       name: Pass@1
31
  ---
32
 
33
  # Wisent-Qwen2.5-Coder-7B-Instruct with CAA Steering
 
39
  ### Key Features
40
 
41
  - 🚀 **Automatic CAA Steering**: No manual hook management required
42
+ - 🎯 **Optimized Parameters**: Layer 24, α=1.4
43
  - 🗂️ **Trait-Based Organization**: Steering vectors organized by traits
44
  - 🔧 **Runtime Configurable**: Adjust or disable steering on the fly
45
  - 🤗 **HuggingFace Compatible**: Works with standard transformers API
 
144
 
145
  - **Steering Method**: Contrastive Activation Addition (CAA)
146
  - **Optimal Layer**: 24 (out of 28 transformer layers)
147
+ - **Steering Strength (α)**: 1.4
148
  - **Vector Format**: Safetensors format for efficient loading and HuggingFace compatibility
149
  - **Vector Dimension**: 3584 (pre-normalized during training)
150
  - **Storage Path**: `./vectors/mbpp_plus/steering_vector.safetensors`
 
164
  - **Framework**: Optuna with TPE sampler
165
  - **Search Space**: Layers 15-28, α ∈ [0.1, 5.0]
166
  - **Objective**: Maximize accuracy on MBPP Plus validation set
167
+ - **Best Performance**: 52.1% accuracy on MBPP Plus (378 problems)
168
 
169
  ## Model Architecture
170
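As a quick sanity check on the reported figure, the sketch below converts 52.1% on 378 problems into an approximate count of solved problems. It assumes pass@1 here means solved/total with a single sample per problem, which the README does not state explicitly. The variable names are illustrative only.

```python
# Illustrative arithmetic only; assumes pass@1 = solved / total.
total_problems = 378        # problem count quoted in the README
reported_pass_at_1 = 0.521  # value from the model-index metadata

# Nearest whole number of solved problems consistent with the reported rate.
solved = round(reported_pass_at_1 * total_problems)
recomputed = solved / total_problems

print(solved, round(recomputed, 3))  # prints "197 0.521"
```

Under that assumption, the reported score corresponds to roughly 197 of the 378 problems.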
 
config.json CHANGED
@@ -116,7 +116,7 @@
116
  },
117
  "caa_enabled": true,
118
  "caa_layer_id": 24,
119
- "caa_alpha": 0.9,
120
  "steering_method": "caa",
121
  "wisent_optimization": {
122
  "best_value": 0.64,
 
116
  },
117
  "caa_enabled": true,
118
  "caa_layer_id": 24,
119
+ "caa_alpha": 1.4,
120
  "steering_method": "caa",
121
  "wisent_optimization": {
122
  "best_value": 0.64,
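
For readers unfamiliar with what `caa_layer_id` and `caa_alpha` control, the following is a minimal sketch of the arithmetic usually meant by Contrastive Activation Addition: the hidden state at one layer is shifted by α times a steering vector. The `apply_caa` helper, the toy 4-dimensional vectors, and the config held as a plain dict are illustrative assumptions, not the repository's implementation, which would operate on 3584-dimensional tensors inside the model's forward pass.

```python
# Hypothetical sketch of the CAA update h' = h + alpha * v; not the repository's code.
config = {"caa_enabled": True, "caa_layer_id": 24, "caa_alpha": 1.4}


def apply_caa(hidden, vector, layer_id, cfg):
    """Shift `hidden` by alpha * `vector` at the configured layer only."""
    if not cfg["caa_enabled"] or layer_id != cfg["caa_layer_id"]:
        return list(hidden)  # steering disabled or a different layer: no change
    alpha = cfg["caa_alpha"]
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Toy values standing in for one token's hidden state and the steering vector.
hidden = [0.5, -1.0, 2.0, 0.0]
vector = [1.0, 0.0, -1.0, 0.5]

steered = apply_caa(hidden, vector, layer_id=24, cfg=config)
untouched = apply_caa(hidden, vector, layer_id=23, cfg=config)
```

In this sketch, raising `caa_alpha` from 0.9 to 1.4 as in the commit changes only the size of the shift along the same steering direction. Setting `caa_enabled` to false leaves the hidden state untouched.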