---
library_name: transformers
license: mit
datasets:
- l3-unc/CausalDiagnosticity
language:
- en
base_model:
- Qwen/Qwen2.5-7B
---

# Model Card: Qwen2.5-7B Edited with MEMIT for `object_counting`

This model is derived from **[Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B)** and has been edited using **MEMIT** for the **`object_counting`** task from the [Causal Diagnosticity](https://huggingface.co/datasets/l3-unc/CausalDiagnosticity) dataset.  
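
To use one of the edited checkpoints, load it like any other `transformers` causal LM. A minimal sketch; the repository id below is a hypothetical placeholder, so substitute the actual Hub id of the checkpoint (including its `_v1`/`_v2` suffix):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id -- replace with this model's actual Hub id.
model_id = "l3-unc/Qwen2.5-7B-memit-object_counting_v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

prompt = "How many apples are in the basket?"  # illustrative prompt only
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```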

# Versioning  

- **`_v1`** → edited so that the new knowledge is taken from **`target_1`** in the `related_edits` field of each dataset item.
- **`_v2`** → edited so that the new knowledge is taken from **`target_2`** in the `related_edits` field of each dataset item.
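
For reference, a quick look at where these targets live in the dataset. A sketch under assumptions: the split name and the exact schema of `related_edits` are not stated on this card and should be checked against the dataset card:

```python
from datasets import load_dataset

# Split name is an assumption; verify against the dataset card.
ds = load_dataset("l3-unc/CausalDiagnosticity", split="test")

item = ds[0]
# `related_edits` holds target_1 (used for the _v1 model) and
# target_2 (used for the _v2 model); the exact nesting may differ.
print(item["related_edits"])
```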

---

# MEMIT Hyperparameters

```yaml
alg_name: "MEMIT"
layers: [4, 5, 6, 7, 8]
clamp_norm_factor: 4
layer_selection: "all"
fact_token: "subject_last"
v_num_grad_steps: 25
v_lr: 5e-1
v_loss_layer: 27
v_weight_decay: 1e-3
kl_factor: 0.0625
mom2_adjustment: true
mom2_update_weight: 15000
rewrite_module_tmp: "model.layers.{}.mlp.down_proj"
layer_module_tmp: "model.layers.{}"
mlp_module_tmp: "model.layers.{}.mlp"
attn_module_tmp: "model.layers.{}.self_attn"
ln_f_module: "model.norm"
lm_head_module: "lm_head"
mom2_dataset: "wikipedia"
mom2_n_samples: 100000
mom2_dtype: "float32"
model_parallel: false
```
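
As a rough illustration of how such a config is consumed, the sketch below assumes the reference MEMIT codebase ([kmeng01/memit](https://github.com/kmeng01/memit)) with its `MEMITHyperParams` class and `apply_memit_to_model` entry point; the exact request format, field names, and editing code used to produce this model may differ:

```python
import yaml
from transformers import AutoModelForCausalLM, AutoTokenizer
from memit import MEMITHyperParams, apply_memit_to_model  # reference implementation

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B")
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

# Assumes the YAML above is saved to disk and its keys match the
# MEMITHyperParams dataclass fields; adapt if they differ.
with open("memit_qwen2.5-7b.yaml") as f:
    hparams = MEMITHyperParams(**yaml.safe_load(f))

# One edit request per fact; this example fact is illustrative only.
requests = [{
    "prompt": "{} contains",   # subject fills the placeholder
    "subject": "The basket",
    "target_new": {"str": " five apples"},
}]

edited_model, orig_weights = apply_memit_to_model(
    model, tok, requests, hparams, return_orig_weights=True
)
```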

## Additional Resources

For more information about the dataset, editing details, and the associated paper, see:  

- 📄 [Paper](https://arxiv.org/abs/2502.18848)  
- 📊 [Dataset](https://huggingface.co/datasets/l3-unc/CausalDiagnosticity)  
- 💻 [Code](https://github.com/KeremZaman/CausalDiagnosticity)