---
library_name: transformers
license: mit
datasets:
- l3-unc/CausalDiagnosticity
language:
- en
base_model:
- Qwen/Qwen2.5-7B
---

# Model Card for Qwen2.5-7B Edited with MEMIT (`object_counting`)

This model is derived from **[Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B)** and has been edited using **MEMIT** for the **`object_counting`** task from the [Causal Diagnosticity](https://huggingface.co/datasets/l3-unc/CausalDiagnosticity) dataset.

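Since the card declares `library_name: transformers`, the edited checkpoint can presumably be loaded like any other causal language model. The helper below is a minimal sketch under that assumption: `generate_answer` is a hypothetical name, `repo_id` stands for this repository's Hub ID (not stated in this card), and the prompt format used for `object_counting` is not specified here.

```python
def generate_answer(repo_id: str, prompt: str, max_new_tokens: int = 16) -> str:
    """Load an edited checkpoint and greedily decode a short continuation.

    `repo_id` is a placeholder for this repository's Hub ID (an assumption;
    the card does not state it).
    """
    # Imported lazily so that defining this helper does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
    model.eval()

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Comparing the output of an edited checkpoint with that of the base model on the same prompt is one simple way to check whether an edit took effect.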
# Versioning

- **`_v1`** → The model is edited so that the new knowledge is **`target_1`** from the `related_edits` field of each dataset item.
- **`_v2`** → The model is edited so that the new knowledge is **`target_2`** from the `related_edits` field of each dataset item.

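The mapping between version tags and edit targets can be sketched as follows. This is an illustration only: it assumes `related_edits` is a list of dicts that each carry both `target_1` and `target_2`, and the item and helper names are hypothetical. Check the dataset itself for the actual schema.

```python
# Map each version tag to the `related_edits` key used for that edit.
VERSION_TO_TARGET_KEY = {"_v1": "target_1", "_v2": "target_2"}


def targets_for_version(item: dict, version: str) -> list:
    """Return the edit targets used for `version` ("_v1" or "_v2") of one item."""
    key = VERSION_TO_TARGET_KEY[version]
    # Assumption: `related_edits` is a list of dicts holding both targets.
    return [edit[key] for edit in item["related_edits"]]


# Hypothetical item: real items contain more fields, and these values are made up.
example_item = {
    "related_edits": [
        {"target_1": "three", "target_2": "five"},
        {"target_1": "two", "target_2": "four"},
    ]
}

print(targets_for_version(example_item, "_v1"))  # ['three', 'two']
print(targets_for_version(example_item, "_v2"))  # ['five', 'four']
```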
---

# MEMIT Hyperparameters

```yaml
alg_name: "MEMIT"
layers: [4, 5, 6, 7, 8]
clamp_norm_factor: 4
layer_selection: "all"
fact_token: "subject_last"
v_num_grad_steps: 25
v_lr: 5e-1
v_loss_layer: 27
v_weight_decay: 1e-3
kl_factor: 0.0625
mom2_adjustment: true
mom2_update_weight: 15000
rewrite_module_tmp: "model.layers.{}.mlp.down_proj"
layer_module_tmp: "model.layers.{}"
mlp_module_tmp: "model.layers.{}.mlp"
attn_module_tmp: "model.layers.{}.self_attn"
ln_f_module: "model.norm"
lm_head_module: "lm_head"
mom2_dataset: "wikipedia"
mom2_n_samples: 100000
mom2_dtype: "float32"
model_parallel: false
```

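The `*_module_tmp` entries are format templates in which `{}` is filled with a layer index. As a small sketch of how the configuration above resolves to concrete module names, the snippet below expands `rewrite_module_tmp` over `layers`. The helper name `rewritten_modules` is illustrative, and the `.weight` suffix assumes each `down_proj` is a standard linear layer.

```python
# Values copied from the configuration above.
layers = [4, 5, 6, 7, 8]
rewrite_module_tmp = "model.layers.{}.mlp.down_proj"


def rewritten_modules(layer_ids, template):
    """Fill the `{}` placeholder of a module template with each layer index."""
    return [template.format(layer) for layer in layer_ids]


modules = rewritten_modules(layers, rewrite_module_tmp)
# Assumption: the edited parameter of each module is its `weight` matrix.
weight_names = [f"{name}.weight" for name in modules]

for name in weight_names:
    print(name)
# model.layers.4.mlp.down_proj.weight
# model.layers.5.mlp.down_proj.weight
# model.layers.6.mlp.down_proj.weight
# model.layers.7.mlp.down_proj.weight
# model.layers.8.mlp.down_proj.weight
```

Under these assumptions, diffing these five weight matrices between this checkpoint and the base model should show where the edit changed the weights.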
## Additional Resources

For more information about the dataset, editing details, and the associated paper, see:

- 📄 [Paper](https://arxiv.org/abs/2502.18848)
- 📊 [Dataset](https://huggingface.co/datasets/l3-unc/CausalDiagnosticity)
- 💻 [Code](https://github.com/KeremZaman/CausalDiagnosticity)